CMSIS DSP Library from CMSIS 2.0. See http://www.onarm.com/cmsis/ for full details
Dependents: K22F_DSP_Matrix_least_square BNO055-ELEC3810 1BNO055 ECE4180Project--Slave2 ... more
src/Cortex-M4-M3/FilteringFunctions/arm_fir_sparse_f32.c@0:1014af42efd9, 2011-03-10 (annotated)
- Committer:
- simon
- Date:
- Thu Mar 10 15:07:50 2011 +0000
- Revision:
- 0:1014af42efd9
Who changed what in which revision?
User | Revision | Line number | New contents of line |
---|---|---|---|
simon | 0:1014af42efd9 | 1 | /* ---------------------------------------------------------------------- |
simon | 0:1014af42efd9 | 2 | * Copyright (C) 2010 ARM Limited. All rights reserved. |
simon | 0:1014af42efd9 | 3 | * |
simon | 0:1014af42efd9 | 4 | * $Date: 29. November 2010 |
simon | 0:1014af42efd9 | 5 | * $Revision: V1.0.3 |
simon | 0:1014af42efd9 | 6 | * |
simon | 0:1014af42efd9 | 7 | * Project: CMSIS DSP Library |
simon | 0:1014af42efd9 | 8 | * Title: arm_fir_sparse_f32.c |
simon | 0:1014af42efd9 | 9 | * |
simon | 0:1014af42efd9 | 10 | * Description: Floating-point sparse FIR filter processing function. |
simon | 0:1014af42efd9 | 11 | * |
simon | 0:1014af42efd9 | 12 | * Target Processor: Cortex-M4/Cortex-M3 |
simon | 0:1014af42efd9 | 13 | * |
simon | 0:1014af42efd9 | 14 | * Version 1.0.3 2010/11/29 |
simon | 0:1014af42efd9 | 15 | * Re-organized the CMSIS folders and updated documentation. |
simon | 0:1014af42efd9 | 16 | * |
simon | 0:1014af42efd9 | 17 | * Version 1.0.2 2010/11/11 |
simon | 0:1014af42efd9 | 18 | * Documentation updated. |
simon | 0:1014af42efd9 | 19 | * |
simon | 0:1014af42efd9 | 20 | * Version 1.0.1 2010/10/05 |
simon | 0:1014af42efd9 | 21 | * Production release and review comments incorporated. |
simon | 0:1014af42efd9 | 22 | * |
simon | 0:1014af42efd9 | 23 | * Version 1.0.0 2010/09/20 |
simon | 0:1014af42efd9 | 24 | * Production release and review comments incorporated |
simon | 0:1014af42efd9 | 25 | * |
simon | 0:1014af42efd9 | 26 | * Version 0.0.7 2010/06/10 |
simon | 0:1014af42efd9 | 27 | * Misra-C changes done |
simon | 0:1014af42efd9 | 28 | * ------------------------------------------------------------------- */ |
simon | 0:1014af42efd9 | 29 | #include "arm_math.h" |
simon | 0:1014af42efd9 | 30 | |
simon | 0:1014af42efd9 | 31 | /** |
simon | 0:1014af42efd9 | 32 | * @ingroup groupFilters |
simon | 0:1014af42efd9 | 33 | */ |
simon | 0:1014af42efd9 | 34 | |
simon | 0:1014af42efd9 | 35 | /** |
simon | 0:1014af42efd9 | 36 | * @defgroup FIR_Sparse Finite Impulse Response (FIR) Sparse Filters |
simon | 0:1014af42efd9 | 37 | * |
simon | 0:1014af42efd9 | 38 | * This group of functions implements sparse FIR filters. |
simon | 0:1014af42efd9 | 39 | * Sparse FIR filters are equivalent to standard FIR filters except that most of the coefficients are equal to zero. |
simon | 0:1014af42efd9 | 40 | * Sparse filters are used for simulating reflections in communications and audio applications. |
simon | 0:1014af42efd9 | 41 | * |
simon | 0:1014af42efd9 | 42 | * There are separate functions for Q7, Q15, Q31, and floating-point data types. |
simon | 0:1014af42efd9 | 43 | * The functions operate on blocks of input and output data and each call to the function processes |
simon | 0:1014af42efd9 | 44 | * <code>blockSize</code> samples through the filter. <code>pSrc</code> and |
simon | 0:1014af42efd9 | 45 | * <code>pDst</code> points to input and output arrays respectively containing <code>blockSize</code> values. |
simon | 0:1014af42efd9 | 46 | * |
simon | 0:1014af42efd9 | 47 | * \par Algorithm: |
simon | 0:1014af42efd9 | 48 | * The sparse filter instant structure contains an array of tap indices <code>pTapDelay</code> which specifies the locations of the non-zero coefficients. |
simon | 0:1014af42efd9 | 49 | * This is in addition to the coefficient array <code>b</code>. |
simon | 0:1014af42efd9 | 50 | * The implementation essentially skips the multiplications by zero and leads to an efficient realization. |
simon | 0:1014af42efd9 | 51 | * <pre> |
simon | 0:1014af42efd9 | 52 | * y[n] = b[0] * x[n-pTapDelay[0]] + b[1] * x[n-pTapDelay[1]] + b[2] * x[n-pTapDelay[2]] + ...+ b[numTaps-1] * x[n-pTapDelay[numTaps-1]] |
simon | 0:1014af42efd9 | 53 | * </pre> |
simon | 0:1014af42efd9 | 54 | * \par |
simon | 0:1014af42efd9 | 55 | * \image html FIRSparse.gif "Sparse FIR filter. b[n] represents the filter coefficients" |
simon | 0:1014af42efd9 | 56 | * \par |
simon | 0:1014af42efd9 | 57 | * <code>pCoeffs</code> points to a coefficient array of size <code>numTaps</code>; |
simon | 0:1014af42efd9 | 58 | * <code>pTapDelay</code> points to an array of nonzero indices and is also of size <code>numTaps</code>; |
simon | 0:1014af42efd9 | 59 | * <code>pState</code> points to a state array of size <code>maxDelay + blockSize</code>, where |
simon | 0:1014af42efd9 | 60 | * <code>maxDelay</code> is the largest offset value that is ever used in the <code>pTapDelay</code> array. |
simon | 0:1014af42efd9 | 61 | * Some of the processing functions also require temporary working buffers. |
simon | 0:1014af42efd9 | 62 | * |
simon | 0:1014af42efd9 | 63 | * \par Instance Structure |
simon | 0:1014af42efd9 | 64 | * The coefficients and state variables for a filter are stored together in an instance data structure. |
simon | 0:1014af42efd9 | 65 | * A separate instance structure must be defined for each filter. |
simon | 0:1014af42efd9 | 66 | * Coefficient and offset arrays may be shared among several instances while state variable arrays cannot be shared. |
simon | 0:1014af42efd9 | 67 | * There are separate instance structure declarations for each of the 4 supported data types. |
simon | 0:1014af42efd9 | 68 | * |
simon | 0:1014af42efd9 | 69 | * \par Initialization Functions |
simon | 0:1014af42efd9 | 70 | * There is also an associated initialization function for each data type. |
simon | 0:1014af42efd9 | 71 | * The initialization function performs the following operations: |
simon | 0:1014af42efd9 | 72 | * - Sets the values of the internal structure fields. |
simon | 0:1014af42efd9 | 73 | * - Zeros out the values in the state buffer. |
simon | 0:1014af42efd9 | 74 | * |
simon | 0:1014af42efd9 | 75 | * \par |
simon | 0:1014af42efd9 | 76 | * Use of the initialization function is optional. |
simon | 0:1014af42efd9 | 77 | * However, if the initialization function is used, then the instance structure cannot be placed into a const data section. |
simon | 0:1014af42efd9 | 78 | * To place an instance structure into a const data section, the instance structure must be manually initialized. |
simon | 0:1014af42efd9 | 79 | * Set the values in the state buffer to zeros before static initialization. |
simon | 0:1014af42efd9 | 80 | * The code below statically initializes each of the 4 different data type filter instance structures |
simon | 0:1014af42efd9 | 81 | * <pre> |
simon | 0:1014af42efd9 | 82 | *arm_fir_sparse_instance_f32 S = {numTaps, 0, pState, pCoeffs, maxDelay, pTapDelay}; |
simon | 0:1014af42efd9 | 83 | *arm_fir_sparse_instance_q31 S = {numTaps, 0, pState, pCoeffs, maxDelay, pTapDelay}; |
simon | 0:1014af42efd9 | 84 | *arm_fir_sparse_instance_q15 S = {numTaps, 0, pState, pCoeffs, maxDelay, pTapDelay}; |
simon | 0:1014af42efd9 | 85 | *arm_fir_sparse_instance_q7 S = {numTaps, 0, pState, pCoeffs, maxDelay, pTapDelay}; |
simon | 0:1014af42efd9 | 86 | * </pre> |
simon | 0:1014af42efd9 | 87 | * \par |
simon | 0:1014af42efd9 | 88 | * |
simon | 0:1014af42efd9 | 89 | * \par Fixed-Point Behavior |
simon | 0:1014af42efd9 | 90 | * Care must be taken when using the fixed-point versions of the sparse FIR filter functions. |
simon | 0:1014af42efd9 | 91 | * In particular, the overflow and saturation behavior of the accumulator used in each function must be considered. |
simon | 0:1014af42efd9 | 92 | * Refer to the function specific documentation below for usage guidelines. |
simon | 0:1014af42efd9 | 93 | */ |
simon | 0:1014af42efd9 | 94 | |
simon | 0:1014af42efd9 | 95 | /** |
simon | 0:1014af42efd9 | 96 | * @addtogroup FIR_Sparse |
simon | 0:1014af42efd9 | 97 | * @{ |
simon | 0:1014af42efd9 | 98 | */ |
simon | 0:1014af42efd9 | 99 | |
simon | 0:1014af42efd9 | 100 | /** |
simon | 0:1014af42efd9 | 101 | * @brief Processing function for the floating-point sparse FIR filter. |
simon | 0:1014af42efd9 | 102 | * @param[in] *S points to an instance of the floating-point sparse FIR structure. |
simon | 0:1014af42efd9 | 103 | * @param[in] *pSrc points to the block of input data. |
simon | 0:1014af42efd9 | 104 | * @param[out] *pDst points to the block of output data |
simon | 0:1014af42efd9 | 105 | * @param[in] *pScratchIn points to a temporary buffer of size blockSize. |
simon | 0:1014af42efd9 | 106 | * @param[in] blockSize number of input samples to process per call. |
simon | 0:1014af42efd9 | 107 | * @return none. |
simon | 0:1014af42efd9 | 108 | */ |
simon | 0:1014af42efd9 | 109 | |
simon | 0:1014af42efd9 | 110 | void arm_fir_sparse_f32( |
simon | 0:1014af42efd9 | 111 | arm_fir_sparse_instance_f32 * S, |
simon | 0:1014af42efd9 | 112 | float32_t * pSrc, |
simon | 0:1014af42efd9 | 113 | float32_t * pDst, |
simon | 0:1014af42efd9 | 114 | float32_t * pScratchIn, |
simon | 0:1014af42efd9 | 115 | uint32_t blockSize) |
simon | 0:1014af42efd9 | 116 | { |
simon | 0:1014af42efd9 | 117 | |
simon | 0:1014af42efd9 | 118 | float32_t *pState = S->pState; /* State pointer */ |
simon | 0:1014af42efd9 | 119 | float32_t *pCoeffs = S->pCoeffs; /* Coefficient pointer */ |
simon | 0:1014af42efd9 | 120 | float32_t *px; /* Scratch buffer pointer */ |
simon | 0:1014af42efd9 | 121 | float32_t *py = pState; /* Temporary pointers for state buffer */ |
simon | 0:1014af42efd9 | 122 | float32_t *pb = pScratchIn; /* Temporary pointers for scratch buffer */ |
simon | 0:1014af42efd9 | 123 | float32_t *pOut; /* Destination pointer */ |
simon | 0:1014af42efd9 | 124 | int32_t *pTapDelay = S->pTapDelay; /* Pointer to the array containing offset of the non-zero tap values. */ |
simon | 0:1014af42efd9 | 125 | uint32_t delaySize = S->maxDelay + blockSize; /* state length */ |
simon | 0:1014af42efd9 | 126 | uint16_t numTaps = S->numTaps; /* Number of filter coefficients in the filter */ |
simon | 0:1014af42efd9 | 127 | int32_t readIndex; /* Read index of the state buffer */ |
simon | 0:1014af42efd9 | 128 | uint32_t tapCnt, blkCnt; /* loop counters */ |
simon | 0:1014af42efd9 | 129 | float32_t coeff = *pCoeffs++; /* Read the first coefficient value */ |
simon | 0:1014af42efd9 | 130 | |
simon | 0:1014af42efd9 | 131 | |
simon | 0:1014af42efd9 | 132 | |
simon | 0:1014af42efd9 | 133 | /* BlockSize of Input samples are copied into the state buffer */ |
simon | 0:1014af42efd9 | 134 | /* StateIndex points to the starting position to write in the state buffer */ |
simon | 0:1014af42efd9 | 135 | arm_circularWrite_f32((int32_t *) py, delaySize, &S->stateIndex, 1, |
simon | 0:1014af42efd9 | 136 | (int32_t *) pSrc, 1, blockSize); |
simon | 0:1014af42efd9 | 137 | |
simon | 0:1014af42efd9 | 138 | |
simon | 0:1014af42efd9 | 139 | /* Read Index, from where the state buffer should be read, is calculated. */ |
simon | 0:1014af42efd9 | 140 | readIndex = ((int32_t) S->stateIndex - (int32_t) blockSize) - *pTapDelay++; |
simon | 0:1014af42efd9 | 141 | |
simon | 0:1014af42efd9 | 142 | /* Wraparound of readIndex */ |
simon | 0:1014af42efd9 | 143 | if(readIndex < 0) |
simon | 0:1014af42efd9 | 144 | { |
simon | 0:1014af42efd9 | 145 | readIndex += (int32_t) delaySize; |
simon | 0:1014af42efd9 | 146 | } |
simon | 0:1014af42efd9 | 147 | |
simon | 0:1014af42efd9 | 148 | /* Working pointer for state buffer is updated */ |
simon | 0:1014af42efd9 | 149 | py = pState; |
simon | 0:1014af42efd9 | 150 | |
simon | 0:1014af42efd9 | 151 | /* blockSize samples are read from the state buffer */ |
simon | 0:1014af42efd9 | 152 | arm_circularRead_f32((int32_t *) py, delaySize, &readIndex, 1, |
simon | 0:1014af42efd9 | 153 | (int32_t *) pb, (int32_t *) pb, blockSize, 1, |
simon | 0:1014af42efd9 | 154 | blockSize); |
simon | 0:1014af42efd9 | 155 | |
simon | 0:1014af42efd9 | 156 | /* Working pointer for the scratch buffer */ |
simon | 0:1014af42efd9 | 157 | px = pb; |
simon | 0:1014af42efd9 | 158 | |
simon | 0:1014af42efd9 | 159 | /* Working pointer for destination buffer */ |
simon | 0:1014af42efd9 | 160 | pOut = pDst; |
simon | 0:1014af42efd9 | 161 | |
simon | 0:1014af42efd9 | 162 | /* Loop over the blockSize. Unroll by a factor of 4. |
simon | 0:1014af42efd9 | 163 | * Compute 4 Multiplications at a time. */ |
simon | 0:1014af42efd9 | 164 | blkCnt = blockSize >> 2u; |
simon | 0:1014af42efd9 | 165 | |
simon | 0:1014af42efd9 | 166 | while(blkCnt > 0u) |
simon | 0:1014af42efd9 | 167 | { |
simon | 0:1014af42efd9 | 168 | /* Perform Multiplications and store in destination buffer */ |
simon | 0:1014af42efd9 | 169 | *pOut++ = *px++ * coeff; |
simon | 0:1014af42efd9 | 170 | *pOut++ = *px++ * coeff; |
simon | 0:1014af42efd9 | 171 | *pOut++ = *px++ * coeff; |
simon | 0:1014af42efd9 | 172 | *pOut++ = *px++ * coeff; |
simon | 0:1014af42efd9 | 173 | |
simon | 0:1014af42efd9 | 174 | /* Decrement the loop counter */ |
simon | 0:1014af42efd9 | 175 | blkCnt--; |
simon | 0:1014af42efd9 | 176 | } |
simon | 0:1014af42efd9 | 177 | |
simon | 0:1014af42efd9 | 178 | /* If the blockSize is not a multiple of 4, |
simon | 0:1014af42efd9 | 179 | * compute the remaining samples */ |
simon | 0:1014af42efd9 | 180 | blkCnt = blockSize % 0x4u; |
simon | 0:1014af42efd9 | 181 | |
simon | 0:1014af42efd9 | 182 | while(blkCnt > 0u) |
simon | 0:1014af42efd9 | 183 | { |
simon | 0:1014af42efd9 | 184 | /* Perform Multiplications and store in destination buffer */ |
simon | 0:1014af42efd9 | 185 | *pOut++ = *px++ * coeff; |
simon | 0:1014af42efd9 | 186 | |
simon | 0:1014af42efd9 | 187 | /* Decrement the loop counter */ |
simon | 0:1014af42efd9 | 188 | blkCnt--; |
simon | 0:1014af42efd9 | 189 | } |
simon | 0:1014af42efd9 | 190 | |
simon | 0:1014af42efd9 | 191 | /* Load the coefficient value and |
simon | 0:1014af42efd9 | 192 | * increment the coefficient buffer for the next set of state values */ |
simon | 0:1014af42efd9 | 193 | coeff = *pCoeffs++; |
simon | 0:1014af42efd9 | 194 | |
simon | 0:1014af42efd9 | 195 | /* Read Index, from where the state buffer should be read, is calculated. */ |
simon | 0:1014af42efd9 | 196 | readIndex = ((int32_t) S->stateIndex - (int32_t) blockSize) - *pTapDelay++; |
simon | 0:1014af42efd9 | 197 | |
simon | 0:1014af42efd9 | 198 | /* Wraparound of readIndex */ |
simon | 0:1014af42efd9 | 199 | if(readIndex < 0) |
simon | 0:1014af42efd9 | 200 | { |
simon | 0:1014af42efd9 | 201 | readIndex += (int32_t) delaySize; |
simon | 0:1014af42efd9 | 202 | } |
simon | 0:1014af42efd9 | 203 | |
simon | 0:1014af42efd9 | 204 | /* Loop over the number of taps. */ |
simon | 0:1014af42efd9 | 205 | tapCnt = (uint32_t) numTaps - 1u; |
simon | 0:1014af42efd9 | 206 | |
simon | 0:1014af42efd9 | 207 | while(tapCnt > 0u) |
simon | 0:1014af42efd9 | 208 | { |
simon | 0:1014af42efd9 | 209 | |
simon | 0:1014af42efd9 | 210 | /* Working pointer for state buffer is updated */ |
simon | 0:1014af42efd9 | 211 | py = pState; |
simon | 0:1014af42efd9 | 212 | |
simon | 0:1014af42efd9 | 213 | /* blockSize samples are read from the state buffer */ |
simon | 0:1014af42efd9 | 214 | arm_circularRead_f32((int32_t *) py, delaySize, &readIndex, 1, |
simon | 0:1014af42efd9 | 215 | (int32_t *) pb, (int32_t *) pb, blockSize, 1, |
simon | 0:1014af42efd9 | 216 | blockSize); |
simon | 0:1014af42efd9 | 217 | |
simon | 0:1014af42efd9 | 218 | /* Working pointer for the scratch buffer */ |
simon | 0:1014af42efd9 | 219 | px = pb; |
simon | 0:1014af42efd9 | 220 | |
simon | 0:1014af42efd9 | 221 | /* Working pointer for destination buffer */ |
simon | 0:1014af42efd9 | 222 | pOut = pDst; |
simon | 0:1014af42efd9 | 223 | |
simon | 0:1014af42efd9 | 224 | /* Loop over the blockSize. Unroll by a factor of 4. |
simon | 0:1014af42efd9 | 225 | * Compute 4 MACS at a time. */ |
simon | 0:1014af42efd9 | 226 | blkCnt = blockSize >> 2u; |
simon | 0:1014af42efd9 | 227 | |
simon | 0:1014af42efd9 | 228 | while(blkCnt > 0u) |
simon | 0:1014af42efd9 | 229 | { |
simon | 0:1014af42efd9 | 230 | /* Perform Multiply-Accumulate */ |
simon | 0:1014af42efd9 | 231 | *pOut++ += *px++ * coeff; |
simon | 0:1014af42efd9 | 232 | *pOut++ += *px++ * coeff; |
simon | 0:1014af42efd9 | 233 | *pOut++ += *px++ * coeff; |
simon | 0:1014af42efd9 | 234 | *pOut++ += *px++ * coeff; |
simon | 0:1014af42efd9 | 235 | |
simon | 0:1014af42efd9 | 236 | /* Decrement the loop counter */ |
simon | 0:1014af42efd9 | 237 | blkCnt--; |
simon | 0:1014af42efd9 | 238 | } |
simon | 0:1014af42efd9 | 239 | |
simon | 0:1014af42efd9 | 240 | /* If the blockSize is not a multiple of 4, |
simon | 0:1014af42efd9 | 241 | * compute the remaining samples */ |
simon | 0:1014af42efd9 | 242 | blkCnt = blockSize % 0x4u; |
simon | 0:1014af42efd9 | 243 | |
simon | 0:1014af42efd9 | 244 | while(blkCnt > 0u) |
simon | 0:1014af42efd9 | 245 | { |
simon | 0:1014af42efd9 | 246 | /* Perform Multiply-Accumulate */ |
simon | 0:1014af42efd9 | 247 | *pOut++ += *px++ * coeff; |
simon | 0:1014af42efd9 | 248 | |
simon | 0:1014af42efd9 | 249 | /* Decrement the loop counter */ |
simon | 0:1014af42efd9 | 250 | blkCnt--; |
simon | 0:1014af42efd9 | 251 | } |
simon | 0:1014af42efd9 | 252 | |
simon | 0:1014af42efd9 | 253 | /* Load the coefficient value and |
simon | 0:1014af42efd9 | 254 | * increment the coefficient buffer for the next set of state values */ |
simon | 0:1014af42efd9 | 255 | coeff = *pCoeffs++; |
simon | 0:1014af42efd9 | 256 | |
simon | 0:1014af42efd9 | 257 | /* Read Index, from where the state buffer should be read, is calculated. */ |
simon | 0:1014af42efd9 | 258 | readIndex = ((int32_t) S->stateIndex - |
simon | 0:1014af42efd9 | 259 | (int32_t) blockSize) - *pTapDelay++; |
simon | 0:1014af42efd9 | 260 | |
simon | 0:1014af42efd9 | 261 | /* Wraparound of readIndex */ |
simon | 0:1014af42efd9 | 262 | if(readIndex < 0) |
simon | 0:1014af42efd9 | 263 | { |
simon | 0:1014af42efd9 | 264 | readIndex += (int32_t) delaySize; |
simon | 0:1014af42efd9 | 265 | } |
simon | 0:1014af42efd9 | 266 | |
simon | 0:1014af42efd9 | 267 | /* Decrement the tap loop counter */ |
simon | 0:1014af42efd9 | 268 | tapCnt--; |
simon | 0:1014af42efd9 | 269 | } |
simon | 0:1014af42efd9 | 270 | |
simon | 0:1014af42efd9 | 271 | } |
simon | 0:1014af42efd9 | 272 | |
simon | 0:1014af42efd9 | 273 | /** |
simon | 0:1014af42efd9 | 274 | * @} end of FIR_Sparse group |
simon | 0:1014af42efd9 | 275 | */ |