CMSIS DSP Library from CMSIS 2.0. See http://www.onarm.com/cmsis/ for full details

Dependents:   K22F_DSP_Matrix_least_square BNO055-ELEC3810 1BNO055 ECE4180Project--Slave2 ... more

Committer:
simon
Date:
Thu Mar 10 15:07:50 2011 +0000
Revision:
0:1014af42efd9

        

Who changed what in which revision?

UserRevisionLine numberNew contents of line
simon 0:1014af42efd9 1 /* ----------------------------------------------------------------------
simon 0:1014af42efd9 2 * Copyright (C) 2010 ARM Limited. All rights reserved.
simon 0:1014af42efd9 3 *
simon 0:1014af42efd9 4 * $Date: 29. November 2010
simon 0:1014af42efd9 5 * $Revision: V1.0.3
simon 0:1014af42efd9 6 *
simon 0:1014af42efd9 7 * Project: CMSIS DSP Library
simon 0:1014af42efd9 8 * Title: arm_fir_sparse_f32.c
simon 0:1014af42efd9 9 *
simon 0:1014af42efd9 10 * Description: Floating-point sparse FIR filter processing function.
simon 0:1014af42efd9 11 *
simon 0:1014af42efd9 12 * Target Processor: Cortex-M4/Cortex-M3
simon 0:1014af42efd9 13 *
simon 0:1014af42efd9 14 * Version 1.0.3 2010/11/29
simon 0:1014af42efd9 15 * Re-organized the CMSIS folders and updated documentation.
simon 0:1014af42efd9 16 *
simon 0:1014af42efd9 17 * Version 1.0.2 2010/11/11
simon 0:1014af42efd9 18 * Documentation updated.
simon 0:1014af42efd9 19 *
simon 0:1014af42efd9 20 * Version 1.0.1 2010/10/05
simon 0:1014af42efd9 21 * Production release and review comments incorporated.
simon 0:1014af42efd9 22 *
simon 0:1014af42efd9 23 * Version 1.0.0 2010/09/20
simon 0:1014af42efd9 24 * Production release and review comments incorporated
simon 0:1014af42efd9 25 *
simon 0:1014af42efd9 26 * Version 0.0.7 2010/06/10
simon 0:1014af42efd9 27 * Misra-C changes done
simon 0:1014af42efd9 28 * ------------------------------------------------------------------- */
simon 0:1014af42efd9 29 #include "arm_math.h"
simon 0:1014af42efd9 30
simon 0:1014af42efd9 31 /**
simon 0:1014af42efd9 32 * @ingroup groupFilters
simon 0:1014af42efd9 33 */
simon 0:1014af42efd9 34
simon 0:1014af42efd9 35 /**
simon 0:1014af42efd9 36 * @defgroup FIR_Sparse Finite Impulse Response (FIR) Sparse Filters
simon 0:1014af42efd9 37 *
simon 0:1014af42efd9 38 * This group of functions implements sparse FIR filters.
simon 0:1014af42efd9 39 * Sparse FIR filters are equivalent to standard FIR filters except that most of the coefficients are equal to zero.
simon 0:1014af42efd9 40 * Sparse filters are used for simulating reflections in communications and audio applications.
simon 0:1014af42efd9 41 *
simon 0:1014af42efd9 42 * There are separate functions for Q7, Q15, Q31, and floating-point data types.
simon 0:1014af42efd9 43 * The functions operate on blocks of input and output data and each call to the function processes
simon 0:1014af42efd9 44 * <code>blockSize</code> samples through the filter. <code>pSrc</code> and
simon 0:1014af42efd9 45 * <code>pDst</code> points to input and output arrays respectively containing <code>blockSize</code> values.
simon 0:1014af42efd9 46 *
simon 0:1014af42efd9 47 * \par Algorithm:
simon 0:1014af42efd9 48 * The sparse filter instant structure contains an array of tap indices <code>pTapDelay</code> which specifies the locations of the non-zero coefficients.
simon 0:1014af42efd9 49 * This is in addition to the coefficient array <code>b</code>.
simon 0:1014af42efd9 50 * The implementation essentially skips the multiplications by zero and leads to an efficient realization.
simon 0:1014af42efd9 51 * <pre>
simon 0:1014af42efd9 52 * y[n] = b[0] * x[n-pTapDelay[0]] + b[1] * x[n-pTapDelay[1]] + b[2] * x[n-pTapDelay[2]] + ...+ b[numTaps-1] * x[n-pTapDelay[numTaps-1]]
simon 0:1014af42efd9 53 * </pre>
simon 0:1014af42efd9 54 * \par
simon 0:1014af42efd9 55 * \image html FIRSparse.gif "Sparse FIR filter. b[n] represents the filter coefficients"
simon 0:1014af42efd9 56 * \par
simon 0:1014af42efd9 57 * <code>pCoeffs</code> points to a coefficient array of size <code>numTaps</code>;
simon 0:1014af42efd9 58 * <code>pTapDelay</code> points to an array of nonzero indices and is also of size <code>numTaps</code>;
simon 0:1014af42efd9 59 * <code>pState</code> points to a state array of size <code>maxDelay + blockSize</code>, where
simon 0:1014af42efd9 60 * <code>maxDelay</code> is the largest offset value that is ever used in the <code>pTapDelay</code> array.
simon 0:1014af42efd9 61 * Some of the processing functions also require temporary working buffers.
simon 0:1014af42efd9 62 *
simon 0:1014af42efd9 63 * \par Instance Structure
simon 0:1014af42efd9 64 * The coefficients and state variables for a filter are stored together in an instance data structure.
simon 0:1014af42efd9 65 * A separate instance structure must be defined for each filter.
simon 0:1014af42efd9 66 * Coefficient and offset arrays may be shared among several instances while state variable arrays cannot be shared.
simon 0:1014af42efd9 67 * There are separate instance structure declarations for each of the 4 supported data types.
simon 0:1014af42efd9 68 *
simon 0:1014af42efd9 69 * \par Initialization Functions
simon 0:1014af42efd9 70 * There is also an associated initialization function for each data type.
simon 0:1014af42efd9 71 * The initialization function performs the following operations:
simon 0:1014af42efd9 72 * - Sets the values of the internal structure fields.
simon 0:1014af42efd9 73 * - Zeros out the values in the state buffer.
simon 0:1014af42efd9 74 *
simon 0:1014af42efd9 75 * \par
simon 0:1014af42efd9 76 * Use of the initialization function is optional.
simon 0:1014af42efd9 77 * However, if the initialization function is used, then the instance structure cannot be placed into a const data section.
simon 0:1014af42efd9 78 * To place an instance structure into a const data section, the instance structure must be manually initialized.
simon 0:1014af42efd9 79 * Set the values in the state buffer to zeros before static initialization.
simon 0:1014af42efd9 80 * The code below statically initializes each of the 4 different data type filter instance structures
simon 0:1014af42efd9 81 * <pre>
simon 0:1014af42efd9 82 *arm_fir_sparse_instance_f32 S = {numTaps, 0, pState, pCoeffs, maxDelay, pTapDelay};
simon 0:1014af42efd9 83 *arm_fir_sparse_instance_q31 S = {numTaps, 0, pState, pCoeffs, maxDelay, pTapDelay};
simon 0:1014af42efd9 84 *arm_fir_sparse_instance_q15 S = {numTaps, 0, pState, pCoeffs, maxDelay, pTapDelay};
simon 0:1014af42efd9 85 *arm_fir_sparse_instance_q7 S = {numTaps, 0, pState, pCoeffs, maxDelay, pTapDelay};
simon 0:1014af42efd9 86 * </pre>
simon 0:1014af42efd9 87 * \par
simon 0:1014af42efd9 88 *
simon 0:1014af42efd9 89 * \par Fixed-Point Behavior
simon 0:1014af42efd9 90 * Care must be taken when using the fixed-point versions of the sparse FIR filter functions.
simon 0:1014af42efd9 91 * In particular, the overflow and saturation behavior of the accumulator used in each function must be considered.
simon 0:1014af42efd9 92 * Refer to the function specific documentation below for usage guidelines.
simon 0:1014af42efd9 93 */
simon 0:1014af42efd9 94
simon 0:1014af42efd9 95 /**
simon 0:1014af42efd9 96 * @addtogroup FIR_Sparse
simon 0:1014af42efd9 97 * @{
simon 0:1014af42efd9 98 */
simon 0:1014af42efd9 99
simon 0:1014af42efd9 100 /**
simon 0:1014af42efd9 101 * @brief Processing function for the floating-point sparse FIR filter.
simon 0:1014af42efd9 102 * @param[in] *S points to an instance of the floating-point sparse FIR structure.
simon 0:1014af42efd9 103 * @param[in] *pSrc points to the block of input data.
simon 0:1014af42efd9 104 * @param[out] *pDst points to the block of output data
simon 0:1014af42efd9 105 * @param[in] *pScratchIn points to a temporary buffer of size blockSize.
simon 0:1014af42efd9 106 * @param[in] blockSize number of input samples to process per call.
simon 0:1014af42efd9 107 * @return none.
simon 0:1014af42efd9 108 */
simon 0:1014af42efd9 109
simon 0:1014af42efd9 110 void arm_fir_sparse_f32(
simon 0:1014af42efd9 111 arm_fir_sparse_instance_f32 * S,
simon 0:1014af42efd9 112 float32_t * pSrc,
simon 0:1014af42efd9 113 float32_t * pDst,
simon 0:1014af42efd9 114 float32_t * pScratchIn,
simon 0:1014af42efd9 115 uint32_t blockSize)
simon 0:1014af42efd9 116 {
simon 0:1014af42efd9 117
simon 0:1014af42efd9 118 float32_t *pState = S->pState; /* State pointer */
simon 0:1014af42efd9 119 float32_t *pCoeffs = S->pCoeffs; /* Coefficient pointer */
simon 0:1014af42efd9 120 float32_t *px; /* Scratch buffer pointer */
simon 0:1014af42efd9 121 float32_t *py = pState; /* Temporary pointers for state buffer */
simon 0:1014af42efd9 122 float32_t *pb = pScratchIn; /* Temporary pointers for scratch buffer */
simon 0:1014af42efd9 123 float32_t *pOut; /* Destination pointer */
simon 0:1014af42efd9 124 int32_t *pTapDelay = S->pTapDelay; /* Pointer to the array containing offset of the non-zero tap values. */
simon 0:1014af42efd9 125 uint32_t delaySize = S->maxDelay + blockSize; /* state length */
simon 0:1014af42efd9 126 uint16_t numTaps = S->numTaps; /* Number of filter coefficients in the filter */
simon 0:1014af42efd9 127 int32_t readIndex; /* Read index of the state buffer */
simon 0:1014af42efd9 128 uint32_t tapCnt, blkCnt; /* loop counters */
simon 0:1014af42efd9 129 float32_t coeff = *pCoeffs++; /* Read the first coefficient value */
simon 0:1014af42efd9 130
simon 0:1014af42efd9 131
simon 0:1014af42efd9 132
simon 0:1014af42efd9 133 /* BlockSize of Input samples are copied into the state buffer */
simon 0:1014af42efd9 134 /* StateIndex points to the starting position to write in the state buffer */
simon 0:1014af42efd9 135 arm_circularWrite_f32((int32_t *) py, delaySize, &S->stateIndex, 1,
simon 0:1014af42efd9 136 (int32_t *) pSrc, 1, blockSize);
simon 0:1014af42efd9 137
simon 0:1014af42efd9 138
simon 0:1014af42efd9 139 /* Read Index, from where the state buffer should be read, is calculated. */
simon 0:1014af42efd9 140 readIndex = ((int32_t) S->stateIndex - (int32_t) blockSize) - *pTapDelay++;
simon 0:1014af42efd9 141
simon 0:1014af42efd9 142 /* Wraparound of readIndex */
simon 0:1014af42efd9 143 if(readIndex < 0)
simon 0:1014af42efd9 144 {
simon 0:1014af42efd9 145 readIndex += (int32_t) delaySize;
simon 0:1014af42efd9 146 }
simon 0:1014af42efd9 147
simon 0:1014af42efd9 148 /* Working pointer for state buffer is updated */
simon 0:1014af42efd9 149 py = pState;
simon 0:1014af42efd9 150
simon 0:1014af42efd9 151 /* blockSize samples are read from the state buffer */
simon 0:1014af42efd9 152 arm_circularRead_f32((int32_t *) py, delaySize, &readIndex, 1,
simon 0:1014af42efd9 153 (int32_t *) pb, (int32_t *) pb, blockSize, 1,
simon 0:1014af42efd9 154 blockSize);
simon 0:1014af42efd9 155
simon 0:1014af42efd9 156 /* Working pointer for the scratch buffer */
simon 0:1014af42efd9 157 px = pb;
simon 0:1014af42efd9 158
simon 0:1014af42efd9 159 /* Working pointer for destination buffer */
simon 0:1014af42efd9 160 pOut = pDst;
simon 0:1014af42efd9 161
simon 0:1014af42efd9 162 /* Loop over the blockSize. Unroll by a factor of 4.
simon 0:1014af42efd9 163 * Compute 4 Multiplications at a time. */
simon 0:1014af42efd9 164 blkCnt = blockSize >> 2u;
simon 0:1014af42efd9 165
simon 0:1014af42efd9 166 while(blkCnt > 0u)
simon 0:1014af42efd9 167 {
simon 0:1014af42efd9 168 /* Perform Multiplications and store in destination buffer */
simon 0:1014af42efd9 169 *pOut++ = *px++ * coeff;
simon 0:1014af42efd9 170 *pOut++ = *px++ * coeff;
simon 0:1014af42efd9 171 *pOut++ = *px++ * coeff;
simon 0:1014af42efd9 172 *pOut++ = *px++ * coeff;
simon 0:1014af42efd9 173
simon 0:1014af42efd9 174 /* Decrement the loop counter */
simon 0:1014af42efd9 175 blkCnt--;
simon 0:1014af42efd9 176 }
simon 0:1014af42efd9 177
simon 0:1014af42efd9 178 /* If the blockSize is not a multiple of 4,
simon 0:1014af42efd9 179 * compute the remaining samples */
simon 0:1014af42efd9 180 blkCnt = blockSize % 0x4u;
simon 0:1014af42efd9 181
simon 0:1014af42efd9 182 while(blkCnt > 0u)
simon 0:1014af42efd9 183 {
simon 0:1014af42efd9 184 /* Perform Multiplications and store in destination buffer */
simon 0:1014af42efd9 185 *pOut++ = *px++ * coeff;
simon 0:1014af42efd9 186
simon 0:1014af42efd9 187 /* Decrement the loop counter */
simon 0:1014af42efd9 188 blkCnt--;
simon 0:1014af42efd9 189 }
simon 0:1014af42efd9 190
simon 0:1014af42efd9 191 /* Load the coefficient value and
simon 0:1014af42efd9 192 * increment the coefficient buffer for the next set of state values */
simon 0:1014af42efd9 193 coeff = *pCoeffs++;
simon 0:1014af42efd9 194
simon 0:1014af42efd9 195 /* Read Index, from where the state buffer should be read, is calculated. */
simon 0:1014af42efd9 196 readIndex = ((int32_t) S->stateIndex - (int32_t) blockSize) - *pTapDelay++;
simon 0:1014af42efd9 197
simon 0:1014af42efd9 198 /* Wraparound of readIndex */
simon 0:1014af42efd9 199 if(readIndex < 0)
simon 0:1014af42efd9 200 {
simon 0:1014af42efd9 201 readIndex += (int32_t) delaySize;
simon 0:1014af42efd9 202 }
simon 0:1014af42efd9 203
simon 0:1014af42efd9 204 /* Loop over the number of taps. */
simon 0:1014af42efd9 205 tapCnt = (uint32_t) numTaps - 1u;
simon 0:1014af42efd9 206
simon 0:1014af42efd9 207 while(tapCnt > 0u)
simon 0:1014af42efd9 208 {
simon 0:1014af42efd9 209
simon 0:1014af42efd9 210 /* Working pointer for state buffer is updated */
simon 0:1014af42efd9 211 py = pState;
simon 0:1014af42efd9 212
simon 0:1014af42efd9 213 /* blockSize samples are read from the state buffer */
simon 0:1014af42efd9 214 arm_circularRead_f32((int32_t *) py, delaySize, &readIndex, 1,
simon 0:1014af42efd9 215 (int32_t *) pb, (int32_t *) pb, blockSize, 1,
simon 0:1014af42efd9 216 blockSize);
simon 0:1014af42efd9 217
simon 0:1014af42efd9 218 /* Working pointer for the scratch buffer */
simon 0:1014af42efd9 219 px = pb;
simon 0:1014af42efd9 220
simon 0:1014af42efd9 221 /* Working pointer for destination buffer */
simon 0:1014af42efd9 222 pOut = pDst;
simon 0:1014af42efd9 223
simon 0:1014af42efd9 224 /* Loop over the blockSize. Unroll by a factor of 4.
simon 0:1014af42efd9 225 * Compute 4 MACS at a time. */
simon 0:1014af42efd9 226 blkCnt = blockSize >> 2u;
simon 0:1014af42efd9 227
simon 0:1014af42efd9 228 while(blkCnt > 0u)
simon 0:1014af42efd9 229 {
simon 0:1014af42efd9 230 /* Perform Multiply-Accumulate */
simon 0:1014af42efd9 231 *pOut++ += *px++ * coeff;
simon 0:1014af42efd9 232 *pOut++ += *px++ * coeff;
simon 0:1014af42efd9 233 *pOut++ += *px++ * coeff;
simon 0:1014af42efd9 234 *pOut++ += *px++ * coeff;
simon 0:1014af42efd9 235
simon 0:1014af42efd9 236 /* Decrement the loop counter */
simon 0:1014af42efd9 237 blkCnt--;
simon 0:1014af42efd9 238 }
simon 0:1014af42efd9 239
simon 0:1014af42efd9 240 /* If the blockSize is not a multiple of 4,
simon 0:1014af42efd9 241 * compute the remaining samples */
simon 0:1014af42efd9 242 blkCnt = blockSize % 0x4u;
simon 0:1014af42efd9 243
simon 0:1014af42efd9 244 while(blkCnt > 0u)
simon 0:1014af42efd9 245 {
simon 0:1014af42efd9 246 /* Perform Multiply-Accumulate */
simon 0:1014af42efd9 247 *pOut++ += *px++ * coeff;
simon 0:1014af42efd9 248
simon 0:1014af42efd9 249 /* Decrement the loop counter */
simon 0:1014af42efd9 250 blkCnt--;
simon 0:1014af42efd9 251 }
simon 0:1014af42efd9 252
simon 0:1014af42efd9 253 /* Load the coefficient value and
simon 0:1014af42efd9 254 * increment the coefficient buffer for the next set of state values */
simon 0:1014af42efd9 255 coeff = *pCoeffs++;
simon 0:1014af42efd9 256
simon 0:1014af42efd9 257 /* Read Index, from where the state buffer should be read, is calculated. */
simon 0:1014af42efd9 258 readIndex = ((int32_t) S->stateIndex -
simon 0:1014af42efd9 259 (int32_t) blockSize) - *pTapDelay++;
simon 0:1014af42efd9 260
simon 0:1014af42efd9 261 /* Wraparound of readIndex */
simon 0:1014af42efd9 262 if(readIndex < 0)
simon 0:1014af42efd9 263 {
simon 0:1014af42efd9 264 readIndex += (int32_t) delaySize;
simon 0:1014af42efd9 265 }
simon 0:1014af42efd9 266
simon 0:1014af42efd9 267 /* Decrement the tap loop counter */
simon 0:1014af42efd9 268 tapCnt--;
simon 0:1014af42efd9 269 }
simon 0:1014af42efd9 270
simon 0:1014af42efd9 271 }
simon 0:1014af42efd9 272
simon 0:1014af42efd9 273 /**
simon 0:1014af42efd9 274 * @} end of FIR_Sparse group
simon 0:1014af42efd9 275 */