CMSIS DSP Library from CMSIS 2.0. See http://www.onarm.com/cmsis/ for full details

Dependents: K22F_DSP_Matrix_least_square BNO055-ELEC3810 1BNO055 ECE4180Project--Slave2 ... more

src/Cortex-M4-M3/TransformFunctions/arm_dct4_q31.c@0:1014af42efd9, 2011-03-10 (annotated)

Committer: simon
Date: Thu Mar 10 15:07:50 2011 +0000
Revision: 0:1014af42efd9

Who changed what in which revision?

User	Revision	Line number	New contents of line
simon	0:1014af42efd9	1	/* ----------------------------------------------------------------------
simon	0:1014af42efd9	2	* Copyright (C) 2010 ARM Limited. All rights reserved.
simon	0:1014af42efd9	3	*
simon	0:1014af42efd9	4	* $Date: 29. November 2010
simon	0:1014af42efd9	5	* $Revision: V1.0.3
simon	0:1014af42efd9	6	*
simon	0:1014af42efd9	7	* Project: CMSIS DSP Library
simon	0:1014af42efd9	8	* Title: arm_dct4_q31.c
simon	0:1014af42efd9	9	*
simon	0:1014af42efd9	10	* Description: Processing function of DCT4 & IDCT4 Q31.
simon	0:1014af42efd9	11	*
simon	0:1014af42efd9	12	* Target Processor: Cortex-M4/Cortex-M3
simon	0:1014af42efd9	13	*
simon	0:1014af42efd9	14	* Version 1.0.3 2010/11/29
simon	0:1014af42efd9	15	* Re-organized the CMSIS folders and updated documentation.
simon	0:1014af42efd9	16	*
simon	0:1014af42efd9	17	* Version 1.0.2 2010/11/11
simon	0:1014af42efd9	18	* Documentation updated.
simon	0:1014af42efd9	19	*
simon	0:1014af42efd9	20	* Version 1.0.1 2010/10/05
simon	0:1014af42efd9	21	* Production release and review comments incorporated.
simon	0:1014af42efd9	22	*
simon	0:1014af42efd9	23	* Version 1.0.0 2010/09/20
simon	0:1014af42efd9	24	* Production release and review comments incorporated.
simon	0:1014af42efd9	25	* -------------------------------------------------------------------- */
simon	0:1014af42efd9	26
simon	0:1014af42efd9	27	#include "arm_math.h"
simon	0:1014af42efd9	28
simon	0:1014af42efd9	29	/**
simon	0:1014af42efd9	30	* @addtogroup DCT4_IDCT4
simon	0:1014af42efd9	31	* @{
simon	0:1014af42efd9	32	*/
simon	0:1014af42efd9	33
simon	0:1014af42efd9	34	/**
simon	0:1014af42efd9	35	* @brief Processing function for the Q31 DCT4/IDCT4.
simon	0:1014af42efd9	36	* @param[in] *S points to an instance of the Q31 DCT4 structure.
simon	0:1014af42efd9	37	* @param[in] *pState points to state buffer.
simon	0:1014af42efd9	38	* @param[in,out] *pInlineBuffer points to the in-place input and output buffer.
simon	0:1014af42efd9	39	* @return none.
simon	0:1014af42efd9	40	* \par Input and output formats:
simon	0:1014af42efd9	41	* Input samples need to be downscaled by 1 bit to avoid saturations in the Q31 DCT process,
simon	0:1014af42efd9	42	* as the conversion from DCT2 to DCT4 involves one subtraction.
simon	0:1014af42efd9	43	* Internally inputs are downscaled in the RFFT process function to avoid overflows.
simon	0:1014af42efd9	44	* The number of bits downscaled depends on the size of the transform.
simon	0:1014af42efd9	45	* The input and output formats for different DCT sizes and number of bits to upscale are mentioned in the table below:
simon	0:1014af42efd9	46	*
simon	0:1014af42efd9	47	* \image html dct4FormatsQ31Table.gif
simon	0:1014af42efd9	48	*/
simon	0:1014af42efd9	49
simon	0:1014af42efd9	50	void arm_dct4_q31(
simon	0:1014af42efd9	51	const arm_dct4_instance_q31 * S,
simon	0:1014af42efd9	52	q31_t * pState,
simon	0:1014af42efd9	53	q31_t * pInlineBuffer)
simon	0:1014af42efd9	54	{
simon	0:1014af42efd9	55	uint16_t i; /* Loop counter */
simon	0:1014af42efd9	56	q31_t *weights = S->pTwiddle; /* Pointer to the Weights table */
simon	0:1014af42efd9	57	q31_t *cosFact = S->pCosFactor; /* Pointer to the cos factors table */
simon	0:1014af42efd9	58	q31_t *pS1, *pS2, *pbuff; /* Temporary pointers for input buffer and pState buffer */
simon	0:1014af42efd9	59	q31_t in; /* Temporary variable */
simon	0:1014af42efd9	60
simon	0:1014af42efd9	61
simon	0:1014af42efd9	62	/* DCT4 computation involves DCT2 (which is calculated using RFFT)
simon	0:1014af42efd9	63	* along with some pre-processing and post-processing.
simon	0:1014af42efd9	64	* Computational procedure is explained as follows:
simon	0:1014af42efd9	65	* (a) Pre-processing involves multiplying input with cos factor,
simon	0:1014af42efd9	66	* r(n) = 2 * u(n) * cos(pi*(2*n+1)/(4*N))
simon	0:1014af42efd9	67	* where,
simon	0:1014af42efd9	68	* r(n) -- output of preprocessing
simon	0:1014af42efd9	69	* u(n) -- input to preprocessing(actual Source buffer)
simon	0:1014af42efd9	70	* (b) Calculation of DCT2 using FFT is divided into three steps:
simon	0:1014af42efd9	71	* Step1: Re-ordering of even and odd elements of input.
simon	0:1014af42efd9	72	* Step2: Calculating FFT of the re-ordered input.
simon	0:1014af42efd9	73	* Step3: Taking the real part of the product of FFT output and weights.
simon	0:1014af42efd9	74	* (c) Post-processing - DCT4 can be obtained from DCT2 output using the following equation:
simon	0:1014af42efd9	75	* Y4(k) = Y2(k) - Y4(k-1) and Y4(-1) = Y4(0)
simon	0:1014af42efd9	76	* where,
simon	0:1014af42efd9	77	* Y4 -- DCT4 output, Y2 -- DCT2 output
simon	0:1014af42efd9	78	* (d) Multiplying the output with the normalizing factor sqrt(2/N).
simon	0:1014af42efd9	79	*/
simon	0:1014af42efd9	80
simon	0:1014af42efd9	81	/*-------- Pre-processing ------------*/
simon	0:1014af42efd9	82	/* Multiplying input with cos factor i.e. r(n) = 2 * u(n) * cos(pi*(2*n+1)/(4*N)) */
simon	0:1014af42efd9	83	arm_mult_q31(pInlineBuffer, cosFact, pInlineBuffer, S->N);
simon	0:1014af42efd9	84	arm_shift_q31(pInlineBuffer, 1, pInlineBuffer, S->N);
simon	0:1014af42efd9	85
simon	0:1014af42efd9	86	/* ----------------------------------------------------------------
simon	0:1014af42efd9	87	* Step1: Re-ordering of even and odd elements as
simon	0:1014af42efd9	88	* pState[i] = pInlineBuffer[2*i] and
simon	0:1014af42efd9	89	* pState[N-i-1] = pInlineBuffer[2*i+1] where i = 0 to N/2
simon	0:1014af42efd9	90	---------------------------------------------------------------------*/
simon	0:1014af42efd9	91
simon	0:1014af42efd9	92	/* pS1 initialized to pState */
simon	0:1014af42efd9	93	pS1 = pState;
simon	0:1014af42efd9	94
simon	0:1014af42efd9	95	/* pS2 initialized to pState+N-1, so that it points to the end of the state buffer */
simon	0:1014af42efd9	96	pS2 = pState + (S->N - 1u);
simon	0:1014af42efd9	97
simon	0:1014af42efd9	98	/* pbuff initialized to input buffer */
simon	0:1014af42efd9	99	pbuff = pInlineBuffer;
simon	0:1014af42efd9	100
simon	0:1014af42efd9	101	/* Initializing the loop counter to N/2 >> 2 for loop unrolling by 4 */
simon	0:1014af42efd9	102	i = S->Nby2 >> 2u;
simon	0:1014af42efd9	103
simon	0:1014af42efd9	104	/* Processing with loop unrolling: each pass moves 4 even/odd pairs (8 samples).
simon	0:1014af42efd9	105	** No tail loop follows, so N/2 is assumed to be a multiple of 4. */
simon	0:1014af42efd9	106	do
simon	0:1014af42efd9	107	{
simon	0:1014af42efd9	108	/* Re-ordering of even and odd elements */
simon	0:1014af42efd9	109	/* pState[i] = pInlineBuffer[2*i] */
simon	0:1014af42efd9	110	*pS1++ = *pbuff++;
simon	0:1014af42efd9	111	/* pState[N-i-1] = pInlineBuffer[2*i+1] */
simon	0:1014af42efd9	112	*pS2-- = *pbuff++;
simon	0:1014af42efd9	113
simon	0:1014af42efd9	114	*pS1++ = *pbuff++;
simon	0:1014af42efd9	115	*pS2-- = *pbuff++;
simon	0:1014af42efd9	116
simon	0:1014af42efd9	117	*pS1++ = *pbuff++;
simon	0:1014af42efd9	118	*pS2-- = *pbuff++;
simon	0:1014af42efd9	119
simon	0:1014af42efd9	120	*pS1++ = *pbuff++;
simon	0:1014af42efd9	121	*pS2-- = *pbuff++;
simon	0:1014af42efd9	122
simon	0:1014af42efd9	123	/* Decrement the loop counter */
simon	0:1014af42efd9	124	i--;
simon	0:1014af42efd9	125	} while(i > 0u);
simon	0:1014af42efd9	126
simon	0:1014af42efd9	127	/* pbuff initialized to input buffer */
simon	0:1014af42efd9	128	pbuff = pInlineBuffer;
simon	0:1014af42efd9	129
simon	0:1014af42efd9	130	/* pS1 initialized to pState */
simon	0:1014af42efd9	131	pS1 = pState;
simon	0:1014af42efd9	132
simon	0:1014af42efd9	133	/* Initializing the loop counter to N/4 instead of N for loop unrolling */
simon	0:1014af42efd9	134	i = S->N >> 2u;
simon	0:1014af42efd9	135
simon	0:1014af42efd9	136	/* Processing with loop unrolling 4 times as N is always multiple of 4.
simon	0:1014af42efd9	137	* Compute 4 outputs at a time */
simon	0:1014af42efd9	138	do
simon	0:1014af42efd9	139	{
simon	0:1014af42efd9	140	/* Writing the re-ordered output back to inplace input buffer */
simon	0:1014af42efd9	141	*pbuff++ = *pS1++;
simon	0:1014af42efd9	142	*pbuff++ = *pS1++;
simon	0:1014af42efd9	143	*pbuff++ = *pS1++;
simon	0:1014af42efd9	144	*pbuff++ = *pS1++;
simon	0:1014af42efd9	145
simon	0:1014af42efd9	146	/* Decrement the loop counter */
simon	0:1014af42efd9	147	i--;
simon	0:1014af42efd9	148	} while(i > 0u);
simon	0:1014af42efd9	149
simon	0:1014af42efd9	150
simon	0:1014af42efd9	151	/* ---------------------------------------------------------
simon	0:1014af42efd9	152	* Step2: Calculate RFFT for N-point input
simon	0:1014af42efd9	153	* ---------------------------------------------------------- */
simon	0:1014af42efd9	154	/* pInlineBuffer is real input of length N , pState is the complex output of length 2N */
simon	0:1014af42efd9	155	arm_rfft_q31(S->pRfft, pInlineBuffer, pState);
simon	0:1014af42efd9	156
simon	0:1014af42efd9	157	/*----------------------------------------------------------------------
simon	0:1014af42efd9	158	* Step3: Multiply the FFT output with the weights.
simon	0:1014af42efd9	159	----------------------------------------------------------------------*/
simon	0:1014af42efd9	160	arm_cmplx_mult_cmplx_q31(pState, weights, pState, S->N);
simon	0:1014af42efd9	161
simon	0:1014af42efd9	162	/* The output of complex multiplication is in 3.29 format.
simon	0:1014af42efd9	163	* Hence changing the format of N (i.e. 2N elements) complex numbers to 1.31 format by shifting left by 2 bits. */
simon	0:1014af42efd9	164	arm_shift_q31(pState, 2, pState, S->N * 2);
simon	0:1014af42efd9	165
simon	0:1014af42efd9	166	/* ----------- Post-processing ---------- */
simon	0:1014af42efd9	167	/* DCT-IV can be obtained from DCT-II by the equation,
simon	0:1014af42efd9	168	* Y4(k) = Y2(k) - Y4(k-1) and Y4(-1) = Y4(0)
simon	0:1014af42efd9	169	* Hence, Y4(0) = Y2(0)/2 */
simon	0:1014af42efd9	170	/* Getting only real part from the output and Converting to DCT-IV */
simon	0:1014af42efd9	171
simon	0:1014af42efd9	172	/* Initializing the loop counter to (N - 1) >> 2 for loop unrolling by 4; Y4(0) is handled separately */
simon	0:1014af42efd9	173	i = (S->N - 1u) >> 2u;
simon	0:1014af42efd9	174
simon	0:1014af42efd9	175	/* pbuff initialized to input buffer. */
simon	0:1014af42efd9	176	pbuff = pInlineBuffer;
simon	0:1014af42efd9	177
simon	0:1014af42efd9	178	/* pS1 initialized to pState */
simon	0:1014af42efd9	179	pS1 = pState;
simon	0:1014af42efd9	180
simon	0:1014af42efd9	181	/* Calculating Y4(0) from Y2(0) using Y4(0) = Y2(0)/2 */
simon	0:1014af42efd9	182	in = *pS1++ >> 1u;
simon	0:1014af42efd9	183	/* input buffer acts as inplace, so output values are stored in the input itself. */
simon	0:1014af42efd9	184	*pbuff++ = in;
simon	0:1014af42efd9	185
simon	0:1014af42efd9	186	/* pState pointer is incremented twice as the real values are located alternatively in the array */
simon	0:1014af42efd9	187	pS1++;
simon	0:1014af42efd9	188
simon	0:1014af42efd9	189	/* First part of the processing with loop unrolling. Compute 4 outputs at a time.
simon	0:1014af42efd9	190	** a second loop below computes the remaining 1 to 3 samples. */
simon	0:1014af42efd9	191	do
simon	0:1014af42efd9	192	{
simon	0:1014af42efd9	193	/* Calculating Y4(1) to Y4(N-1) from Y2 using equation Y4(k) = Y2(k) - Y4(k-1) */
simon	0:1014af42efd9	194	/* pState pointer (pS1) is incremented twice as the real values are located alternatively in the array */
simon	0:1014af42efd9	195	in = *pS1++ - in;
simon	0:1014af42efd9	196	*pbuff++ = in;
simon	0:1014af42efd9	197	/* points to the next real value */
simon	0:1014af42efd9	198	pS1++;
simon	0:1014af42efd9	199
simon	0:1014af42efd9	200	in = *pS1++ - in;
simon	0:1014af42efd9	201	*pbuff++ = in;
simon	0:1014af42efd9	202	pS1++;
simon	0:1014af42efd9	203
simon	0:1014af42efd9	204	in = *pS1++ - in;
simon	0:1014af42efd9	205	*pbuff++ = in;
simon	0:1014af42efd9	206	pS1++;
simon	0:1014af42efd9	207
simon	0:1014af42efd9	208	in = *pS1++ - in;
simon	0:1014af42efd9	209	*pbuff++ = in;
simon	0:1014af42efd9	210	pS1++;
simon	0:1014af42efd9	211
simon	0:1014af42efd9	212	/* Decrement the loop counter */
simon	0:1014af42efd9	213	i--;
simon	0:1014af42efd9	214	} while(i > 0u);
simon	0:1014af42efd9	215
simon	0:1014af42efd9	216	/* If the blockSize is not a multiple of 4, compute any remaining output samples here.
simon	0:1014af42efd9	217	** No loop unrolling is used. */
simon	0:1014af42efd9	218	i = (S->N - 1u) % 0x4u;
simon	0:1014af42efd9	219
simon	0:1014af42efd9	220	while(i > 0u)
simon	0:1014af42efd9	221	{
simon	0:1014af42efd9	222	/* Calculating Y4(1) to Y4(N-1) from Y2 using equation Y4(k) = Y2(k) - Y4(k-1) */
simon	0:1014af42efd9	223	/* pState pointer (pS1) is incremented twice as the real values are located alternatively in the array */
simon	0:1014af42efd9	224	in = *pS1++ - in;
simon	0:1014af42efd9	225	*pbuff++ = in;
simon	0:1014af42efd9	226	/* points to the next real value */
simon	0:1014af42efd9	227	pS1++;
simon	0:1014af42efd9	228
simon	0:1014af42efd9	229	/* Decrement the loop counter */
simon	0:1014af42efd9	230	i--;
simon	0:1014af42efd9	231	}
simon	0:1014af42efd9	232
simon	0:1014af42efd9	233
simon	0:1014af42efd9	234	/*------------ Normalizing the output by multiplying with the normalizing factor ----------*/
simon	0:1014af42efd9	235
simon	0:1014af42efd9	236	/* Initializing the loop counter to N/4 instead of N for loop unrolling */
simon	0:1014af42efd9	237	i = S->N >> 2u;
simon	0:1014af42efd9	238
simon	0:1014af42efd9	239	/* pbuff initialized to the pInlineBuffer(now contains the output values) */
simon	0:1014af42efd9	240	pbuff = pInlineBuffer;
simon	0:1014af42efd9	241
simon	0:1014af42efd9	242	/* Processing with loop unrolling 4 times as N is always multiple of 4. Compute 4 outputs at a time */
simon	0:1014af42efd9	243	do
simon	0:1014af42efd9	244	{
simon	0:1014af42efd9	245	/* Multiplying pInlineBuffer with the normalizing factor sqrt(2/N) */
simon	0:1014af42efd9	246	in = *pbuff;
simon	0:1014af42efd9	247	*pbuff++ = ((q31_t) (((q63_t) in * S->normalize) >> 31));
simon	0:1014af42efd9	248
simon	0:1014af42efd9	249	in = *pbuff;
simon	0:1014af42efd9	250	*pbuff++ = ((q31_t) (((q63_t) in * S->normalize) >> 31));
simon	0:1014af42efd9	251
simon	0:1014af42efd9	252	in = *pbuff;
simon	0:1014af42efd9	253	*pbuff++ = ((q31_t) (((q63_t) in * S->normalize) >> 31));
simon	0:1014af42efd9	254
simon	0:1014af42efd9	255	in = *pbuff;
simon	0:1014af42efd9	256	*pbuff++ = ((q31_t) (((q63_t) in * S->normalize) >> 31));
simon	0:1014af42efd9	257
simon	0:1014af42efd9	258	/* Decrement the loop counter */
simon	0:1014af42efd9	259	i--;
simon	0:1014af42efd9	260	} while(i > 0u);
simon	0:1014af42efd9	261
simon	0:1014af42efd9	262	}
simon	0:1014af42efd9	263
simon	0:1014af42efd9	264	/**
simon	0:1014af42efd9	265	* @} end of DCT4_IDCT4 group
simon	0:1014af42efd9	266	*/

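The normalization loop multiplies each output by S->normalize using a 64-bit product shifted right by 31 bits. A minimal sketch of that Q31 multiply in plain C, using int32_t and int64_t as stand-ins for q31_t and q63_t, might look like the following. The helper names q31_mul_sketch and q31_scale_sketch are hypothetical, and the sketch assumes the compiler performs an arithmetic right shift on negative 64-bit values, as the original expression also does.

```c
#include <stdint.h>

/* Q31 * Q31 -> Q31: the 64-bit product is in Q62, so shifting right by 31
 * returns it to Q31. No saturation is applied, mirroring the expression
 * (q31_t)(((q63_t) in * S->normalize) >> 31) in the listing. */
int32_t q31_mul_sketch(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) >> 31);
}

/* Scale n samples in place by a Q31 factor (one sample per iteration;
 * the listing unrolls the same operation four times per pass). */
void q31_scale_sketch(int32_t *buf, uint32_t n, int32_t factor)
{
    for (uint32_t i = 0; i < n; i++) {
        buf[i] = q31_mul_sketch(buf[i], factor);
    }
}
```

For example, with N = 128 the factor sqrt(2/N) equals 0.125, which is 0x10000000 in Q31; scaling a sample of 0.5 (0x40000000) by it yields 0.0625 (0x08000000).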