CMSIS DSP Library from CMSIS 2.0. See http://www.onarm.com/cmsis/ for full details

Dependents: K22F_DSP_Matrix_least_square BNO055-ELEC3810 1BNO055 ECE4180Project--Slave2 ... more

src/Cortex-M4-M3/BasicMathFunctions/arm_dot_prod_q15.c@0:1014af42efd9, 2011-03-10 (annotated)

Committer: simon
Date: Thu Mar 10 15:07:50 2011 +0000
Revision: 0:1014af42efd9

Who changed what in which revision?

User	Revision	Line number	New contents of line
simon	0:1014af42efd9	1	/* ----------------------------------------------------------------------
simon	0:1014af42efd9	2	* Copyright (C) 2010 ARM Limited. All rights reserved.
simon	0:1014af42efd9	3	*
simon	0:1014af42efd9	4	* $Date: 29. November 2010
simon	0:1014af42efd9	5	* $Revision: V1.0.3
simon	0:1014af42efd9	6	*
simon	0:1014af42efd9	7	* Project: CMSIS DSP Library
simon	0:1014af42efd9	8	* Title: arm_dot_prod_q15.c
simon	0:1014af42efd9	9	*
simon	0:1014af42efd9	10	* Description: Q15 dot product.
simon	0:1014af42efd9	11	*
simon	0:1014af42efd9	12	* Target Processor: Cortex-M4/Cortex-M3
simon	0:1014af42efd9	13	*
simon	0:1014af42efd9	14	* Version 1.0.3 2010/11/29
simon	0:1014af42efd9	15	* Re-organized the CMSIS folders and updated documentation.
simon	0:1014af42efd9	16	*
simon	0:1014af42efd9	17	* Version 1.0.2 2010/11/11
simon	0:1014af42efd9	18	* Documentation updated.
simon	0:1014af42efd9	19	*
simon	0:1014af42efd9	20	* Version 1.0.1 2010/10/05
simon	0:1014af42efd9	21	* Production release and review comments incorporated.
simon	0:1014af42efd9	22	*
simon	0:1014af42efd9	23	* Version 1.0.0 2010/09/20
simon	0:1014af42efd9	24	* Production release and review comments incorporated.
simon	0:1014af42efd9	25	*
simon	0:1014af42efd9	26	* Version 0.0.7 2010/06/10
simon	0:1014af42efd9	27	* Misra-C changes done
simon	0:1014af42efd9	28	* -------------------------------------------------------------------- */
simon	0:1014af42efd9	29
simon	0:1014af42efd9	30	#include "arm_math.h"
simon	0:1014af42efd9	31
simon	0:1014af42efd9	32	/**
simon	0:1014af42efd9	33	* @ingroup groupMath
simon	0:1014af42efd9	34	*/
simon	0:1014af42efd9	35
simon	0:1014af42efd9	36	/**
simon	0:1014af42efd9	37	* @addtogroup dot_prod
simon	0:1014af42efd9	38	* @{
simon	0:1014af42efd9	39	*/
simon	0:1014af42efd9	40
simon	0:1014af42efd9	41	/**
simon	0:1014af42efd9	42	* @brief Dot product of Q15 vectors.
simon	0:1014af42efd9	43	* @param[in] *pSrcA points to the first input vector
simon	0:1014af42efd9	44	* @param[in] *pSrcB points to the second input vector
simon	0:1014af42efd9	45	* @param[in] blockSize number of samples in each vector
simon	0:1014af42efd9	46	* @param[out] *result output result returned here
simon	0:1014af42efd9	47	* @return none.
simon	0:1014af42efd9	48	*
simon	0:1014af42efd9	49	* <b>Scaling and Overflow Behavior:</b>
simon	0:1014af42efd9	50	* \par
simon	0:1014af42efd9	51	* The intermediate multiplications are in 1.15 x 1.15 = 2.30 format and these
simon	0:1014af42efd9	52	* results are added to a 64-bit accumulator in 34.30 format.
simon	0:1014af42efd9	53	* Nonsaturating additions are used and given that there are 33 guard bits in the accumulator
simon	0:1014af42efd9	54	* there is no risk of overflow.
simon	0:1014af42efd9	55	* The return result is in 34.30 format.
simon	0:1014af42efd9	56	*/
simon	0:1014af42efd9	57
simon	0:1014af42efd9	58	void arm_dot_prod_q15(
simon	0:1014af42efd9	59	q15_t * pSrcA,
simon	0:1014af42efd9	60	q15_t * pSrcB,
simon	0:1014af42efd9	61	uint32_t blockSize,
simon	0:1014af42efd9	62	q63_t * result)
simon	0:1014af42efd9	63	{
simon	0:1014af42efd9	64	q63_t sum = 0; /* Temporary result storage */
simon	0:1014af42efd9	65	uint32_t blkCnt; /* loop counter */
simon	0:1014af42efd9	66
simon	0:1014af42efd9	67
simon	0:1014af42efd9	68	/* loop Unrolling */
simon	0:1014af42efd9	69	blkCnt = blockSize >> 2u;
simon	0:1014af42efd9	70
simon	0:1014af42efd9	71	/* First part of the processing with loop unrolling. Compute 4 outputs at a time.
simon	0:1014af42efd9	72	** a second loop below computes the remaining 1 to 3 samples. */
simon	0:1014af42efd9	73	while(blkCnt > 0u)
simon	0:1014af42efd9	74	{
simon	0:1014af42efd9	75	/* C = A[0]* B[0] + A[1]* B[1] + A[2]* B[2] + .....+ A[blockSize-1]* B[blockSize-1] */
simon	0:1014af42efd9	76	/* Calculate dot product and then store the result in a temporary buffer. */
simon	0:1014af42efd9	77	sum = __SMLALD(*__SIMD32(pSrcA)++, *__SIMD32(pSrcB)++, sum);
simon	0:1014af42efd9	78	sum = __SMLALD(*__SIMD32(pSrcA)++, *__SIMD32(pSrcB)++, sum);
simon	0:1014af42efd9	79
simon	0:1014af42efd9	80	/* Decrement the loop counter */
simon	0:1014af42efd9	81	blkCnt--;
simon	0:1014af42efd9	82	}
simon	0:1014af42efd9	83
simon	0:1014af42efd9	84	/* If the blockSize is not a multiple of 4, compute any remaining output samples here.
simon	0:1014af42efd9	85	** No loop unrolling is used. */
simon	0:1014af42efd9	86	blkCnt = blockSize % 0x4u;
simon	0:1014af42efd9	87
simon	0:1014af42efd9	88	while(blkCnt > 0u)
simon	0:1014af42efd9	89	{
simon	0:1014af42efd9	90	/* C = A[0]* B[0] + A[1]* B[1] + A[2]* B[2] + .....+ A[blockSize-1]* B[blockSize-1] */
simon	0:1014af42efd9	91	/* Calculate dot product and then store the results in a temporary buffer. */
simon	0:1014af42efd9	92	sum = __SMLALD(*pSrcA++, *pSrcB++, sum);
simon	0:1014af42efd9	93
simon	0:1014af42efd9	94	/* Decrement the loop counter */
simon	0:1014af42efd9	95	blkCnt--;
simon	0:1014af42efd9	96	}
simon	0:1014af42efd9	97
simon	0:1014af42efd9	98	/* Store the result in the destination buffer in 34.30 format */
simon	0:1014af42efd9	99	*result = sum;
simon	0:1014af42efd9	100	}
simon	0:1014af42efd9	101
simon	0:1014af42efd9	102	/**
simon	0:1014af42efd9	103	* @} end of dot_prod group
simon	0:1014af42efd9	104	*/
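
The listing above relies on the Cortex-M SMLALD instruction, which multiplies two pairs of signed 16-bit halfwords and adds both products to a 64-bit accumulator. For readers without a Cortex-M toolchain, the following is a minimal portable sketch of the same arithmetic, not the library's implementation. The names pack_q15x2, smlald_model, dot_prod_q15_model and q34_30_to_double are hypothetical and do not appear in CMSIS. The sketch assumes little-endian packing (first sample in the low halfword) and handles leftover samples with a plain 16 x 16 multiply.

```c
#include <stdint.h>

/* Hypothetical helper: pack two Q15 samples into one 32-bit word,
   first sample in the low halfword (little-endian memory order). */
static uint32_t pack_q15x2(int16_t lo, int16_t hi)
{
    return (uint32_t)(uint16_t)lo | ((uint32_t)(uint16_t)hi << 16);
}

/* Portable model of SMLALD: two signed 16x16 products are added
   to a 64-bit accumulator without saturation. */
static int64_t smlald_model(uint32_t x, uint32_t y, int64_t acc)
{
    int16_t xl = (int16_t)(x & 0xFFFFu);
    int16_t xh = (int16_t)(x >> 16);
    int16_t yl = (int16_t)(y & 0xFFFFu);
    int16_t yh = (int16_t)(y >> 16);

    acc += (int32_t)xl * yl;   /* 1.15 x 1.15 = 2.30 product */
    acc += (int32_t)xh * yh;
    return acc;
}

/* Sketch of the dot product: four samples per iteration of the
   unrolled loop, then a scalar loop for the remaining 0 to 3 samples. */
static void dot_prod_q15_model(const int16_t *a, const int16_t *b,
                               uint32_t n, int64_t *result)
{
    int64_t sum = 0;
    uint32_t blk = n >> 2;     /* number of groups of four samples */

    while (blk > 0u) {
        sum = smlald_model(pack_q15x2(a[0], a[1]), pack_q15x2(b[0], b[1]), sum);
        sum = smlald_model(pack_q15x2(a[2], a[3]), pack_q15x2(b[2], b[3]), sum);
        a += 4;
        b += 4;
        blk--;
    }

    blk = n % 4u;              /* leftover samples */
    while (blk > 0u) {
        sum += (int32_t)(*a++) * (*b++);
        blk--;
    }

    *result = sum;             /* 34.30 format */
}

/* Convert a 34.30 fixed-point value to floating point (divide by 2^30). */
static double q34_30_to_double(int64_t v)
{
    return (double)v / 1073741824.0;
}
```

As a usage example, two one-element vectors that both hold 0.5 (16384 in Q15) give a 34.30 result of 268435456, which q34_30_to_double maps to 0.25.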

Repository details

Type:	Library
Created:	10 Mar 2011
Imports:	907
Forks:	1
Commits:	3
Dependents:	5
Dependencies:	0
Followers:	35
