Texas Instruments provides a collection of software algorithm benchmarks
for the software-compatible TMS320C6000™ DSPs. The benchmarks are optimized
to provide the best possible C6000 DSP performance in a wide range of fixed- and
floating-point applications, including targeted broadband infrastructure, performance
audio, and imaging. Each benchmark is C-callable for ease of customer integration
and faster time-to-market. Benchmarks are available for the TMS320C67x™ DSPs.
To access the source code associated with each benchmark listed below, download
the C67x DSP Library now.
For additional software support (formerly called TI Foundation Software),
please access the C6000 Signal Processing Libraries or the C6000 Peripheral Drivers.
TMS320C67x DSP Functions:
Filters | FFTs | Vector | Search | Miscellaneous
| Filters
|
| Benchmark |
Description |
Formula |
| SP Complex FIR Filter |
Single Precision floating point FIR filter for complex input data with
nr output samples and nh coefficients. |
2 * nh * nr + 33
For nh=24 and nr=64, cycles=3105
For nh=32 and nr=64, cycles=4129 |
| SP FIR Filter (general purpose) |
Computes a single precision floating point real FIR filter (direct-form)
with nh coefficients and nr output samples. |
(4*floor((nh-1)/2) + 14) * ceil(nr/4) + 8
For nh=10 and nr=100, cycles=758 |
| SP FIR Filter (radix2) |
Computes a single precision floating point real radix-2 FIR filter (direct-form)
with nh coefficients and nr output samples. |
(nh * nr)/2 + 34, if nr multiple of 4
(nh * nr)/2 + 45, if nr not multiple of 4
For nh=24 and nr=64, cycles=802
For nh=30 and nr=50, cycles=795 |
| SP FIR Filter with circular addressing |
Computes a single precision floating point real FIR filter with nh coefficients
and nr output samples. The routine uses a circular addressing scheme on
the input buffer. |
(2*nh + 10) * nr/4 + 18
For nh = 30 and nr=100, cycles = 1768 |
| SP 2nd order IIR (Biquad) Filter |
Computes a single precision floating point biquad filter (Direct Form
II Transposed) for nx output samples. |
4 * nx + 76
For nx = 60, cycles = 316
For nx = 90, cycles = 436 |
| SP Adaptive LMS Filter |
Computes a single precision floating point real all-pole IIR filter in
lattice structure (AR lattice) with nk lattice stages and nx output samples. |
(6*floor((nk+1)/4) + 29) * nx + 25
For nk = 10, nx = 100, cycles = 4125 |
| SP IIR Filter |
Computes a single precision floating point auto-regressive moving-average
(ARMA) filter with 4 auto-regressive filter coefficients and 5 moving-average
filter coefficients for nr output samples. This routine is used as a high
pass filter in the VSELP vocoder. |
6 * nr + 59
For nr = 64, cycles = 443 |
| SP All-pole IIR Lattice Filter |
Computes a single precision floating point real all-pole IIR filter in
lattice structure (AR lattice) with nk lattice stages and nx output samples. |
(6*floor((nk+1)/4) + 29)* nx + 25
For nk = 10, nx = 100, cycles = 4125 |
| SP Convolution |
Computes the single precision floating point full-length convolution of
real vectors of length nr and nh. |
(nh/2)*nr + (nr/2)*5 + 9
For nh=24 and nr=64, cycles=937
For nh=20 and nr=32, cycles=409 |
| SP Autocorrelation |
Performs nr single precision floating point autocorrelations each of length
nx producing nr output results. |
(nx/2) * nr + (nr/2) * 5 + 10 - (nr * nr)/4 + nr
For nx=64 and nr=64, cycles=1258
For nx=60 and nr=32, cycles=890 |
|
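The cycle formulas in the Filters table can be evaluated directly when budgeting a processing loop. The following is a minimal sketch, not TI code: the function names are hypothetical, and integer division is an assumption for inputs where the table's divisions are not exact. The printed values reproduce worked examples from the table.

```python
# Hypothetical helpers that evaluate cycle formulas from the Filters table.
# Function names are illustrative only; they are not part of any TI library.

def sp_fir_cplx_cycles(nh: int, nr: int) -> int:
    """SP Complex FIR Filter: 2 * nh * nr + 33."""
    return 2 * nh * nr + 33

def sp_fir_r2_cycles(nh: int, nr: int) -> int:
    """SP FIR Filter (radix2): overhead is 34 if nr is a multiple of 4, else 45."""
    overhead = 34 if nr % 4 == 0 else 45
    return (nh * nr) // 2 + overhead  # assumes nh * nr is even

def sp_fir_circ_cycles(nh: int, nr: int) -> int:
    """SP FIR Filter with circular addressing: (2*nh + 10) * nr/4 + 18."""
    return (2 * nh + 10) * nr // 4 + 18  # assumes the division is exact

def sp_biquad_cycles(nx: int) -> int:
    """SP 2nd order IIR (Biquad) Filter: 4 * nx + 76."""
    return 4 * nx + 76

def sp_iir_lattice_cycles(nk: int, nx: int) -> int:
    """SP All-pole IIR Lattice Filter: (6*floor((nk+1)/4) + 29) * nx + 25."""
    return (6 * ((nk + 1) // 4) + 29) * nx + 25

def sp_convol_cycles(nh: int, nr: int) -> int:
    """SP Convolution: (nh/2)*nr + (nr/2)*5 + 9 (assumes even nh and nr)."""
    return (nh // 2) * nr + (nr // 2) * 5 + 9

if __name__ == "__main__":
    print(sp_fir_cplx_cycles(24, 64))      # 3105
    print(sp_fir_r2_cycles(30, 50))        # 795
    print(sp_fir_circ_cycles(30, 100))     # 1768
    print(sp_biquad_cycles(90))            # 436
    print(sp_iir_lattice_cycles(10, 100))  # 4125
    print(sp_convol_cycles(20, 32))        # 409
```

Because each formula is a per-call estimate, summing the results for the routines in a frame gives a rough cycle budget before any measurement on hardware.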
| FFTs
|
| Benchmark |
Description |
Formula |
| SP Complex DIF FFT (radix4) |
Computes a single precision floating point Radix 4 FFT of a complex sequence
of size n with "decimation-in-frequency decomposition" method.
This routine also performs digit reversal as a special last step. |
(14*n/4 + 23)*log4(n) + 20
For n = 256, cycles = 3696 |
| SP Complex DIT FFT (radix2) |
Computes a single precision floating point Radix 2 FFT of a complex sequence
of size n with "decimation-in-time decomposition" method. This
routine also performs digit reversal as a special last step. |
2*n*log2(n) + 42
For n = 64, cycles = 810 |
| SP Out-of-Place Cache-optimized mixed radix FFT with digit reversal |
Computes a single precision floating point complex forward mixed radix
N-point FFT with digit reversal. |
3*ceil(log4(N)-1)*N + 21 * ceil(log4(N)-1) + 2*N + 44
For N=1024, cycles=14464
For N=512, cycles=7296
For N=256, cycles=2923
For N=128, cycles=1515
For N=64, cycles=598 |
| SP Cache-optimized mixed radix Inverse FFT |
Computes a single precision floating point complex mixed radix
N-point inverse FFT. |
3 * ceil(log4(N)-1) * N + 21*ceil(log4(N)-1) + 2*N + 44
For N=1024, cycles=14464
For N=512, cycles=7296
For N=256, cycles=2923
For N=128, cycles=1515
For N=64, cycles=598 |
| SP Complex DIF Inverse FFT (radix2) |
Computes a single precision floating point Radix 2 Inverse FFT of a complex
sequence of size n with “decimation-in-frequency decomposition”
method. |
2*n*log2(n) + 37
For n=64, cycles = 805
For n=128, cycles = 1829 |
| SP Complex Bit Reverse |
Performs the single precision floating point bit-reversal of nx complex
inputs. |
(5/2)*nx + 26
For nx = 256, cycles = 666 |
|
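The FFT formulas involve log4 and ceil terms that are easy to get wrong with floating point logarithms. The sketch below evaluates them with integer arithmetic only, assuming the transform size is a power of two (a power of four for the radix-4 routine). The function names are hypothetical and are not TI identifiers.

```python
# Hypothetical helpers that evaluate cycle formulas from the FFTs table
# using exact integer arithmetic instead of floating point logarithms.

def log2_exact(n: int) -> int:
    """Return log2(n), requiring n to be a positive power of two."""
    if n <= 0 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    return n.bit_length() - 1

def sp_fft_radix4_cycles(n: int) -> int:
    """SP Complex DIF FFT (radix4): (14*n/4 + 23) * log4(n) + 20."""
    log2n = log2_exact(n)
    if log2n % 2:
        raise ValueError("n must be a power of four")
    return (14 * n // 4 + 23) * (log2n // 2) + 20

def sp_fft_radix2_cycles(n: int) -> int:
    """SP Complex DIT FFT (radix2): 2 * n * log2(n) + 42."""
    return 2 * n * log2_exact(n) + 42

def sp_ifft_radix2_cycles(n: int) -> int:
    """SP Complex DIF Inverse FFT (radix2): 2 * n * log2(n) + 37."""
    return 2 * n * log2_exact(n) + 37

def sp_fft_mixed_radix_cycles(n: int) -> int:
    """Mixed radix FFT: 3*s*N + 21*s + 2*N + 44 with s = ceil(log4(N) - 1)."""
    log2n = log2_exact(n)
    # log4(N) - 1 equals (log2(N) - 2) / 2; take the ceiling with integers.
    s = -(-(log2n - 2) // 2)
    return 3 * s * n + 21 * s + 2 * n + 44

if __name__ == "__main__":
    print(sp_fft_radix4_cycles(256))   # 3696
    print(sp_fft_radix2_cycles(64))    # 810
    print(sp_ifft_radix2_cycles(128))  # 1829
    for size in (1024, 512, 256, 128, 64):
        print(size, sp_fft_mixed_radix_cycles(size))
    # 1024 14464, 512 7296, 256 2923, 128 1515, 64 598
```

The mixed radix figures for N=512 and N=128 show the effect of the ceiling: sizes that are not powers of four pay for the same number of stages as the next power of four.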
| Vector
|
| Benchmark |
Description |
Formula |
| SP Vector Dot Product |
Computes single precision floating point dot product of two vectors of size nx elements. |
nx/2 + 25
For nx=512, cycles=281 |
| SP Vector Dot Product and Sum of Squares |
Performs single precision floating point nx-element dot product and each
of the single precision floating point nx elements of one of the vectors
is squared and accumulated. This is used to compute G in the VSELP coder. |
nx + 23
For nx=64, cycles=87
For nx=30, cycles=53 |
| SP Complex Vector Dot Product |
Computes single precision floating point dot product of two vectors of
N complex elements. |
2*N + 22
For N=512, cycles=1046 |
| SP Vector Reciprocal |
Computes single precision floating point reciprocal of all n elements
in a vector. |
8*floor((n-1)/4) + 53
For n=100, cycles=245 |
| SP Vector Multiplication |
Computes single precision floating point element-by-element vector multiplication
of two size n vectors. |
2*floor((n-1)/2) + 18
For n=200, cycles=216 |
| SP Sum of Squares |
Computes the single precision floating point sum of squares of the elements
in a vector of size n elements. |
floor((n-1)/2) + 26
For n=200, cycles=125 |
| SP Weighted Vector Sum |
Performs a single precision floating point n-element vector sum of two
vectors, with one vector weighted by a constant. The result is stored in a
third vector. |
2*floor((n-1)/2) + 19
For n=200, cycles=219 |
| SP Matrix Multiplication |
Performs a single precision floating point multiply of an r1 x c1 matrix
with a c1 x c2 matrix. |
(0.5 * r1' * c1 * c2') + (6 * c2' * r1') + (4 * r1') + 22
where
r1' = r1 + (r1&1)
c2' = c2 + (c2&1)
For r1 = 12, c1 = 14 and c2 = 18, cycles = 2878 |
| SP Complex Matrix Multiplication |
Performs a single precision floating point multiply of an r1 x c1 complex
matrix with a c1 x c2 complex matrix. |
2*r1*c1*c2' + 33 where c2'=2*ceil(c2/2)
For r1=3, c1=4, c2=4, cycles = 129
For r1=4, c1=4, c2=5, cycles = 225 |
| SP Matrix Transpose |
Transposes a single precision floating point matrix with dimensions rows
x cols. |
2 * rows * cols + 7
For rows=10 and cols=20, cycles=407
For rows=15 and cols=20, cycles=607 |
|
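The matrix formulas round odd dimensions up to the next even value (r1' and c2'), so an odd-sized matrix costs as much as the next even size. A minimal sketch of that rounding follows, with hypothetical function names that are not part of any TI library.

```python
# Hypothetical helpers that evaluate the matrix cycle formulas from the Vector table.

def round_up_to_even(value: int) -> int:
    """Return value if even, otherwise value + 1 (the table's v + (v & 1))."""
    return value + (value & 1)

def sp_mat_mul_cycles(r1: int, c1: int, c2: int) -> int:
    """SP Matrix Multiplication: 0.5*r1'*c1*c2' + 6*c2'*r1' + 4*r1' + 22."""
    r1p = round_up_to_even(r1)
    c2p = round_up_to_even(c2)
    # r1p * c2p is always even, so the halving below is exact.
    return (r1p * c1 * c2p) // 2 + 6 * c2p * r1p + 4 * r1p + 22

def sp_mat_mul_cplx_cycles(r1: int, c1: int, c2: int) -> int:
    """SP Complex Matrix Multiplication: 2*r1*c1*c2' + 33, c2' = 2*ceil(c2/2)."""
    return 2 * r1 * c1 * round_up_to_even(c2) + 33

def sp_mat_trans_cycles(rows: int, cols: int) -> int:
    """SP Matrix Transpose: 2 * rows * cols + 7."""
    return 2 * rows * cols + 7

if __name__ == "__main__":
    print(sp_mat_mul_cycles(12, 14, 18))    # 2878
    print(sp_mat_mul_cplx_cycles(3, 4, 4))  # 129
    print(sp_mat_mul_cplx_cycles(4, 4, 5))  # 225
    print(sp_mat_trans_cycles(15, 20))      # 607
```

Note that 2*ceil(c2/2) and c2 + (c2&1) are the same rounding written two ways, which is why one helper serves both formulas.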
| Search
|
| Benchmark |
Description |
Formula |
| SP Minimum Value of a Vector |
Finds the element with minimum value in a vector of size nx single precision
floating point elements. |
3*ceil(nx/6) + 35
For nx=60, cycles=65
For nx=34, cycles=53 |
| SP Maximum Value of a Vector |
Finds the element with maximum value in a vector of size nx single precision
floating point elements. |
3*ceil(nx/6) + 35
For nx=60, cycles=65
For nx=34, cycles=53 |
| SP Index of the Maximum Element of a Vector |
Finds the index of the element with the maximum value in a vector of size
nx single precision floating point elements. |
2*nx/3 + 13
For nx=60, cycles=53
For nx=30, cycles=33 |
| SP Minimum Energy Error Search |
Performs a dot product on 256 pairs of 9-element single precision vectors
and searches for the pair of vectors which produces the maximum dot product
result. This is a large part of the VSELP vocoder codebook search. |
1188 cycles |
|
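To make the last entry concrete, the search it describes (a dot product per vector pair, then selection of the pair with the largest result) can be written as a plain reference model. This is a sketch in pure Python, not the optimized routine: the function name, the list-of-pairs data layout, and the first-wins tie-breaking rule are assumptions.

```python
# Reference model of a maximum dot product search over vector pairs.
# Data layout and tie-breaking are assumptions made for illustration.

def max_dot_product_search(pairs):
    """Return (index, value) of the pair with the largest dot product.

    pairs is a sequence of (a, b) tuples holding equal-length vectors.
    On ties the earliest pair wins. An empty input returns (-1, -inf).
    """
    best_index = -1
    best_value = float("-inf")
    for index, (a, b) in enumerate(pairs):
        if len(a) != len(b):
            raise ValueError("vectors in a pair must have equal length")
        dot = sum(x * y for x, y in zip(a, b))
        if dot > best_value:
            best_index, best_value = index, dot
    return best_index, best_value

if __name__ == "__main__":
    # 256 pairs of 9-element vectors, matching the sizes quoted in the table.
    pairs = [([float(i)] * 9, [1.0] * 9) for i in range(256)]
    print(max_dot_product_search(pairs))  # (255, 2295.0)
```

A model like this is mainly useful for checking the output of an optimized routine against known inputs.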
| Miscellaneous
|
| Benchmark |
Description |
Formula |
| SP Block Move |
Moves a block of nx single precision floating point values to another
location in memory. |
2*ceil(nx/2)+7
For nx=64, cycles=71
For nx=25, cycles=33 |
| Endian swap of a block of 16-bit values |
Performs an endian swap of nx 16-bit values in a vector. |
0.625 * nx + 12
For nx=64, cycles=52
For nx=32, cycles=32 |
| Endian swap of a block of 32-bit values |
Performs an endian swap of nx 32-bit values in a vector. |
1.5 * nx + 14
For nx=64, cycles=110
For nx=32, cycles=62 |
| Endian swap of a block of 64-bit values |
Performs an endian swap of nx 64-bit values in a vector. |
3 * nx + 14
For nx=64, cycles=206
For nx=32, cycles=110 |
| Float to Q15 conversion |
Converts nx single precision floating point values in a vector to Q.15
format. Results are rounded towards negative infinity. |
nx + 17
For nx = 512, cycles = 529 |
| Q15 to Float Conversion |
Converts nx Q.15 values in a vector to single precision floating point
values. |
3*floor((nx-1)/4) + 20
For nx = 512, cycles = 401 |
|
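The two Q.15 conversions can be illustrated with a small reference model. Q.15 stores a value as a 16-bit signed integer scaled by 2^15, and the table states that the float to Q.15 direction rounds toward negative infinity. In the sketch below, the function names are hypothetical and the saturation to the 16-bit range is an assumption, since the table does not say how out-of-range inputs are handled.

```python
# Reference model of float <-> Q.15 conversion.
# Saturation behaviour is an assumption; the source only specifies the rounding.
import math

Q15_SCALE = 32768               # 2**15
Q15_MIN, Q15_MAX = -32768, 32767

def float_to_q15(values):
    """Convert floats to Q.15 integers, rounding toward negative infinity."""
    result = []
    for x in values:
        q = math.floor(x * Q15_SCALE)                 # round toward -infinity
        result.append(max(Q15_MIN, min(Q15_MAX, q)))  # assumed saturation
    return result

def q15_to_float(values):
    """Convert Q.15 integers back to floats in the range [-1.0, 1.0)."""
    return [q / Q15_SCALE for q in values]

if __name__ == "__main__":
    print(float_to_q15([0.5, -0.5, -0.00001, 1.0]))  # [16384, -16384, -1, 32767]
    print(q15_to_float([16384, -32768]))             # [0.5, -1.0]
```

The -0.00001 input shows the rounding direction: the scaled value is about -0.33, which rounds down to -1 rather than toward zero.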