3. Prerequisites
• Cypress FM4-176L-S6E2CC-ETH hardware
• http://www.cypress.com/documentation/development-kitsboards/sk-fm4-176l-s6e2cc-fm4-family-quick-start-guide
• Audio jack cables (2)
• Install LabVIEW 2017 from http://www.ni.com/nl-be/shop/labview/download.html (not the NXG version!)
• Install Keil µVision from https://www.keil.com/demo/eval/arm.htm (the 32 KB code-size-limited edition is sufficient for the labs)
• Cypress PDL library (version 2.1.0, not 3.0) from: http://www.cypress.com/documentation/other-resources/peripheral-driver-library-pdl-release-notes-archive
Vincent Claes
4. Contents
• Cypress ARM FM4: 20 min
• Keil µVision: 10 min
• LabVIEW oscilloscope and function generator: 2 hours
• Programming in Keil µVision (IDE setup): 1 hour
• Generate a sine wave on Cypress ARM FM4: 1 hour
• Designing FIR filters for Cypress ARM FM4: 4 hours
• Case study, removing an unwanted sine wave from an audio stream using a filter: 4 hours
11. S6E2CC
• 32-bit general purpose series based on ARM Cortex-M4
• 2 MB flash memory
• 256 KB SRAM
• DSP and floating-point unit (FPU) functions
• I²S port for communication with the audio codec
• ADC, DAC,…
14. Audio codec
• WM8731 (U3) codec connected to I²S and I²C
• Low-power stereo codec with integrated headphone driver
• ADC and DAC sample rates can be programmed independently from a single clock source
• Signals ADCLRC and DACLRC
• CN11 => microphone jack
• CN5 => headphone jack
• CN6 => line-in jack
35. Discrete-Time Linear Time-Invariant (LTI) System
[Figure: input signal x(n) → discrete-time LTI system → output signal y(n); both signals are time-varying]
36. Classification of signals
Continuous-time vs. discrete-time
Periodic vs. aperiodic
Deterministic vs. random
Energy vs. power
37. Continuous-Time vs. Discrete-Time
[Figure: the unit step function u(t) plotted against time t]
A continuous-time signal x(t) is defined for all values of time t.
x(t) need not be a continuous function of time; the unit step function u(t) is an example of a continuous-time signal containing a discontinuity.
A discrete-time signal x(n) = x(nT) is defined only at discrete values of time t = nT.
38. Converting from Continuous to Discrete Time
A discrete-time signal may be formed by sampling a continuous-time signal
[Figure: x(t) plotted against t and sampled at instants kT, (k+1)T, (k+2)T, where T is the sampling period]
[Figure: block diagram x(t) → sampler → x(nT) → quantiser → x(n)]
39. Periodic vs. Aperiodic Signals
A continuous-time signal x(t) is periodic if and only if
$x(t) = x(t + T)$ for all t.
The smallest positive value of T for which this is the case is the period of the signal.
A discrete-time signal x(n) is periodic if and only if
$x(n) = x(n + N)$ for all n.
Any signal that is not periodic is aperiodic.
40. Deterministic vs. Random Signals
Deterministic signals are described as algebraic functions of time.
Random (stochastic) signals are described in terms of their statistical
properties.
41. Impulse (Delta or Dirac) Function
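For continuous-time systems
$\delta(t) = 0,\ t \neq 0; \quad \int_{-\infty}^{\infty} \delta(t)\,dt = 1$
For discrete-time systems
$\delta(n) = \begin{cases} 1, & n = 0 \\ 0, & n \neq 0 \end{cases}; \quad \sum_{n=-\infty}^{\infty} \delta(n) = 1$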
42. Unit Step Function
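For continuous-time systems
$u(t) = \begin{cases} 1, & t \geq 0 \\ 0, & t < 0 \end{cases}; \quad u(t) = \int_{-\infty}^{t} \delta(\tau)\,d\tau$
For discrete-time systems
$u(n) = \begin{cases} 1, & n \geq 0 \\ 0, & n < 0 \end{cases}; \quad u(n) = \sum_{k=-\infty}^{n} \delta(k)$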
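The discrete-time relationship between the unit step and the delta sequence can be checked numerically. The following is a minimal sketch in C; the function names and the finite lower limit k_min (standing in for minus infinity) are illustrative assumptions, not part of the course material.

```c
#include <assert.h>

/* Delta sequence: 1 at n = 0, 0 elsewhere. */
int delta_seq(int n)
{
    return (n == 0) ? 1 : 0;
}

/* Discrete-time unit step: 1 for n >= 0, 0 for n < 0. */
int unit_step(int n)
{
    return (n >= 0) ? 1 : 0;
}

/* Running sum of the delta sequence from k = k_min up to k = n.
 * k_min is an assumed stand-in for minus infinity and must be negative. */
int summed_delta(int n, int k_min)
{
    assert(k_min < 0);
    int sum = 0;
    for (int k = k_min; k <= n; k++) {
        sum += delta_seq(k);
    }
    return sum;
}
```

For any n at or above k_min, summed_delta(n, k_min) should return the same value as unit_step(n).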
43. Sinusoid Function
For continuous-time systems
$x(t) = \sin(\omega t)$
For discrete-time systems
$x(n) = \sin(\omega n T)$
Sinusoidal signals pass through any linear time-invariant system with no change to their shape; only their amplitude and phase can change.
44. Complex Exponential Function
For continuous-time systems
$x(t) = A e^{j\omega t}$
For discrete-time systems
$x(n) = A e^{j\omega n T}$
Rotating vector (phasor) in the complex plane.
Closely related to sinusoidal signals.
Projections onto the real and imaginary axes of the complex plane are cosine and sine respectively.
[Figure: rotating vector of length A at angle ωt in the complex plane; its projection onto the Re axis is A cos(ωt) and its projection onto the Im axis is A sin(ωt)]
46. Sampling and Reconstruction
Can we recreate from discrete-time samples the continuous-time
signal from which they were taken?
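47. Digital to Analogue Conversion Using a Zero-Order Hold
A zero-order hold (ZOH) has the following impulse response, where T is the sampling period:
[Figure: h(t) is a rectangular pulse of height 1 extending from t = 0 to t = T]
$y(n) = y(nT)$
$y(t) = y(nT) * h(t)$
[Figure: digital to analogue conversion using a zero-order hold; the samples y(nT) are convolved with h(t) to give the continuous-time output y(t)]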
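Convolving the sample sequence with a rectangular pulse of width T amounts to holding each sample value constant for one sampling period. A rough sketch of that behaviour in C follows; continuous time is approximated by a finer time grid, and the function name and the points_per_period parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Zero-order hold: each sample y[n] is held constant for one sampling
 * period T, represented here by points_per_period points of a finer
 * time grid. out must have room for n_samples * points_per_period values. */
void zoh_reconstruct(const float *y, size_t n_samples,
                     size_t points_per_period, float *out)
{
    assert(points_per_period > 0);
    for (size_t n = 0; n < n_samples; n++) {
        for (size_t i = 0; i < points_per_period; i++) {
            out[n * points_per_period + i] = y[n];
        }
    }
}
```

The output is the familiar staircase waveform: it only changes value at multiples of T.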
49. Discrete Time Convolution
An arbitrary input signal x(n) may be decomposed into a sum of (delayed) weighted impulses.
The corresponding output is formed by summing (delayed) weighted impulse responses.
[Figure: the delta sequence δ(n) applied to an LTI system produces the impulse response y(n) = h(n); δ(n) and h(n) are plotted against n]
50. Convolution
Consider convolution from the point of view of the input signal
Each weighted impulse at the system input results in a weighted impulse response
at the system output.
Each input sample contributes to a number of output sample values.
$x(n) = x(0)\,\delta(n) + x(1)\,\delta(n-1) + x(2)\,\delta(n-2) + \cdots$
51. Convolution
[Figure: an arbitrary input signal (sequence) x(n) is applied to an LTI system with impulse response h(n); the responses to x(0), x(1), x(2) and x(3) are plotted separately against n, and the output signal y(n) comprises the sum of these responses]
52. Convolution
A more practical and useful approach is to consider convolution from the point of view of the output signal.
Each output sample value can be computed from a number of input sample values:
$y(n) = h(0)\,x(n) + h(1)\,x(n-1) + h(2)\,x(n-2) + \cdots$
If the impulse response is finite, that is $h(n) = 0$ for $n < 0$ and $n > N$, then
$y(n) = \sum_{k=0}^{N} h(k)\,x(n-k)$
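The output-side view of convolution maps directly onto two nested loops. Below is a minimal sketch in C, assuming finite-length x and h and treating samples outside x as zero; the function name and argument layout are illustrative choices.

```c
#include <assert.h>
#include <stddef.h>

/* Direct convolution: y(n) = sum over k of h(k) * x(n - k).
 * x has x_len samples, h has h_len coefficients, and y must have room
 * for x_len + h_len - 1 output samples. Samples outside x count as zero. */
void convolve(const float *x, size_t x_len,
              const float *h, size_t h_len, float *y)
{
    assert(x_len > 0 && h_len > 0);
    size_t y_len = x_len + h_len - 1;
    for (size_t n = 0; n < y_len; n++) {
        float acc = 0.0f;
        for (size_t k = 0; k < h_len; k++) {
            if (n >= k && (n - k) < x_len) {
                acc += h[k] * x[n - k];   /* one multiply-accumulate */
            }
        }
        y[n] = acc;
    }
}
```

For example, convolving x = {1, 2, 3} with h = {1, 1} gives y = {1, 3, 5, 3}, and convolving a single unit sample with h returns h itself.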
53. Convolution
Convolution is a fundamental and important building block in digital signal processing.
Its implementation is a sum of products.
A single-cycle multiply-accumulate (MAC) instruction and a Harvard architecture are well suited to its efficient computation.
54. Properties of Convolution
Convolution involving the delta sequence is particularly straightforward
$x(n) * \delta(n) = x(n)$
$x(n) * \delta(n - s) = x(n - s)$
$x(n) * K\delta(n) = K x(n)$
Commutative property
$a(n) * b(n) = b(n) * a(n)$
Associative property
$(a(n) * b(n)) * c(n) = a(n) * (b(n) * c(n))$
Distributive property
$a(n) * b(n) + a(n) * c(n) = a(n) * (b(n) + c(n))$
55. Correlation
Correlation is concerned with determining the degree of similarity between two signals.
$R_{xy}(p) = \sum_{m} x(m)\,y(m + p)$
Computationally it bears a resemblance to convolution.
57. Correlation vs. Convolution
The similarities between the computations involved in convolution and correlation are coincidental.
Convolution describes the relationships between input signal, output signal and impulse response in an LTI system.
Correlation is a method of determining the degree of similarity between two signals.
58. Digital Signal Processing System
[Figure: analogue input signal → ADC → digital signal processor (microcontroller, ARM Cortex-M4) → DAC → analogue output signal; the ADC and DAC form the CODEC on the audio card]
59. Aliasing – antialiasing filters
[Figure: 1 kHz input signal → ADC (sampling rate 8 kHz) → sampled signal → DAC → output signal]
60. Aliasing – antialiasing filters
[Figure: 7 kHz input signal → ADC (sampling rate 8 kHz) → sampled signal → DAC → output signal]
61. Aliasing – antialiasing filters
[Figure: 1 kHz input signal → LPF (cut-off frequency 4 kHz) → low pass filtered signal → ADC (sampling rate 8 kHz) → sampled signal → DAC → output signal]
62. Aliasing – antialiasing filters
[Figure: 7 kHz input signal → LPF (cut-off frequency 4 kHz) → low pass filtered signal → ADC (sampling rate 8 kHz) → sampled signal → DAC → output signal]
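The frequency at which an unfiltered input appears after sampling can be worked out by folding it about half the sampling rate. The following C sketch illustrates the idea for a real sinusoid with no antialiasing filter in place; the function name is an illustrative assumption.

```c
#include <assert.h>

/* Apparent (aliased) frequency in Hz of a real sinusoid of frequency
 * f_hz after sampling at fs_hz: fold f_hz into the range 0 .. fs_hz/2. */
unsigned alias_frequency(unsigned f_hz, unsigned fs_hz)
{
    assert(fs_hz > 0);
    /* Sampling cannot distinguish f from f plus any multiple of fs. */
    unsigned f = f_hz % fs_hz;
    /* Fold frequencies above fs/2 back into the lower half. */
    return (f > fs_hz / 2) ? fs_hz - f : f;
}
```

With a sampling rate of 8 kHz, a 1 kHz input stays at 1 kHz, whereas a 7 kHz input also appears at 1 kHz, which matches the behaviour shown on the slides without the low pass filter.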
72. Practice
• Generate a sine wave of 500 Hz
• Generate a sine wave of 1000 Hz
• Generate a sine wave of 2000 Hz
• Generate a sine wave of 3000 Hz
• You should be able to achieve these simply by changing the initialised contents of the array sine_table (and changing the value of the constant LOOP_SIZE accordingly). Do not change any other program statements. Record the combinations of LOOP_SIZE and sine_table with which you achieve these results.
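One way to reason about suitable table lengths is that the table must hold a whole number of cycles, so that it repeats seamlessly. The sketch below computes the shortest such length; it assumes a sampling rate of 8 kHz (the rate quoted elsewhere in these slides, but not stated for this exercise), and the helper names are hypothetical.

```c
#include <assert.h>

/* Greatest common divisor (Euclid's algorithm). */
unsigned gcd_u(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned r = a % b;
        a = b;
        b = r;
    }
    return a;
}

/* Shortest look-up table length (in samples) that holds a whole number
 * of cycles of a sinusoid of frequency f_hz at sampling rate fs_hz. */
unsigned table_length(unsigned f_hz, unsigned fs_hz)
{
    assert(f_hz > 0 && fs_hz > 0);
    return fs_hz / gcd_u(f_hz, fs_hz);
}

/* Number of whole cycles contained in a table of that length. */
unsigned table_cycles(unsigned f_hz, unsigned fs_hz)
{
    assert(f_hz > 0 && fs_hz > 0);
    return f_hz / gcd_u(f_hz, fs_hz);
}
```

At 8 kHz, 500 Hz needs 16 samples holding one cycle, while 3000 Hz needs 8 samples holding three cycles. Entry k of such a table would then be proportional to sin(2π · cycles · k / length).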
73. Visualisation of memory contents
• Run the program and then halt it by clicking on the Stop toolbar button. Type the variable name buffer as the Address in the debugger's Memory 1 window. Set the displayed data type to Decimal and Float as shown in the figure below.
74. Visualisation of memory contents
• Type the following command at the prompt in the debugger's
Command window to save the contents of array buffer to a file in
your project folder.
• SAVE <filename> <start address>, <end address>
• for example, SAVE sinusoid.dat 0x20000848, 0x200009D8
77. Audio Sine Generation on ARM FM4 [AUDIO_2]
• Change variable frequency value to
• 1500
• 2573
• 7000
• 3500
• 4500
• Watch the results on your oscilloscope!
78. Audio Sine Generation on ARM FM4 [AUDIO_2]
• In sine_lut_intr.c, change:
• sine_table[LOOPLENGTH] = {10000, 10000, 10000, 10000, -10000, -10000, -10000, -10000};
• Square wave?
81. Moving Average Filter on ARM FM4
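[Figure: input x(n) → moving average filter → output y(n)]
Five point moving average filter:
$y(n) = \frac{x(n) + x(n-1) + x(n-2) + x(n-3) + x(n-4)}{5}$
$y(n) = 0.2\,x(n) + 0.2\,x(n-1) + 0.2\,x(n-2) + 0.2\,x(n-3) + 0.2\,x(n-4)$
In general, for an N point moving average filter:
$y(n) = \frac{1}{N}\sum_{i=0}^{N-1} x(n-i)$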
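A sample-by-sample implementation of the five point moving average might look like the sketch below. It is an illustration only, not the code used in the lab programs; the type and function names are assumptions, and the stored samples are expected to start at zero.

```c
#include <assert.h>
#include <stddef.h>

#define MA_POINTS 5   /* five point moving average, as on the slide */

/* State for a streaming moving average filter: the MA_POINTS most
 * recent input samples, with x[0] the newest. Initialise to all zeros. */
typedef struct {
    float x[MA_POINTS];
} moving_average_t;

/* Push one input sample and return y(n), the mean of the last
 * MA_POINTS input samples. */
float moving_average_step(moving_average_t *ma, float input)
{
    assert(ma != NULL);
    /* Shift the stored samples by one position; the oldest drops out. */
    for (int i = MA_POINTS - 1; i > 0; i--) {
        ma->x[i] = ma->x[i - 1];
    }
    ma->x[0] = input;

    float sum = 0.0f;
    for (int i = 0; i < MA_POINTS; i++) {
        sum += ma->x[i];
    }
    return sum / (float)MA_POINTS;
}
```

Feeding a constant input of 5 into a zero-initialised filter gives outputs 1, 2, 3, 4, 5, 5, ..., showing how the output ramps up over five samples before settling.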
82. Moving Average Filter on ARM FM4
[Figure: sample value (0 to 7) plotted against sample number (0 to 20); input x(n) is marked with x and output y(n) with ●]
90. Finite Impulse Response (FIR) Filter
N point moving average filter
$y(n) = \frac{1}{N}\sum_{i=0}^{N-1} x(n-i)$
Compare this with the conventional representation of an N point FIR filter
$y(n) = \sum_{i=0}^{N-1} h(i)\,x(n-i)$
The values h(i) are referred to as the coefficients of the filter and also as the impulse response of the FIR filter.
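The N point FIR sum can be sketched in C with an explicit delay line, as below. This is a generic illustration rather than the lab's fir_prbs_*.c code or the CMSIS arm_fir_f32() routine; the fir_t type, the function names and the FIR_MAX_TAPS limit are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define FIR_MAX_TAPS 64u   /* illustrative upper limit, not from the slides */

typedef struct {
    const float *h;              /* coefficients h(0) .. h(N-1) */
    size_t n_taps;               /* N */
    float delay[FIR_MAX_TAPS];   /* delay[i] holds x(n - i) */
} fir_t;

/* Attach the coefficients and clear the delay line. */
void fir_init(fir_t *f, const float *h, size_t n_taps)
{
    assert(f != NULL && h != NULL);
    assert(n_taps > 0 && n_taps <= FIR_MAX_TAPS);
    f->h = h;
    f->n_taps = n_taps;
    for (size_t i = 0; i < FIR_MAX_TAPS; i++) {
        f->delay[i] = 0.0f;
    }
}

/* Process one input sample: y(n) = sum over i of h(i) * x(n - i). */
float fir_step(fir_t *f, float input)
{
    assert(f != NULL);
    /* Shift the delay line by one position and insert the new sample. */
    for (size_t i = f->n_taps - 1; i > 0; i--) {
        f->delay[i] = f->delay[i - 1];
    }
    f->delay[0] = input;

    float acc = 0.0f;
    for (size_t i = 0; i < f->n_taps; i++) {
        acc += f->h[i] * f->delay[i];   /* sum of products */
    }
    return acc;
}
```

Applying a unit impulse (1 followed by zeros) to a filter with coefficients {0.5, 0.25, 0.125} returns 0.5, 0.25, 0.125 and then 0, which is why the coefficients are also called the impulse response.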
91. Finite Impulse Response (FIR) Coefficients
The coefficients h(i) of an N point moving average filter each have a value of 1/N.
[Figure: coefficient value (0 to 0.25) plotted against coefficient number (−1 to 6); graphical representation of the coefficients of a 5 point moving average filter, also known as its impulse response]
109. ARM FM4: FIR filter implementation [AUDIO_3]
Efficiency update using CMSIS DSP library arm_fir_f32();
111. ARM FM4: FIR filter implementation [AUDIO_3]
Efficiency update using CMSIS DSP library arm_fir_f32();
Block Processing using DMA transfers
116. ARM FM4: FIR filter implementation [AUDIO_3]
Efficiency update using CMSIS DSP library arm_fir_f32();
Block Processing using DMA transfers
Program                 GPIO measurement   Measurement divided by BUFSIZE
fir_prbs_intr.c         8.25 µs            N/A
fir_prbs_dma.c          1.35 ms            10.55 µs
fir_prbs_CMSIS_intr.c   5.96 µs            N/A
fir_prbs_CMSIS_dma.c    183 µs             1.43 µs
- Maximum time available at a sample rate of 8 kHz is 125 µs for the interrupt-based programs
- Maximum time available between consecutive calls to function process_buffer() in the DMA applications is BUFSIZE/fs = 16 ms
Play with the filter coefficients by including a different header file; the time to compute each output sample depends on the number of filter coefficients used.
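The time budgets quoted above follow from simple arithmetic, which can be sketched as below. The function names are hypothetical, and BUFSIZE = 128 is an inference from the figures on the slide (16 ms at 8 kHz), not a value stated in the source.

```c
#include <assert.h>

/* Time available per sample at sampling rate fs_hz, in microseconds. */
double sample_period_us(double fs_hz)
{
    assert(fs_hz > 0.0);
    return 1.0e6 / fs_hz;
}

/* Time available per block of buf_size samples, in microseconds. */
double block_period_us(unsigned buf_size, double fs_hz)
{
    return (double)buf_size * sample_period_us(fs_hz);
}

/* Percentage of the available time taken up by the measured time. */
double load_percent(double measured_us, double available_us)
{
    assert(available_us > 0.0);
    return 100.0 * measured_us / available_us;
}
```

With these figures, fir_prbs_intr.c uses roughly 6.6 % of its 125 µs budget, while fir_prbs_CMSIS_dma.c uses roughly 1.1 % of its 16 ms budget.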
117. DMA-based I/O on FM4 S6E2CCA
• DMA: Direct Memory Access
• Transfers data at high speed without using the CPU
• Improves system performance
• The Cypress device has two peripherals that provide DMA access for transferring data
• DSTC: Descriptor System Data Transfer Controller
• Descriptor-based DMA: instead of saving the characteristics of each transfer (size of transfer, source address, destination address,…) in registers, all the parameters are packed into 32-bit descriptors and stored in RAM. This reduces the size of the peripheral and allows more channels (256 DSTC channels vs 8 DMAC channels).
• DMAC: Direct Memory Access Controller
118. DMA-based I/O on FM4 S6E2CCA
• Data is moved to and from the I²S peripheral by DMA, using the DSTC
• audio_init() configures one DSTC channel to make DMA transfers between the output buffer arrays (dma_tx_buffer_ping and dma_tx_buffer_pong) and the I²S peripheral. It generates an interrupt when a transfer of DMA_BUFFER_SIZE 32-bit samples has completed
• Another DSTC channel is configured to make DMA transfers between the I²S peripheral and the input buffers in memory (dma_rx_buffer_ping and dma_rx_buffer_pong). It generates an interrupt when a transfer of DMA_BUFFER_SIZE 32-bit samples has completed
• The same interrupt service routine (ISR) is used for both DMA processes
119. DMA-based I/O on FM4 S6E2CCA
• Actions in the interrupt service routine are:
• Assign the values PING or PONG to pointers rx_proc_buffer and tx_proc_buffer.
• Switch between buffers dma_tx_buffer_ping, dma_tx_buffer_pong, dma_rx_buffer_ping and dma_rx_buffer_pong
• Set flags rx_buffer_full and tx_buffer_empty, which are used in proc_buffer()
• If rx_proc_buffer is equal to PING, DSTC1 has filled buffer dma_rx_buffer_ping, and this data is available to process.
• If tx_proc_buffer is equal to PING, the DSTC0 transfer has written the contents of buffer dma_tx_buffer_ping to the I²S peripheral and this buffer is available to be filled with new data.
120. DMA-based I/O on FM4 S6E2CCA
• Function main() waits until both the rx_buffer_full and tx_buffer_empty flags are set, that is, until both DMA transfers have completed, before calling function proc_buffer()
• In loop_dma.c, function proc_buffer() simply copies the contents of the most recently filled input buffer (dma_rx_buffer_ping or dma_rx_buffer_pong) to the most recently emptied output buffer (dma_tx_buffer_ping or dma_tx_buffer_pong), according to the values of pointers rx_proc_buffer and tx_proc_buffer. In general, frame-based processing will be carried out in function proc_buffer(), using the contents of the most recently filled input buffer as input and writing output sample values to the most recently emptied output buffer.
121. DMA-based I/O on FM4 S6E2CCA
• DMA transfers complete, and function proc_buffer() is called, every DMA_BUFFER_SIZE sampling instants. Any processing must therefore be completed within DMA_BUFFER_SIZE/fs seconds (strictly speaking, before the next DMA transfer completes)
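The ping-pong scheme described on the preceding slides can be mimicked on a PC with plain arrays and flags. The sketch below is a simulation under stated assumptions: it does not use the real DSTC or I²S driver API, the function dma_complete_isr(), the channel identifiers and main_loop_poll() are hypothetical, and the two-dimensional arrays stand in for the separate ping and pong buffers named in the slides.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define DMA_BUFFER_SIZE 4u                 /* small value for illustration only */
enum { PING = 0, PONG = 1 };
enum { TX_CHANNEL = 0, RX_CHANNEL = 1 };   /* hypothetical channel identifiers */

/* Input and output buffers, indexed by PING or PONG. */
static uint32_t dma_rx_buffer[2][DMA_BUFFER_SIZE];
static uint32_t dma_tx_buffer[2][DMA_BUFFER_SIZE];

static volatile int rx_proc_buffer = PING, tx_proc_buffer = PING;
static volatile int rx_buffer_full = 0, tx_buffer_empty = 0;

/* Stand-in for the shared interrupt service routine: buffer is the one
 * the given channel has just finished filling (RX) or sending (TX). */
void dma_complete_isr(int channel, int buffer)
{
    assert(buffer == PING || buffer == PONG);
    if (channel == RX_CHANNEL) {
        rx_proc_buffer = buffer;
        rx_buffer_full = 1;
    } else {
        tx_proc_buffer = buffer;
        tx_buffer_empty = 1;
    }
}

/* Copy the most recently filled input buffer to the most recently
 * emptied output buffer, then clear both flags. */
void proc_buffer(void)
{
    for (size_t i = 0; i < DMA_BUFFER_SIZE; i++) {
        dma_tx_buffer[tx_proc_buffer][i] = dma_rx_buffer[rx_proc_buffer][i];
    }
    rx_buffer_full = 0;
    tx_buffer_empty = 0;
}

/* One pass of the main loop: returns 1 if a block was processed. */
int main_loop_poll(void)
{
    if (rx_buffer_full && tx_buffer_empty) {
        proc_buffer();
        return 1;
    }
    return 0;
}
```

In this model nothing is processed until both completion events have occurred, which mirrors the requirement that main() waits for both flags before calling proc_buffer().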
125. My contact details
• Feel free to contact me:
• vincent[dot]claes_at_pxl[dot]be
• https://www.linkedin.com/in/vincentclaes/
• My passion: Teaching students, chips and machines new tricks.
• FPGA, Machine Learning, LabVIEW and startups
• Special thanks to ARM Ltd. and Cypress Semiconductors
Editor's notes
As represented here in the top diagram, the value of a discrete-time sample may be anywhere in the range of x. A widespread use of discrete-time signals is in digital electronic systems, and here signals must be quantised so as to belong to a finite set of possible values, e.g. integer values in the range -32768 to 32767.
Analogue to digital converters typically combine sampling and quantising functions.
The process of quantisation introduces (what may be regarded as) random noise to a signal. This is an important topic in signal processing. However, in these materials we will not consider quantisation noise explicitly. For our purposes we will consider it to have been reduced to negligible levels by the use of a suitably high-resolution analogue to digital converter.
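Quantisation to 16-bit integer values can be sketched as follows. The input range of -1.0 to +1.0, the rounding rule and the function name are assumptions chosen for illustration; a real ADC performs this step in hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Quantise a sample in the range -1.0 .. +1.0 to a 16-bit integer in the
 * range -32768 .. 32767, rounding to the nearest level and saturating. */
int16_t quantise_q15(double x)
{
    assert(x == x);   /* reject NaN, which is not a valid sample value */
    double scaled = x * 32768.0;
    /* Round half away from zero without needing the maths library. */
    double rounded = (scaled >= 0.0) ? scaled + 0.5 : scaled - 0.5;
    if (rounded >= 32767.0) {
        return INT16_MAX;
    }
    if (rounded <= -32768.0) {
        return INT16_MIN;
    }
    return (int16_t)rounded;   /* conversion truncates toward zero */
}
```

The difference between x and the returned value divided by 32768 is the quantisation error, which is at most half a quantisation step for inputs inside the range.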
We can predict (determine) the value of a deterministic signal at a particular point in time (using an algebraic equation).
We cannot predict the value of a stochastic signal at a particular point in time.
That’s it for signal classifications.
The following, starting with the delta function, are fundamentally important examples of signals.
Notice that the value of the continuous-time impulse at time t = 0 is undefined. The discrete-time delta function, or Kronecker delta sequence, is altogether easier to get your head around.
The unit step function is found by integrating (or summing) the unit impulse function.
sin(ωt) is one example of a sinusoid. More generally you might want to consider sin(ωt+φ).
Anticlockwise rotation of phasor is associated with positive frequency
We will revisit complex exponentials later, in more detail.
Contra-rotating phasors can combine to form real-valued sinusoidal signals – in other words, real-valued sinusoids may be decomposed into complex exponentials.
These relationships are described by Euler’s formula.
Euler’s formula describes the relationship between complex exponentials and real-valued sinusoids.
The analogue to digital converter mentioned earlier is one part of a digital signal processing system (typically in engineering it is implied that we wish to apply DSP to continuous-time, physical, analogue quantities)
After processing, discrete-time signals are converted back into continuous-time signals using a digital to analogue converter (DAC).
In an electrical engineering sense, the role of the DAC is to convert sample values represented with digital electronic hardware into a continuous analogue voltage waveform.
In signal processing terms, the role of the DAC is to ‘join the dots’ or interpolate between discrete-time sample values.
Can we recreate from discrete-time samples the continuous-time signal from which they were taken?
Convolution of the ZOH impulse response with the sampled signal y(n) results in a continuous-time signal y(t).
Overall system output y(n) is found by summing the responses to each individual input sample.
Conceptually, FIR filters implement convolution (of input signal and filter coefficients). However, the convolution sum of products is computationally expensive for large numbers of filter coefficients. A more computationally efficient method of implementing large FIR filters is to transform signals into the frequency domain and carry out the filtering operation there before transforming the result back into the time domain. In other words, FIR filters may not necessarily be implemented using a sum of products computation.
The convolution of ‘something’ with a delta function is ‘something’.
The convolution of ‘something’ with a delayed delta function is a delayed ‘something’.
The convolution of ‘something’ with a weighted delta function is a weighted ‘something’.
Earlier it was mentioned that, in practice, FIR filters might be implemented in the frequency domain, rather than by convolution (sum of products) in the time domain.
However, the DFT is computed by correlating (sum of products) a sequence of samples with a (finite) number of sequences representing individual frequency components.
Did I mention that sum of products is a fundamental operation in DSP?
A basic digital signal processing system implemented using an ARM Cortex-M4 based EVM and an audio card is the focus of the DSP LiB experiments/materials.
The diagram on this slide is more generally applicable than that.
In order to apply DSP to continuous-time analogue signals, the input signal must be converted into digital form (sampling and quantisation) and the digital output signal must be converted back to continuous-time analogue form – it must be reconstructed from digital sample values.
Here’s a slightly different graphical representation of sampling and reconstruction, based on the foregoing.
We will consider the action of the ADC to be sampling.
If we sample a 1 kHz sinusoid at a sampling rate of 8 kHz we might see a sampled signal as shown here.
Those sample values might be reconstructed by the DAC to form the continuous signal shown.
The continuous signal input to the system appears at the output of the system and we are happy!
Now consider the case of a 7 kHz signal sampled at a sampling rate of 8 kHz.
A few slides ago we saw that this might result in exactly the same sequence of sample values as produced by sampling a 1 kHz sinusoid.
Those sample values would be reconstructed by the DAC to form a 1 kHz sinusoid at the output of the system.
We should not be happy about this!
This is a specific example of the general rule that we should not allow signal components at frequencies greater than or equal to half the sampling rate into our system.
A way of achieving this is to use an antialiasing filter just before the ADC.
The cutoff frequency of the low pass antialiasing filter should be no higher than half the sampling frequency.
In practice, antialiasing filters will not be ideal low pass filters and a small amount of aliasing is to be expected.
The (low pass) characteristics of the antialiasing filter will be similar to those of the (corresponding) reconstruction filter (DAC).
In the example illustrated here, a 1 kHz sinusoid passes through the system (and we are happy!).
If we apply a 7 kHz input signal to the antialiasing filter, it will be blocked (attenuated) and ultimately the system output will be zero.
Whether or not we are happy about this, the fact is that we have a well-functioning digital signal processing system here, and we are seeing that the bandwidth of such a system is limited to half of its sampling frequency.
We have stated previously that the impulse response of an FIR filter is equal to its coefficients.
Let’s look at this in block diagram form, by applying an impulse as an input. All values of x(n) are zero except for x(0).
We are assuming here that the values representing previous input samples, stored in the delay line, are initially all equal to zero.
The filter output is a sum of products. However, since all but one of the values stored in the delay line are equal to zero, all but one of the product terms are equal to zero.
Initially, the output is equal to the first filter coefficient h(0).
At each sampling instant, the contents of the delay line are shifted one position (to the right in this diagram).
Here, n is equal to 1 and the output of the filter is y(1) = h(1).
And so on ...
The output of the FIR filter has been formed by convolving the input sequence x with the filter coefficients h.
Convolution is sometimes (correctly) described in terms of ‘sliding’ one sequence across another, forming sums (or integrals) of products along the way.
In the context of this block diagram, the input sequence x has slid across the delay line from left to right while the filter coefficients h have remained stationary with respect to the delay line.