This presentation discusses the production of digital audio. It also gives a brief introduction to digital audio broadcasting, recording techniques, and stereophony.
2. Contents
Digital Audio Fundamentals
Sampling and Quantizing
PCM
Audio Compression
Disk-Based Recording
Rotary Head Digital Recorders
Digital Audio Broadcasting
Digital Filtering
Stereophony and Multichannel Sound
4. Digital Audio Fundamentals
Digital audio is sound reproduction using pulse-code
modulation and digital signals
Digital audio systems include analog-to-digital conversion
(ADC), digital-to-analog conversion (DAC), digital storage,
processing and transmission components
A primary benefit of digital audio is its convenience of
storage, transmission and retrieval
Digital audio is useful in the recording, manipulation,
mass-production, and distribution of sound
Modern distribution of music across the Internet via on-line
stores depends on digital recording and digital compression
algorithms
5. PCM (Pulse Code Modulation)
PCM consists of three steps to digitize an analog
signal:
1. Sampling
2. Quantization
3. Binary encoding
Before we sample, we have to filter the signal to
limit its maximum frequency, because the maximum
frequency determines the required sampling rate.
Filtering should not distort the signal, i.e. it
should not remove high-frequency components that
affect the signal shape.
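A minimal sketch of why the maximum frequency has to be limited before sampling. It assumes the usual sampling-theorem condition that the sampling rate must exceed twice the highest frequency present; the function name and the 3 Hz, 7 Hz and 10 Hz values are illustrative choices, not taken from the slides.

```python
import math

def sample_cosine(freq_hz, fs_hz, num_samples):
    """Return num_samples of cos(2*pi*freq*t) taken at rate fs_hz."""
    return [math.cos(2 * math.pi * freq_hz * n / fs_hz)
            for n in range(num_samples)]

fs = 10.0                            # illustrative sampling rate in Hz
low = sample_cosine(3.0, fs, 20)     # 3 Hz is below fs/2
high = sample_cosine(7.0, fs, 20)    # 7 Hz is above fs/2 and was not filtered out

# Largest difference between the two sample sequences
max_diff = max(abs(a - b) for a, b in zip(low, high))
print(max_diff)
```

The two sequences agree to within floating-point rounding, so after sampling the 7 Hz component cannot be told apart from a 3 Hz one. This is the reason for filtering first.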
7. Sampling
The analog signal is sampled every Ts seconds.
Ts is referred to as the sampling interval.
fs = 1/Ts is called the sampling rate or sampling
frequency.
There are 3 sampling methods:
Ideal - an impulse at each sampling instant
Natural - a pulse of short width with varying amplitude
Flattop - sample and hold, like natural but with single
amplitude value
The process is referred to as pulse amplitude
modulation (PAM), and the outcome is a signal with
analog (non-integer) values
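The relation fs = 1/Ts can be sketched as follows. This models ideal sampling only (one instantaneous value per sampling instant); the 1 kHz test tone and the 8 kHz rate are assumptions made for the example.

```python
import math

def sample(signal, ts, duration):
    """Sample a function of time every ts seconds over [0, duration)."""
    count = int(round(duration / ts))
    return [signal(n * ts) for n in range(count)]

def tone(t):
    """Illustrative 1 kHz sine wave."""
    return math.sin(2 * math.pi * 1000.0 * t)

fs = 8000.0                    # assumed sampling rate in Hz
ts = 1.0 / fs                  # sampling interval Ts in seconds
pam = sample(tone, ts, 0.001)  # one millisecond of signal
print(len(pam), pam)
```

One millisecond at 8 kHz gives eight samples, each a real-valued (not yet quantized) amplitude.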
10. Quantization
Sampling results in a series of pulses of varying amplitude
values ranging between two limits: a min and a max
The number of possible amplitude values between the two limits is infinite.
We need to map the infinite amplitude values onto a finite
set of known values
This is achieved by dividing the distance between min and
max into L zones, each of height Δ, where
Δ = (max - min)/L
The midpoint of each zone is assigned a value from 0 to L-1
(resulting in L values)
Each sample falling in a zone is then approximated to the
value of the midpoint
11. Quantization Zones
Assume we have a voltage signal with amplitudes
Vmin=-20V and Vmax=+20V.
We want to use L=8 quantization levels.
Zone width = (20 - (-20))/8 = 5 V
The 8 zones are: -20 to -15, -15 to -10, -10 to -5, -5 to 0, 0
to +5, +5 to +10, +10 to +15, +15 to +20
The midpoints are: -17.5, -12.5, -7.5, -2.5, 2.5, 7.5, 12.5,
17.5
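The zone and midpoint calculation above can be sketched as a small function. The function name is hypothetical, and clamping a sample that sits exactly on Vmax into the top zone is an assumption of this sketch.

```python
def quantize(value, vmin, vmax, levels):
    """Return (zone index, zone midpoint) for one sample."""
    delta = (vmax - vmin) / levels           # zone height
    index = int((value - vmin) // delta)     # which zone the sample falls in
    index = min(max(index, 0), levels - 1)   # clamp vmax and out-of-range values
    midpoint = vmin + (index + 0.5) * delta
    return index, midpoint

# Midpoints of the eight zones from the example (Vmin = -20, Vmax = +20, L = 8)
midpoints = [quantize(-20 + 5 * k, -20, 20, 8)[1] for k in range(8)]
print(midpoints)
print(quantize(6.3, -20, 20, 8))
```

A sample of 6.3 V falls in the zone from +5 to +10 and is approximated by that zone's midpoint, 7.5 V.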
12. Assigning Codes to Zones
Each zone is then assigned a binary code.
The number of bits required to encode the zones,
or the number of bits per sample as it is commonly
referred to, is obtained as follows:
nb = log2 L
Given our example, nb = 3
The 8 zone (or level) codes are therefore: 000, 001,
010, 011, 100, 101, 110, and 111
Assigning codes to zones:
000 will refer to zone -20 to -15
001 to zone -15 to -10, etc.
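A short sketch of the code assignment. It assumes zones are numbered from the lowest voltage upward, as in the example, and rounds the bit count up when L is not a power of two (an assumption; the slides only cover nb = log2 L).

```python
def bits_per_sample(levels):
    """Smallest number of bits that can label `levels` zones."""
    return max(1, (levels - 1).bit_length())

def zone_code(index, levels):
    """Binary code for a zone index, zero-padded to the bit count."""
    return format(index, "0{}b".format(bits_per_sample(levels)))

codes = [zone_code(i, 8) for i in range(8)]
print(bits_per_sample(8), codes)
```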
14. PCM Decoder
To recover an analog signal from a digitized signal,
we take the following steps:
We use a hold circuit that holds the amplitude value of a
pulse till the next pulse arrives.
We pass this signal through a low pass filter with a cutoff
frequency that is equal to the highest frequency in the
pre-sampled signal.
The higher the value of L, the less distorted the
recovered signal is.
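The effect of L on distortion can be illustrated numerically. This sketch covers only the quantization step: it compares each sample with the midpoint value a decoder would hold, and does not model the hold circuit or the low-pass filter. The test signal (a full-scale sine between -20 V and +20 V) is an assumption.

```python
import math

def quantization_error(levels, vmin=-20.0, vmax=20.0, num_samples=1000):
    """Largest |sample - decoded midpoint| over one cycle of a test sine wave."""
    delta = (vmax - vmin) / levels
    worst = 0.0
    for n in range(num_samples):
        x = vmax * math.sin(2 * math.pi * n / num_samples)
        index = min(int((x - vmin) // delta), levels - 1)
        decoded = vmin + (index + 0.5) * delta   # value the decoder would hold
        worst = max(worst, abs(x - decoded))
    return worst

errors = {levels: quantization_error(levels) for levels in (8, 64, 256)}
print(errors)
```

The worst-case error is bounded by half a zone height, so it shrinks as L grows.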
16. Audio Compression
In its native form, high-quality digital audio requires a
high data rate, which may be excessive for certain
applications
One approach to the problem is to use compression,
which reduces that rate significantly with a moderate
loss of subjective quality
While compression may achieve considerable
reduction in bit rate, it must be appreciated that
compression systems reintroduce the generation loss
of the analog domain to digital systems
17. Audio Compression
One of the most popular compression standards for
audio and video is known as MPEG (Moving Picture
Experts Group)
In practice, audio and video streams of this type can be
combined using multiplexing
The program stream is optimized for recording and is
based on blocks of arbitrary size
The transport stream is optimized for transmission
and is based on blocks of constant size
19. Audio Compression
Compression and the corresponding decoding are complex
processes and take time, adding to existing delays in signal
paths
Concealment of uncorrectable errors is also more difficult
on compressed data
The acceptable trade-off between loss of audio quality and
transmission or storage size depends upon the application
For example, one 640 MB compact disc (CD) holds
approximately one hour of uncompressed high-fidelity
music, less than 2 hours of music compressed losslessly, or
7 hours of music compressed in the MP3 format at a
medium bit rate
A digital sound recorder can typically store around 200
hours of clearly intelligible speech in 640 MB
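These figures can be checked with simple arithmetic. The sketch assumes that 640 MB means 640 million bytes, that uncompressed audio uses the CD parameters (44.1 kHz, 16 bits, 2 channels), and that 192 kbit/s and 7 kbit/s are representative rates for "medium" MP3 and low-rate speech coding; the last two rates are assumptions chosen for illustration.

```python
CD_BYTES = 640 * 10**6            # "640 MB" read as 640 million bytes (assumption)

def hours_of_audio(capacity_bytes, bit_rate_bps):
    """Playing time in hours for a given capacity and constant bit rate."""
    return capacity_bytes * 8 / bit_rate_bps / 3600

pcm_rate = 44_100 * 16 * 2        # CD audio: samples/s * bits * channels
uncompressed_hours = hours_of_audio(CD_BYTES, pcm_rate)
mp3_hours = hours_of_audio(CD_BYTES, 192_000)    # assumed medium MP3 rate
speech_hours = hours_of_audio(CD_BYTES, 7_000)   # assumed speech codec rate

print(pcm_rate, uncompressed_hours, mp3_hours, speech_hours)
```

Under these assumptions the results land close to the one hour, 7 hours and 200 hours quoted above.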
20. Disk-Based Recording
The magnetic disk drive was perfected by the computer
industry to allow rapid random access to data, and so it
makes an ideal medium for editing
Development of the optical disk was stimulated by the
availability of low-cost lasers
Optical disks are available in many different types,
some which can only be recorded once, whereas others
are erasable
Optical disks have in common the fact that access is
generally slower than with magnetic drives and that it
is difficult to obtain high data rates, but most of them
are removable and can act as interchange media
21. Rotary Head Digital Recorders
In a fixed tape head system, audio tape is drawn past
the head at a constant speed
The head creates a fluctuating magnetic field in
response to the signal to be recorded, and the
magnetic particles on the tape are forced to line up
with the field at the head
As the tape moves away, the magnetic particles carry
an imprint of the signal in their magnetic orientation
If the tape moves too slowly, a high frequency signal
will not be imprinted: the particles' polarity will
simply oscillate in the vicinity of the head, to be left in
a random position
22. Rotary Head Digital Recorders
Thus the bandwidth (and hence the channel capacity) of the
recorded signal can be seen to be related to tape speed: the
faster the speed, the higher the frequency that can be recorded
Digital video and digital audio need considerably more
bandwidth than analog audio, so much so that tape would
have to be drawn past the heads at very high speed in order
to capture this signal
This is impractical, since tapes of immense length would
be required
The generally adopted solution is to rotate the head against
the tape at high speed, so that the relative velocity is high,
but the tape itself moves at a slow speed.
23. Rotary Head Digital Recorders
To accomplish this, the head must be tilted so that at
each rotation of the head, a new area of tape is
brought into play; each segment of the signal is
recorded as a diagonal stripe across the tape
This is known as a helical scan because the tape wraps
around the circular drum at an angle, travelling up like
a helix
The rotary head recorder has the advantage that the
spinning heads create a high head-to-tape speed,
offering a high bit rate recording without high linear
tape speed
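A rough sketch of the speed advantage. It approximates the head-to-tape speed as the drum circumference times the rotation rate, ignoring the small contribution of the tape's own motion and the helix angle. The drum diameter, rotation rate and linear tape speed are illustrative values, not figures from the slides.

```python
import math

def head_to_tape_speed(drum_diameter_m, drum_rpm):
    """Approximate writing speed in m/s: circumference times revolutions per second."""
    return math.pi * drum_diameter_m * drum_rpm / 60.0

writing_speed = head_to_tape_speed(0.030, 2000)  # assumed 30 mm drum at 2000 rpm
linear_tape_speed = 0.00815                      # assumed 8.15 mm/s tape transport
ratio = writing_speed / linear_tape_speed

print(writing_speed, ratio)
```

With these numbers the head passes over the tape several hundred times faster than the tape itself moves.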
24. Digital Audio Broadcasting
Digital Audio Broadcasting (DAB) is a digital radio
technology for broadcasting radio stations
Advantages of DAB
Broadcasting programs with good sound quality
comparable to multi-media products such as MP3
Offering stable reception with reduced noise
Bringing diversified program choices to audiences
Enabling transmission of text / images
27. Digital Filtering
In electronics, computer science and mathematics,
a digital filter is a system that performs
mathematical operations on a sampled, discrete-
time signal to reduce or enhance certain aspects of
that signal
A digital filter system usually consists of an
analog-to-digital converter to sample the input
signal, followed by a microprocessor and some
peripheral components such as memory to store
data and filter coefficients
28. Digital Filtering
Digital filters are commonplace and an essential
element of everyday electronics such as radios,
cellphones, and stereo receivers
Digital filters are defined by their impulse response,
h[n], or the filter output given a unit sample impulse
input signal
A discrete-time unit impulse signal is defined by
δ[n] = 1 for n = 0, and δ[n] = 0 for n ≠ 0
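A minimal sketch of measuring an impulse response: feed a unit impulse (1 at n = 0, 0 elsewhere) into a filter and record the output. The three-point moving average used here is a hypothetical example filter, not one from the slides.

```python
def unit_impulse(length):
    """Sequence that is 1 at n = 0 and 0 for all other n."""
    return [1.0 if n == 0 else 0.0 for n in range(length)]

def moving_average3(x):
    """Hypothetical example filter: mean of the current and two previous samples."""
    padded = [0.0, 0.0] + list(x)   # samples before n = 0 are taken as 0
    return [(padded[n] + padded[n + 1] + padded[n + 2]) / 3.0
            for n in range(len(x))]

h = moving_average3(unit_impulse(6))
print(h)
```

The output is one third for the first three samples and zero afterwards, which is the impulse response h[n] of this particular filter.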
29. Digital Filtering
Digital filters are often best described in terms of their
frequency response, that is, how a sinusoidal signal of a
given frequency is affected by the filter
The frequency response of a digital filter can be found by
taking the DFT (or FFT) of the filter impulse response
The frequency response of a filter consists of its
magnitude and phase responses
The magnitude response indicates the ratio of a filtered
sine wave's output amplitude to its input amplitude
The phase response describes the phase “offset” or time
delay experienced by a sine wave passing through a filter
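A sketch of obtaining magnitude and phase from an impulse response. Rather than calling an FFT routine, it evaluates the defining sum directly at a few frequencies between 0 and π radians per sample, which are the same values a zero-padded DFT would sample. The two-tap impulse response [1, 1] is an assumed example.

```python
import cmath
import math

def frequency_response(h, num_steps=8):
    """Return (omega, magnitude, phase) at num_steps + 1 frequencies in [0, pi]."""
    result = []
    for k in range(num_steps + 1):
        omega = math.pi * k / num_steps
        H = sum(h[n] * cmath.exp(-1j * omega * n) for n in range(len(h)))
        result.append((omega, abs(H), cmath.phase(H)))
    return result

response = frequency_response([1.0, 1.0])
for omega, magnitude, phase in response:
    print(round(omega, 3), round(magnitude, 3), round(phase, 3))
```

For this example the magnitude falls from 2 at zero frequency to essentially 0 at π, so the filter passes low frequencies and suppresses the highest ones.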
30. Digital Filtering
The filter implementation simply performs a
convolution of the time domain impulse response and
the sampled signal
For sampled signals, convolution is defined as the sum of the
products of the two sequences after one is reversed and shifted
or delayed: y[n] = sum over k of h[k] x[n - k]
What happens when we add a signal to a one-sample
delayed version of itself?
y[n] = x[n] + x[n - 1]
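One way to explore the question is to convolve the impulse response of this filter, h = [1, 1], with two test inputs. The test inputs (a constant and a sequence that changes sign every sample) are assumptions chosen to represent the lowest and highest frequencies.

```python
def convolve(h, x):
    """Discrete convolution: y[n] = sum over k of h[k] * x[n - k]."""
    y = [0.0] * (len(h) + len(x) - 1)
    for k, hk in enumerate(h):
        for m, xm in enumerate(x):
            y[k + m] += hk * xm
    return y

h = [1.0, 1.0]                                # impulse response of y[n] = x[n] + x[n - 1]
constant = convolve(h, [1.0] * 6)             # steady input
alternating = convolve(h, [1.0, -1.0] * 3)    # sign flips every sample

print(constant)
print(alternating)
```

Apart from the first and last output samples, the constant input is doubled and the alternating input cancels to zero, so the filter behaves as a simple low-pass filter.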
32. Digital Filtering
Finite Impulse Response (FIR) Filters
Finite Impulse Response (FIR) filters are defined by
scaled and time-delayed versions of the filter input
signal only, as given by the following difference
equation:
y[n] = b0 x[n] + b1 x[n - 1] + ... + bM x[n - M]
The impulse response of an FIR filter is only as long as
the maximum delayed input term in its difference
equation
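A minimal FIR sketch, assuming the usual form in which the output is a weighted sum of the current and delayed input samples. The coefficient values are hypothetical.

```python
def fir_filter(b, x):
    """y[n] = b[0]*x[n] + b[1]*x[n-1] + ...; samples before n = 0 are taken as 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * x[n - k]
        y.append(acc)
    return y

b = [0.5, 0.3, 0.2]                  # hypothetical coefficients
impulse = [1.0] + [0.0] * 7
h = fir_filter(b, impulse)
print(h)
```

The impulse response reproduces the coefficients and is zero after the last delayed term, which is why it is finite.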
34. Digital Filtering
What happens if we use a previous filter output value
to produce the filter's current output?
y[n] = x[n] + y[n - 1]
Consider the following input signals
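As one illustration, the sketch below applies y[n] = x[n] + y[n - 1] to a unit impulse and to a constant input. Both inputs are chosen here for the example, and y[-1] is assumed to be 0.

```python
def accumulator(x):
    """y[n] = x[n] + y[n-1], with y[-1] taken as 0."""
    y = []
    previous = 0.0
    for sample in x:
        previous = sample + previous
        y.append(previous)
    return y

impulse_out = accumulator([1.0] + [0.0] * 5)
step_out = accumulator([1.0] * 6)

print(impulse_out)
print(step_out)
```

A single impulse produces an output that never decays, and a constant input produces an output that grows without limit, because every past output keeps being fed back.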
36. Digital Filtering
Infinite Impulse Response (IIR) filters include delayed
and scaled versions of the output signal which are fed
back into the current output
IIR filters are described by the following difference
equation (written here with a0 = 1):
y[n] = b0 x[n] + ... + bM x[n - M] - a1 y[n - 1] - ... - aN y[n - N]
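A minimal IIR sketch. It assumes the common convention in which feedback coefficients are subtracted and a[0] equals 1; under that convention y[n] = x[n] + y[n - 1] corresponds to a[1] = -1. The coefficients used below are hypothetical.

```python
def iir_filter(b, a, x):
    """y[n] = sum_k b[k]*x[n-k] - sum_{k>=1} a[k]*y[n-k], assuming a[0] == 1."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, bk in enumerate(b):       # feedforward (input) terms
            if n - k >= 0:
                acc += bk * x[n - k]
        for k in range(1, len(a)):       # feedback (output) terms
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

# Hypothetical example: y[n] = x[n] + 0.5*y[n-1]
h = iir_filter([1.0], [1.0, -0.5], [1.0] + [0.0] * 5)
print(h)
```

The impulse response halves at every sample but never reaches exactly zero, which is what makes it infinite in length.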
38. Stereophony and Multichannel Sound
Stereophony is a method of sound recording in which the
recording contains information about the spatial
arrangement of the sound sources
When a stereophonic recording is reproduced, the
listener hears a more natural sound that seems to
come from many separate sources and to be arranged
in the same way as during the recording
The listener has the impression that the sound is
“three-dimensional” and possessed of an added
“depth.”
39. Stereophony and Multichannel Sound
This effect is achieved through the separate recording
of electrical signals from different microphones on
individual channels and through the separate
reproduction of the sound on each channel by
loudspeakers
40. Stereophony and Multichannel Sound
The arrangement of the loudspeakers must be similar
to that of the microphones; that is, the right and left
channels must coincide
The quality of stereophonic sound reproduction
improves with the number of channels used
However, the number of channels is usually kept
within certain limits to avoid undue complexity and
excessive cost
41. Stereophony and Multichannel Sound
5.1 channel sound is an industry standard sound
format for movies and music with five main channels
of sound and a sixth subwoofer channel used for
special movie effects and bass for music
A 5.1 channel system consists of a stereo pair of
speakers, a center channel speaker placed between the
stereo speakers and two surround sound speakers
located behind the listener. 5.1 channel sound is found
on DVD movie and music discs and some CDs
42. Stereophony and Multichannel Sound
6.1 channel sound is a sound enhancement to 5.1
channel sound with an additional center surround
sound speaker located between the two surround
sound speakers directly behind the listener. 6.1
channel sound produces a more enveloping surround
sound experience.
7.1 channel sound is a further sound enhancement to
5.1 channel sound with two additional side-surround
speakers located to the sides of the listener’s seating
position. 7.1 channel sound is used for greater sound
envelopment and more accurate positioning of sounds