3. 3D Audio VS. Surround Sound
Tricking your auditory system about where the sound is
coming from, using the
Head-Related Transfer Function (HRTF)
4. Our Sound Localization
• We localize azimuth angles well.
• Elevation angles, not so much.
• “Unlike our eyes that
directly perceive 3D, our
ears have to get that
“computed” in the brain”.
5. Localization Cues
• ITD (interaural time difference)
- Doesn’t work at high frequencies
• ILD (interaural level difference)
- Poor at low frequencies
- Works better when there is a head-shadow effect
• Spectral cues (HRTFs)
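As a rough illustration of the first two cues, the sketch below estimates an ITD and an ILD from a pair of ear signals. It is only a minimal example under stated assumptions: the function name `estimate_itd_ild`, the cross-correlation estimator, the broadband RMS level ratio, and the toy signals (a 20-sample delay and a 0.5 gain at the right ear) are all hypothetical and not taken from the slides.

```python
import numpy as np

def estimate_itd_ild(left, right, fs):
    """Return (itd_seconds, ild_db) for two equal-length ear signals.

    ITD > 0 means the sound reached the left ear first.
    ILD > 0 means the left ear signal is louder.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    # Full cross-correlation; output index 0 corresponds to lag -(len(left) - 1).
    xcorr = np.correlate(right, left, mode="full")
    lag = int(np.argmax(xcorr)) - (len(left) - 1)
    itd = lag / fs
    # Broadband level difference from the RMS of each ear signal.
    rms_left = np.sqrt(np.mean(left ** 2))
    rms_right = np.sqrt(np.mean(right ** 2))
    ild = 20.0 * np.log10(rms_left / rms_right)
    return itd, ild

fs = 48_000
rng = np.random.default_rng(0)
source = rng.standard_normal(2048)
delay = 20  # samples; hypothetical extra travel time to the right ear
left = source
right = 0.5 * np.concatenate([np.zeros(delay), source[:-delay]])  # later and quieter
itd, ild = estimate_itd_ild(left, right, fs)
```

A broadband estimate like this hides the frequency dependence listed above; a fuller analysis would band-pass the signals first and evaluate each cue per band.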
7. HRTF
The transfer function of a listener’s sound localization system for sound
arriving from a point in space, one per ear: HRTF_L (left) and HRTF_R (right).
It includes the effects of the shape of the pinna, the shoulders,
your hair, …
Conv(Input (mono), Impulse Response (L, R)) = Output (L, R)
Input: desired sound
Impulse Response: HRTF_alpha (L, R) for the desired direction alpha
Output: sound perceived as coming from angle alpha
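The convolution step above can be sketched in a few lines. This is a minimal example, not the deck's implementation: `render_binaural` is a hypothetical helper name, and the two impulse responses are toy stand-ins (a pure delay and a delayed, attenuated impulse) rather than measured HRTF data.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Convolve a mono signal with a left/right impulse-response pair.

    Returns an array of shape (2, len(mono) + len(hrir) - 1): row 0 is the
    left-ear output, row 1 the right-ear output. Both impulse responses are
    assumed to have the same length.
    """
    out_left = np.convolve(mono, hrir_left)
    out_right = np.convolve(mono, hrir_right)
    return np.stack([out_left, out_right], axis=0)

# Toy stand-ins for HRTF_alpha (L, R): the right ear hears the sound
# later and quieter than the left ear.
hrir_left = np.zeros(64)
hrir_left[2] = 1.0
hrir_right = np.zeros(64)
hrir_right[30] = 0.6

mono = np.random.default_rng(1).standard_normal(1000)
out = render_binaural(mono, hrir_left, hrir_right)
```

With real data, `hrir_left` and `hrir_right` would be the measured impulse responses for the desired angle alpha, and switching direction means switching the pair.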
[Figure: HRTF measurement and 3D audio reconstruction. An MLS + chirp signal is played from angle alpha, measured from 0 degrees; the time-domain signals recorded at the two ears (Out_alpha_Left, Out_alpha_Right) give the HRTF for angle alpha.]
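One common way to turn such a recording into an impulse response is to divide the spectrum of the recording by the spectrum of the known excitation. The sketch below assumes that approach; the slides do not say how the recordings were processed. The helper name `measure_impulse_response`, the small regularization constant `eps`, the random ±1 sequence standing in for an MLS, and the toy "true" response are all hypothetical.

```python
import numpy as np

def measure_impulse_response(excitation, recording, ir_length, eps=1e-12):
    """Estimate an impulse response by regularized spectral division."""
    n_fft = len(recording)
    X = np.fft.rfft(excitation, n_fft)
    Y = np.fft.rfft(recording, n_fft)
    # Y / X, written so that bins where |X| is tiny do not blow up.
    H = Y * np.conj(X) / (np.abs(X) ** 2 + eps)
    return np.fft.irfft(H, n_fft)[:ir_length]

rng = np.random.default_rng(2)
excitation = rng.choice([-1.0, 1.0], size=4096)  # MLS-like stand-in signal

# Toy "ear" response: a direct path plus one weaker, inverted reflection.
true_ir = np.zeros(128)
true_ir[5] = 1.0
true_ir[40] = -0.3

recording = np.convolve(excitation, true_ir)  # simulated microphone signal
estimated_ir = measure_impulse_response(excitation, recording, len(true_ir))
```

In a real measurement the recording also carries noise, so `eps` would typically be chosen larger than in this noise-free toy case.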
8. HRTF
• These measured HRTFs also include the responses of the
loudspeakers, microphones, and ADC, as well as the room impulse
response.
10. Headphones
3D audio playback through headphones:
- Sound is perceived inside the head.
- Front-back confusion.
- No crosstalk problem.
11. But Why Inside the Head?
• Some say it’s due to:
- Lack of bone conduction.
Bone conduction is the conduction of sound to the inner ear
through the bones of the skull.
13. Others say,
Maybe…
- It’s because the transducers are “too” close to the ears (“Due to reflections and modal oscillations within the enclosed volume between the headphones and your ear, the sound at any particular point within that space may be different than the sound at another point”?).
- The headphone impulse response (this only helped with focusing where the sound is coming from).
14. 3D Audio Playback through Loudspeakers
- Transducers are already externalized.
But loudspeakers have a different problem: crosstalk.
[Figure: Out_alpha_Left and Out_alpha_Right pass through a Crosstalk Cancellation (XTC) filter before loudspeaker playback.]
15. Perfect XTC
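Signals received at the ears = Transfer Matrix (due to the playback setup) × Reconstructed Signals:

$$
\begin{bmatrix} P_L \\ P_R \end{bmatrix}
=
\begin{bmatrix} H_{LL} & H_{RL} \\ H_{LR} & H_{RR} \end{bmatrix}
\begin{bmatrix} \mathrm{Out}_{\alpha,\mathrm{Left}} \\ \mathrm{Out}_{\alpha,\mathrm{Right}} \end{bmatrix}
$$

Here P_L and P_R stand for the signals received at the left and right ears.
Ipsilateral impulse responses: HLL, HRR.
Contralateral impulse responses: HLR, HRL.
Perfect XTC: the XTC filter is the inverse of the transfer matrix, so that Transfer Matrix × XTC Filter = I.
HLL: Left speaker’s impulse response at the left ear.
HLR: Left speaker’s impulse response at the right ear.
HRR: Right speaker’s impulse response at the right ear.
HRL: Right speaker’s impulse response at the left ear.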
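To make the idea of inverting the playback setup concrete, here is a minimal sketch that builds one 2x2 transfer matrix per frequency bin from the four impulse responses and inverts it. The helper name `perfect_xtc_filters` and the toy impulse responses (a unit impulse for the ipsilateral paths, a delayed and attenuated impulse for the contralateral paths) are assumptions made for illustration only.

```python
import numpy as np

def perfect_xtc_filters(h_ll, h_lr, h_rr, h_rl, n_fft):
    """Return (C, X): transfer matrices and their inverses, one per FFT bin.

    C[f] = [[HLL, HRL],
            [HLR, HRR]]   rows: ears (L, R), columns: speakers (L, R)
    X[f] = inverse of C[f], i.e. the perfect XTC filter at that bin.
    """
    H_ll, H_lr, H_rr, H_rl = (np.fft.rfft(h, n_fft) for h in (h_ll, h_lr, h_rr, h_rl))
    left_ear_row = np.stack([H_ll, H_rl], axis=-1)
    right_ear_row = np.stack([H_lr, H_rr], axis=-1)
    C = np.stack([left_ear_row, right_ear_row], axis=-2)  # shape (bins, 2, 2)
    return C, np.linalg.inv(C)

# Toy impulse responses: the contralateral path is later and weaker.
ipsi = np.zeros(32)
ipsi[0] = 1.0
contra = np.zeros(32)
contra[3] = 0.7

C, X = perfect_xtc_filters(ipsi, contra, ipsi, contra, n_fft=256)
product = C @ X  # should be the identity matrix at every bin
```

The inverse exists here because the toy matrix is never singular; with measured responses some bins can be close to singular, which is where the very large boosts come from.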
16. XTC
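• Free-Field Two-Point-Source Model (acoustics based).
• HRTF-Based XTC.
Acoustics based:
+ Does not consider the head-shadow effect, individual sound localization (spectral cues), or the room impulse response.
+ Not individual to one person.
+ Not individual to one room.
HRTF based:
+ Considers the head-shadow effect, individual sound localization (spectral cues), and the room impulse response.
+ Individual to one person.
+ Individual to one room.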
18. Spectral Coloration
Listening room setup:
+ Speaker-listener distance: 1.6 m
+ Speaker span: 18 degrees
+ Distance between ears: 15 cm
+ Approximate time delay: 65 µs
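The time delay follows from the geometry in the list above. The sketch below assumes the free-field two-point-source picture: the 1.6 m is taken as the distance from each speaker to the midpoint between the ears, the head itself is ignored, and the speed of sound is set to 343 m/s. The function name `crosstalk_geometry` is hypothetical. Under those assumptions the delay comes out at roughly 68 µs, in the same range as the approximate figure quoted above.

```python
import math

def crosstalk_geometry(distance_m, span_deg, ear_spacing_m, c=343.0):
    """Return (g, tau) for a symmetric two-speaker setup in the free field.

    g   : ratio of the short (same-side) path to the long (opposite-side) path
    tau : extra travel time of the opposite-side path, in seconds
    """
    theta = math.radians(span_deg / 2.0)  # half the speaker span
    base = distance_m ** 2 + (ear_spacing_m / 2.0) ** 2
    cross = ear_spacing_m * distance_m * math.sin(theta)
    l1 = math.sqrt(base - cross)  # speaker to the ear on its own side
    l2 = math.sqrt(base + cross)  # speaker to the ear on the other side
    return l1 / l2, (l2 - l1) / c

g, tau = crosstalk_geometry(distance_m=1.6, span_deg=18.0, ear_spacing_m=0.15)
```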
Perfect XTC response at the loudspeakers:
+ 35 dB boost around 7 kHz
+ Attenuation around 4 kHz
Perfect XTC:
- A high level of XTC cannot be achieved in practice.
- Introduces severe spectral coloration to the sound received at the ear.
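The boost and the dip can be reproduced with the free-field two-point-source model, where the transfer matrix at angular frequency w is [[1, g·e^(-jwτ)], [g·e^(-jwτ), 1]] and the largest gain of its inverse is 1 / min|1 ± g·e^(-jwτ)|. The sketch below evaluates that envelope. It assumes the 65 µs delay from the setup list and a path-length ratio g = 0.985; the value of g is an assumption (it is not stated on the slide), and `perfect_xtc_envelope_db` is a hypothetical helper name.

```python
import numpy as np

def perfect_xtc_envelope_db(freq_hz, g, tau):
    """Largest gain (in dB) of the perfect XTC filter in the free-field model."""
    wt = 2.0 * np.pi * np.asarray(freq_hz, dtype=float) * tau
    in_phase = 1.0 / np.sqrt(g ** 2 + 2.0 * g * np.cos(wt) + 1.0)
    out_of_phase = 1.0 / np.sqrt(g ** 2 - 2.0 * g * np.cos(wt) + 1.0)
    return 20.0 * np.log10(np.maximum(in_phase, out_of_phase))

g = 0.985    # assumed path-length ratio for this geometry
tau = 65e-6  # time delay from the setup list

freqs = np.linspace(2000.0, 10000.0, 2001)
envelope = perfect_xtc_envelope_db(freqs, g, tau)
peak_freq = freqs[np.argmax(envelope)]  # where the filter boosts the most
dip_freq = freqs[np.argmin(envelope)]   # where the filter attenuates the most
peak_boost_db = envelope.max()
```

In this model the peak sits at 1 / (2·tau) and the dip at 1 / (4·tau), which is why a larger delay (a wider speaker span) pulls both down in frequency.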
19. Angle Between the Speakers
• Smaller speaker span: shifts the peaks in the PXTC response out of the
audible range (above 20 kHz).
• Even with a smaller speaker span, low-frequency roll-off is still a problem.
20. Constant Regularization
+ Shifting the whole transfer matrix upward by a constant value, Beta,
before inversion, to avoid high-level boosts in the XTC filter.
[Figure: XTC filter response for PXTC, Beta = 0.005, and Beta = 0.05, showing the low-frequency roll-off and the doublet peaks.]
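One standard way to write this kind of constant regularization is the Tikhonov-style inverse (C^H C + Beta·I)^-1 C^H, which reduces to the perfect inverse when Beta = 0. The sketch below assumes that form and the free-field two-point-source matrix, and compares the filter's largest gain at the worst-case frequency for the three Beta values named in the figure. The helper names, g = 0.985, and tau = 65 µs are assumptions for illustration.

```python
import numpy as np

def free_field_matrix(freq_hz, g, tau):
    """2x2 free-field transfer matrix at a single frequency."""
    x = g * np.exp(-2j * np.pi * freq_hz * tau)
    return np.array([[1.0, x], [x, 1.0]])

def regularized_xtc(C, beta):
    """Regularized inverse (C^H C + beta I)^-1 C^H of a 2x2 matrix."""
    C_h = np.conj(C.T)
    return np.linalg.solve(C_h @ C + beta * np.eye(2), C_h)

def max_gain(C, beta):
    """Largest singular value (spectral norm) of the regularized filter."""
    return np.linalg.norm(regularized_xtc(C, beta), 2)

g, tau = 0.985, 65e-6            # assumed model parameters
peak_freq = 1.0 / (2.0 * tau)    # frequency where perfect XTC boosts the most
C_peak = free_field_matrix(peak_freq, g, tau)

gain_perfect = max_gain(C_peak, 0.0)
gain_small_beta = max_gain(C_peak, 0.005)
gain_large_beta = max_gain(C_peak, 0.05)
```

A larger Beta tames the peak more strongly, but because the same value is applied at every frequency it also changes the response where no taming was needed.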
Optimized XTC
Taken from Edgar Y. Choueiri, “Optimal
Crosstalk Cancellation for Binaural
Audio with Two Loudspeakers”.
Frequency-Dependent Regularization
+ Shifting the transfer matrix only where necessary, depending on
the frequency, as done in BACCH XTC, to avoid low-frequency
roll-off and doublet peaks.
Reducing the spectral coloration comes at the cost of losing some XTC.
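A simplified way to picture frequency-dependent regularization is to pick, at each frequency, the smallest Beta that keeps the filter gain under a chosen cap, and to leave Beta at zero wherever the perfect filter is already under the cap. The sketch below does exactly that and is not the BACCH algorithm itself: the 7 dB cap, the free-field model with g = 0.985 and tau = 65 µs, and the helper name `beta_for_gain_cap` are all assumptions. It relies on the fact that a regularized inverse maps each singular value s of the transfer matrix to s / (s² + Beta).

```python
import numpy as np

def beta_for_gain_cap(C, max_gain):
    """Smallest beta per matrix such that every s / (s^2 + beta) <= max_gain."""
    sigma = np.linalg.svd(C, compute_uv=False)  # shape (bins, 2)
    needed = sigma / max_gain - sigma ** 2
    return np.maximum(needed.max(axis=-1), 0.0)

g, tau = 0.985, 65e-6                 # assumed free-field model parameters
cap = 10.0 ** (7.0 / 20.0)            # hypothetical 7 dB ceiling on filter gain
freqs = np.linspace(100.0, 20000.0, 400)

# One free-field transfer matrix per frequency.
x = g * np.exp(-2j * np.pi * freqs * tau)
C = np.empty((len(freqs), 2, 2), dtype=complex)
C[:, 0, 0] = 1.0
C[:, 1, 1] = 1.0
C[:, 0, 1] = x
C[:, 1, 0] = x

beta = beta_for_gain_cap(C, cap)
C_h = np.conj(np.swapaxes(C, -1, -2))
filters = np.linalg.solve(C_h @ C + beta[:, None, None] * np.eye(2), C_h)
gains = np.linalg.norm(filters, ord=2, axis=(-2, -1))  # largest gain per bin
```

Bins where `beta` stays at zero keep full cancellation; bins where it is raised trade some cancellation for a flatter spectrum.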
22. Optimal Source Distribution
“Conceptual monopole transducers whose
position varies continuously as a function of
frequency.”
“The inverse filters have flat frequency
response so there is little colouration due to
different HRTFs nor at any location in the
listening room. ”
24. OSD References
• Takeuchi, T. and Nelson, P.A. (2008) Extension of the optimal source distribution for binaural sound, Acta Acustica united with Acustica, 94(6), 981-987
• Takeuchi, T. and Nelson, P.A. (2007) Subjective and objective evaluation of the optimal source distribution for virtual acoustic imaging, Journal of the Audio Engineering Society, 55(11), 981-997
• Takeuchi, T. and Nelson, P.A. (2002) Optimal source distribution for binaural synthesis over loudspeakers, Journal of the Acoustical Society of America, 112(6), 2786-97
25. BACCH
BACCH™ 3D Sound was developed by Prof. Edgar
Choueiri and is licensed by Princeton University. BACCH
yields unprecedented spatial realism in loudspeaker-based
audio playback, allowing the listener to hear 3D sound through only
two loudspeakers.
Technical Paper
29. Limited Sweet-spots
Dynasonix 3D Audio
“The camera identifies each listener
and "tracks" them. Dynasonix uses
this feedback and moves the 3D
sound field accordingly. This
ensures that, once identified, the
listener will always receive 3D
sound. A single Dynasonix system
provides 3D sound for up to 6
listeners.”
In May 2013 our digital sound projector
technology was sold to Yamaha for an
undisclosed fee!
30. 3D Audio Reproduction
• Binaural Sound
• Microphone Arrays (beamforming)
em32 Eigenmike produced
by mh acoustics
Lockwood array
31. General Idea
[Figure: beam patterns for the X, Y, Z, and omni directions.]
Beamforming using least squares.
Forming beams towards every source to create a
realistic sound field.
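As a rough picture of what a least-squares beamformer does, the sketch below fits the weights of a small array so that its response is close to 1 around a target direction and close to 0 elsewhere. Everything specific here is an assumption: a narrowband, far-field, uniform linear array of 8 microphones at half-wavelength spacing is used purely for simplicity (it is not the Eigenmike or the Lockwood array), and the helper names are hypothetical.

```python
import numpy as np

def steering_matrix(angles_deg, n_mics, spacing_m, freq_hz, c=343.0):
    """Far-field steering vectors of a uniform linear array, one row per angle."""
    angles = np.radians(np.atleast_1d(angles_deg))
    mic_positions = np.arange(n_mics) * spacing_m
    delays = np.outer(np.sin(angles), mic_positions) / c
    return np.exp(-2j * np.pi * freq_hz * delays)

def least_squares_weights(target_deg, width_deg, n_mics, spacing_m, freq_hz):
    """Weights whose beam pattern best matches a box around the target angle."""
    grid = np.arange(-90.0, 91.0, 1.0)
    A = steering_matrix(grid, n_mics, spacing_m, freq_hz)
    desired = (np.abs(grid - target_deg) <= width_deg / 2.0).astype(complex)
    weights, *_ = np.linalg.lstsq(A, desired, rcond=None)
    return weights

freq = 2000.0
n_mics = 8
spacing = 343.0 / freq / 2.0  # half a wavelength at the design frequency

w = least_squares_weights(20.0, 20.0, n_mics, spacing, freq)
gain_target = abs(steering_matrix(20.0, n_mics, spacing, freq) @ w)[0]
gain_off_axis = abs(steering_matrix(-60.0, n_mics, spacing, freq) @ w)[0]
```

Forming one such beam per source, each with its own target direction, is one way to read the "beams towards every source" idea above.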