DESIGNING A GAME AUDIO ENGINE FOR HSA
LAURENT BETBEDER
SCEA
WHAT’S SO SPECIAL ABOUT CONSOLE GAME DEV?
NOW THAT CONSOLES MOSTLY RUN PC HARDWARE

 Extreme performance optimizations
‒ Until gamers opt for shorter upgrade cycles (phone/tablet business model)?
‒ Can’t run sub-optimal audio code when competing for cycles on crowded compute queues

 Custom hardware, OS, drivers and compilers
‒ To extract max perf from fixed hardware
‒ Helps lengthen platform lifetime
‒ “But but… where’s my OpenCL runtime?”

 Low latency
‒ Music games on consoles need it as much as professional music prod software on desktop
‒ But it is much harder to achieve reliably when the system is constantly overloaded

2 | PRESENTATION TITLE | NOVEMBER 19, 2013 | CONFIDENTIAL
GAME AUDIO DSP ON THE ACP
WHY?

 Heavy specialized DSP workloads
‒ Stuff games need badly but don’t really want to deal with
‒ Best fit for dedicated and/or fixed function hardware
‒ Codecs
  ‒ CELP codecs -> party chat
  ‒ 100s of MP3/AT9/AAC decode instances
  ‒ Huge impact on game assets footprint, down/load times
  ‒ Optional output bitstream encoding (AC3/DTS)
‒ Voice recognition
‒ Echo cancelation

 Platform wide IP licensing levels the playing field
‒ Good for indie developers
‒ And good for the platform!

 Available via asynchronous secure system APIs

GAME AUDIO DSP ON THE ACP
WHY NOT?

 Exotic hardware and dev environment
‒ Closed to games
‒ Closed to middleware
‒ Platform specific

 Asynchronous interface
‒ Can’t have sequential interleaving of DSP back and forth between CPU and ACP w/o latency buildup
‒ But ultimately, we want the DSP pipeline to be data driven (by artists who know nothing about this)
‒ Modularity

 Slow clock rate @ 800MHz, very limited SIMD and no FP support
‒ Tough sell against Jaguar for many DSP algorithms
‒ Very tight local memory shared by multiple DSP cores

 Already pretty busy with codec loads and system tasks
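
The latency buildup from bouncing a DSP chain back and forth between CPU and ACP can be estimated with simple arithmetic. A minimal sketch, assuming each asynchronous hand-off costs one full audio block of delay; the block size (256 samples) and sample rate (48 kHz) are illustrative assumptions, not figures from the talk:

```python
# Hypothetical model: every asynchronous CPU<->ACP hand-off delays the
# signal by one audio block. Block size and sample rate are assumptions.


def block_latency_ms(block_size: int, sample_rate: int) -> float:
    """Duration of one audio block in milliseconds."""
    return 1000.0 * block_size / sample_rate


def chain_latency_ms(handoffs: int, block_size: int = 256,
                     sample_rate: int = 48000) -> float:
    """Added latency for a DSP chain that crosses the CPU/ACP boundary
    `handoffs` times, at one block of delay per crossing."""
    return handoffs * block_latency_ms(block_size, sample_rate)


if __name__ == "__main__":
    for handoffs in (1, 2, 4, 8):
        print(f"{handoffs} hand-offs: {chain_latency_ms(handoffs):.2f} ms")
```

Under this model the delay grows linearly with the number of crossings, which is why a freely interleaved, artist-defined chain is hard to reconcile with an asynchronous interface.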

GAME AUDIO DSP ON THE GPU
WHY?

 Much more demand for real-time effects today and will keep growing

 CPU FLOPS likely to stagnate and could even decline in HSA as CUs take over SIMD workloads
 Flexibility: some games are CPU bound, others are GPU bound…
 hUMA is a game changer (removes NUMA’s main bottleneck: GPU write back)
 Compute queues with prioritized scheduling and even some form of preemption
 Many real-time audio DSP algorithms work well on wide SIMD units
‒ FFT convolution (spectral processing in general)
‒ Mixing, resampling, wave shaping, etc…

 Mostly coalesced mem accesses
 Low/med bandwidth (< 1GB/s)
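
The low/medium bandwidth figure can be sanity-checked with a back-of-the-envelope estimate. The sketch below assumes 32-bit float samples at 48 kHz and counts one read plus one write per sample per voice; the sample format, rate and voice counts are illustrative assumptions, not figures from the talk:

```python
BYTES_PER_SAMPLE = 4   # assumption: 32-bit float samples
SAMPLE_RATE = 48000    # assumption: 48 kHz processing rate


def stream_bandwidth_bytes(voices: int, channels: int = 1,
                           passes: int = 2) -> int:
    """Bytes per second moved for `voices` audio streams.

    passes=2 counts one read and one write of every sample.
    """
    return voices * channels * SAMPLE_RATE * BYTES_PER_SAMPLE * passes


if __name__ == "__main__":
    for voices in (100, 500, 1000):
        mb_per_s = stream_bandwidth_bytes(voices) / 1e6
        print(f"{voices} mono voices: {mb_per_s:.1f} MB/s")
```

With these assumptions, 1000 mono voices move 384 MB/s, which stays under the 1 GB/s mark quoted above.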

GAME AUDIO DSP ON THE GPU
WHY NOT?

 Some algorithms do not work (as) well on wide SIMD units
‒ IIR filters, ADPCM decodes, dynamics: data recursion causes thread interdependencies within wavefronts
‒ Typical AAA game runs 1000s of biquads at various stages in the filtergraph

 Workloads may require batch voice processing to achieve high CU efficiency
‒ Build 2D grids (channels x samples) or 3D grids (channels x subbands x samples)
‒ Swizzling is key but watch out for runtime cost as SIMD widens (static vs dynamic)

 Batch processing goes against the free-form Max/MSP model artists are pushing for
‒ Unique DSP chain for each sound “just because we can!”
‒ Data driven filtergraph and DSP pipeline

 Complex prioritized scheduling & dispatching compute queues
‒ Do not prevent intermittent CU saturation caused by large graphics workloads
‒ Risky for low latency direct path audio DSP

 Proprietary hardware, drivers and shader compilers (PSSL)
‒ Audio middleware will need some incentive to move up there
‒ Most will probably stay on the CPU
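
The recursion problem and the batching workaround can be illustrated with a biquad filter. In the sketch below (plain Python standing in for a compute kernel; the function name and data layout are hypothetical), the loop over samples must run in order because each output depends on the two previous outputs, while the loop over voices carries no dependencies and is the dimension that could map onto SIMD lanes:

```python
from typing import List, Sequence, Tuple

# Per-voice coefficients: b0, b1, b2, a1, a2 (a0 normalized to 1).
Coeffs = Tuple[float, float, float, float, float]


def biquad_batch(voices: Sequence[Sequence[float]],
                 coeffs: Sequence[Coeffs]) -> List[List[float]]:
    """Direct form I biquad over a 2D grid (voices x samples).

    y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
    All voices must have the same length, as batching requires.
    """
    num_voices = len(voices)
    num_samples = len(voices[0]) if num_voices else 0
    out = [[0.0] * num_samples for _ in range(num_voices)]
    # Per-voice filter state: x[n-1], x[n-2], y[n-1], y[n-2].
    state = [[0.0, 0.0, 0.0, 0.0] for _ in range(num_voices)]

    for n in range(num_samples):       # sequential: data recursion
        for v in range(num_voices):    # independent: one "lane" per voice
            b0, b1, b2, a1, a2 = coeffs[v]
            x1, x2, y1, y2 = state[v]
            x0 = voices[v][n]
            y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
            out[v][n] = y0
            state[v] = [x0, x1, y0, y1]
    return out


if __name__ == "__main__":
    impulse = [1.0, 0.0, 0.0, 0.0]
    one_pole = (1.0, 0.0, 0.0, -0.5, 0.0)   # y[n] = x[n] + 0.5*y[n-1]
    print(biquad_batch([impulse, impulse], [one_pole, one_pole]))
```

The design choice here is to parallelize across voices rather than across time, which only pays off when enough voices share the same filter stage, hence the tension with a unique DSP chain per sound.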
GAME AUDIO DSP ON JAGUAR
WHY?

 Well known and open x64 dev environment
‒ Middleware friendly
‒ CLANG/LLVM solid & stable

 Full FP unit with SSE4 support

 Early PA is surprisingly good for compiled intrinsics code
‒ ~10% slower than core i7 @ same clock rate
‒ GDDR5 latency is not an issue
‒ < ~50% of 1 core @ 1.6GHz running the entire KZSF filtergraph

 Only reliable solution for ultra low latency
‒ Music and rhythm games
‒ Run 100% on CPU (including decoding)

GAME AUDIO DSP ON JAGUAR
WHY NOT?

 “Weak laptop CPU” compared to top of the line on desktop
‒ No FMA4
‒ Slow clock @ 1.6GHz (compared to typical desktop)

 256bit AVX mostly useless
 Possible bottleneck down the line

GAME ENGINE CODE
THIN COMPUTE

 3D audio
‒ Sound emitters (distance, directionality and size modeling)
‒ Sound listeners (mic and ear modeling)
‒ Sound geometry (collision meshes)
‒ Deeper physical modeling of sound propagation
‒ Simple ray casting (occlusion, obstruction, indirect audio)
‒ Advanced ray casting (diffraction, real-time individual early reflection tracking)

 Physics
‒ Rigid body dynamics (collisions, friction, destruction)
‒ Fluid dynamics (turbulence)

 Animation, special FX
‒ Inline audio sequencing and modulation
‒ Foley, coarse granular synthesis
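
As an illustration of the simple ray casting case listed under 3D audio, the sketch below tests whether the straight path from an emitter to a listener is blocked. It assumes sphere occluders as a stand-in for collision meshes and an inverse-distance gain law; all names, the occlusion factor and the attenuation model are hypothetical:

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Sphere = Tuple[Vec3, float]  # (center, radius); stand-in for a mesh


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def segment_hits_sphere(p0: Vec3, p1: Vec3, center: Vec3,
                        radius: float) -> bool:
    """True if the segment p0->p1 passes within `radius` of `center`."""
    d = _sub(p1, p0)
    length_sq = _dot(d, d)
    if length_sq == 0.0:
        closest = p0
    else:
        # Closest point on the segment, parameter clamped to [0, 1].
        t = max(0.0, min(1.0, _dot(_sub(center, p0), d) / length_sq))
        closest = (p0[0] + t * d[0], p0[1] + t * d[1], p0[2] + t * d[2])
    offset = _sub(center, closest)
    return _dot(offset, offset) <= radius * radius


def direct_path_gain(emitter: Vec3, listener: Vec3,
                     occluders: Sequence[Sphere],
                     occluded_gain: float = 0.25,
                     min_distance: float = 1.0) -> float:
    """Inverse-distance gain, scaled down if any occluder blocks the path."""
    delta = _sub(listener, emitter)
    distance = max(min_distance, math.sqrt(_dot(delta, delta)))
    gain = min_distance / distance
    if any(segment_hits_sphere(emitter, listener, c, r)
           for c, r in occluders):
        gain *= occluded_gain
    return gain


if __name__ == "__main__":
    wall = ((5.0, 0.0, 0.0), 1.0)
    print(direct_path_gain((0.0, 0.0, 0.0), (10.0, 0.0, 0.0), [wall]))
```

Each emitter/listener pair is an independent query, which is what makes this kind of work a candidate for batching on wide SIMD hardware.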

CONCLUSIONS
 HSA + hUMA is a great combo for high perf game audio!
‒ Maximized perf per W from specialized hardware (CPU + GPU + ACP)
‒ Our challenge is to figure out what to run where and when

 ACP is a great fit for codecs and OS services
‒ But not for modular synthesis and highly customized DSP pipelines

 GPU is a great fit for mid/high latency DSP and high level 3D thin compute
‒ Indirect (reflected) audio
‒ Convolution reverb
‒ 3D ray casting for occlusion/obstruction/diffraction

 CPU is still the best fit for everything else:
‒ Open modular synthesis frameworks and middleware
‒ Low latency audio
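
The "what to run where" decision summarized in these conclusions can be written down as a simple placement rule. This is only a sketch of the slide's guidance: the `Workload` fields and the `place` function are hypothetical names, and a real engine would also have to weigh the current load on each unit:

```python
from dataclasses import dataclass
from enum import Enum


class Unit(Enum):
    ACP = "ACP"
    GPU = "GPU"
    CPU = "CPU"


@dataclass(frozen=True)
class Workload:
    name: str
    low_latency: bool = False             # e.g. music and rhythm games
    is_codec_or_os_service: bool = False  # e.g. MP3/AT9/AAC decode
    simd_friendly: bool = False           # e.g. FFT convolution, ray casting


def place(w: Workload) -> Unit:
    """Pick a unit following the conclusions above (assumed priority order)."""
    if w.low_latency:
        # Low latency paths stay entirely on the CPU, decoding included.
        return Unit.CPU
    if w.is_codec_or_os_service:
        return Unit.ACP
    if w.simd_friendly:
        return Unit.GPU
    # Everything else: modular synthesis frameworks, middleware.
    return Unit.CPU


if __name__ == "__main__":
    for w in (
        Workload("convolution reverb", simd_friendly=True),
        Workload("MP3 decode", is_codec_or_os_service=True),
        Workload("rhythm game decode", low_latency=True,
                 is_codec_or_os_service=True),
        Workload("modular synth"),
    ):
        print(f"{w.name}: {place(w).value}")
```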

AUDIO SYNTHESIZER SCHEDULING IN HSA

DISCLAIMER & ATTRIBUTION

The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap
changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software
changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD
reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of
such revisions or changes.
AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY
INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE
LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION
CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

ATTRIBUTION
© 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices,
Inc. in the United States and/or other jurisdictions. SPEC is a registered trademark of the Standard Performance Evaluation Corporation (SPEC). Other names
are for informational purposes only and may be trademarks of their respective owners.

MM-4085, Designing a game audio engine for HSA, by Laurent Betbeder

  • 1. DESIGNING A GAME AUDIO ENGINE FOR HSA LAURENT BETBEDER SCEA
  • 2. WHAT’S SO SPECIAL ABOUT CONSOLE GAME DEV? NOW THAT CONSOLES MOSTLY RUN PC HARDWARE
      ▪ Extreme performance optimizations
          ‒ Until gamers opt for shorter upgrade cycles (phones/tablets business model)?
          ‒ Can’t run sub-optimal audio code when competing for cycles on crowded compute queues
      ▪ Custom hardware, OS, drivers and compilers
          ‒ To extract max perf from fixed hardware
          ‒ Helps lengthen platform lifetime
          ‒ “But but… where’s my OpenCL runtime?”
      ▪ Low latency
          ‒ Music games on consoles need it as much as professional music production software on desktop
          ‒ But it is much harder to achieve reliably when a system is constantly overloaded
  • 3. GAME AUDIO DSP ON THE ACP: WHY?
      ▪ Heavy specialized DSP workloads
          ‒ Stuff games need badly but don’t really want to deal with
          ‒ Best fit for dedicated and/or fixed function hardware
          ‒ Codecs
              ‒ CELP codecs -> party chat
              ‒ 100s of MP3/AT9/AAC decode instances
              ‒ Huge impact on game assets footprint, down/load times
              ‒ Optional output bitstream encoding (AC3/DTS)
          ‒ Voice recognition
          ‒ Echo cancellation
      ▪ Platform-wide IP licensing levels the playing field
          ‒ Good for indie developers
          ‒ And good for the platform!
      ▪ Available via asynchronous secure system APIs
  • 4. GAME AUDIO DSP ON THE ACP: WHY NOT?
      ▪ Exotic hardware and dev environment
          ‒ Closed to games
          ‒ Closed to middleware
          ‒ Platform specific
      ▪ Asynchronous interface
          ‒ Can’t have sequential interleaving of DSP back and forth between CPU and ACP w/o latency buildup
          ‒ But ultimately, we want the DSP pipeline to be data driven (by artists who know nothing about this)
          ‒ Modularity
      ▪ Slow clock rate @ 800MHz, very limited SIMD and no FP support
          ‒ Tough sell against Jaguar for many DSP algorithms
          ‒ Very tight local memory shared by multiple DSP cores
      ▪ Already pretty busy with codec loads and system tasks
  • 5. GAME AUDIO DSP ON THE GPU: WHY?
      ▪ Much more demand for real-time effects today, and it will keep growing
      ▪ CPU FLOPS likely to stagnate and could even decline in HSA as CUs take over SIMD workloads
      ▪ Flexibility: some games are CPU bound, others are GPU bound…
      ▪ hUMA is a game changer (removes NUMA’s main bottleneck: GPU write back)
      ▪ Compute queues with prioritized scheduling and even some form of preemption
      ▪ Many real-time audio DSP algorithms work well on wide SIMD units
          ‒ FFT convolution (spectral processing in general)
          ‒ Mixing, resampling, wave shaping, etc.
      ▪ Mostly coalesced memory accesses
      ▪ Low/medium bandwidth (< 1GB/s)
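As a rough illustration of why mixing maps well onto wide SIMD units, the sketch below sums several voices into one output buffer. Each output sample depends only on the same sample index of each input, so every index could be handled by an independent lane or work-item. The function name `mixVoices`, the planar buffer layout, and the figures fed to `streamBandwidthBytesPerSec` (voice count, 48 kHz, 32-bit float samples) are illustrative assumptions, not details from the talk.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mix several voices (one planar float buffer per voice) into one output buffer.
// Hypothetical helper: the buffer layout and per-voice gains are assumptions.
std::vector<float> mixVoices(const std::vector<std::vector<float>>& voices,
                             const std::vector<float>& gains,
                             std::size_t numSamples) {
    assert(voices.size() == gains.size());
    std::vector<float> out(numSamples, 0.0f);
    for (std::size_t v = 0; v < voices.size(); ++v) {
        assert(voices[v].size() >= numSamples);
        // No iteration of this inner loop depends on another one, so the
        // sample index can be spread across SIMD lanes or GPU work-items.
        for (std::size_t n = 0; n < numSamples; ++n) {
            out[n] += gains[v] * voices[v][n];
        }
    }
    return out;
}

// Bytes per second needed to read numVoices mono float streams once each.
constexpr double streamBandwidthBytesPerSec(double numVoices, double sampleRateHz) {
    return numVoices * sampleRateHz * sizeof(float);
}
```

Under these assumptions, 1000 voices at 48 kHz with 4-byte floats read about 192 MB/s (1000 × 48000 × 4 bytes), which sits comfortably under the 1GB/s figure on the slide.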
  • 6. GAME AUDIO DSP ON THE GPU: WHY NOT?
      ▪ Some algorithms do not work (as) well on wide SIMD units
          ‒ IIR filters, ADPCM decodes, dynamics: data recursion causes thread interdependencies within wavefronts
          ‒ Typical AAA game runs 1000s of biquads at various stages in the filtergraph
      ▪ Workloads may require batch voice processing to achieve high CU efficiency
          ‒ Build 2D grids (channels x samples) or 3D grids (channels x subbands x samples)
          ‒ Swizzling is key but watch out for runtime cost as SIMD widens (static vs dynamic)
      ▪ Batch processing goes against the free-form Max/MSP model artists are pushing for
          ‒ Unique DSP chain for each sound “just because we can!”
          ‒ Data driven filtergraph and DSP pipeline
      ▪ Complex prioritized scheduling & dispatching compute queues
          ‒ Do not prevent intermittent CU saturation caused by large graphics workloads
          ‒ Risky for low latency direct path audio DSP
      ▪ Proprietary hardware, drivers and shader compilers (PSSL)
          ‒ Audio middleware will need some incentive to move up there
          ‒ Most will probably stay on the CPU
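The recursion problem and the batching workaround can be sketched as follows. A biquad's output at sample n depends on its own outputs at n-1 and n-2, so the samples of one voice cannot be spread across SIMD lanes. Instead, many independent voices are swizzled into a sample-major (channels x samples) grid so that each lane runs a different voice in lockstep. The struct and function names, the Direct Form I structure, and the single coefficient set shared by all voices are assumptions made for illustration; the talk does not specify an implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct BiquadCoeffs { float b0, b1, b2, a1, a2; };          // a0 normalized to 1
struct BiquadState  { float x1 = 0.0f, x2 = 0.0f, y1 = 0.0f, y2 = 0.0f; };

// One Direct Form I step. The result feeds back through y1/y2, which is the
// data recursion that serializes the samples of a single voice.
inline float biquadTick(const BiquadCoeffs& c, BiquadState& s, float x) {
    const float y = c.b0 * x + c.b1 * s.x1 + c.b2 * s.x2 - c.a1 * s.y1 - c.a2 * s.y2;
    s.x2 = s.x1; s.x1 = x;
    s.y2 = s.y1; s.y1 = y;
    return y;
}

// Reference path: one voice, strictly sequential over samples, in place.
void biquadSerial(const BiquadCoeffs& c, BiquadState& s, float* buf, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) buf[i] = biquadTick(c, s, buf[i]);
}

// Swizzle channel-major [voice][sample] into a sample-major grid
// [sample][voice], so adjacent memory holds the same sample of adjacent voices.
std::vector<float> swizzle(const std::vector<std::vector<float>>& voices,
                           std::size_t numSamples) {
    const std::size_t numVoices = voices.size();
    std::vector<float> grid(numSamples * numVoices);
    for (std::size_t v = 0; v < numVoices; ++v)
        for (std::size_t n = 0; n < numSamples; ++n)
            grid[n * numVoices + v] = voices[v][n];
    return grid;
}

// Batched path: the outer loop over samples stays serial, but the inner loop
// over voices has no cross-iteration dependency and can map onto SIMD lanes.
void biquadBatch(const BiquadCoeffs& c, std::vector<BiquadState>& states,
                 std::vector<float>& grid, std::size_t numSamples) {
    const std::size_t numVoices = states.size();
    assert(grid.size() == numSamples * numVoices);
    for (std::size_t n = 0; n < numSamples; ++n)
        for (std::size_t v = 0; v < numVoices; ++v)
            grid[n * numVoices + v] = biquadTick(c, states[v], grid[n * numVoices + v]);
}
```

The swizzle itself is extra work on every block, which is the runtime cost the slide warns about: it pays off only when enough voices share the same DSP chain to fill the lanes.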
  • 7. GAME AUDIO DSP ON JAGUAR: WHY?
      ▪ Well known and open x64 dev environment
          ‒ Middleware friendly
          ‒ CLANG/LLVM solid & stable
      ▪ Full FP unit with SSE4 support
      ▪ Early PA is surprisingly good for compiled intrinsics code
          ‒ ~10% slower than Core i7 @ same clock rate
          ‒ GDDR5 latency is not an issue
          ‒ < ~50% of 1 core @ 1.6GHz running the entire KZSF filtergraph
      ▪ Only reliable solution for ultra low latency
          ‒ Music and rhythm games
          ‒ Run 100% on CPU (including decoding)
  • 8. GAME AUDIO DSP ON JAGUAR: WHY NOT?
      ▪ “Weak laptop CPU” compared to top of the line on desktop
          ‒ No FMA4
          ‒ Slow clock @ 1.6GHz (compared to typical desktop)
      ▪ 256-bit AVX mostly useless
      ▪ Possible bottleneck down the line
  • 9. GAME ENGINE CODE: THIN COMPUTE
      ▪ 3D audio
          ‒ Sound emitters (distance, directionality and size modeling)
          ‒ Sound listeners (mic and ear modeling)
          ‒ Sound geometry (collision meshes)
          ‒ Deeper physical modeling of sound propagation
          ‒ Simple ray casting (occlusion, obstruction, indirect audio)
          ‒ Advanced ray casting (diffraction, real-time individual early reflection tracking)
      ▪ Physics
          ‒ Rigid body dynamics (collisions, friction, destruction)
          ‒ Fluid dynamics (turbulence)
      ▪ Animation, special FX
          ‒ Inline audio sequencing and modulation
          ‒ Foley, coarse granular synthesis
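A minimal sketch of the "simple ray casting" case: test whether the straight path from an emitter to the listener is blocked by an occluder. The slide names collision meshes as the sound geometry; the bounding-sphere occluders, the `Vec3` and `Sphere` types, and the `isOccluded` function below are simplifying assumptions chosen to keep the example short.

```cpp
#include <algorithm>
#include <vector>

struct Vec3   { float x, y, z; };
struct Sphere { Vec3 center; float radius; };   // hypothetical stand-in for a collision mesh

inline Vec3  sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// True if the segment from a to b passes through the sphere. Finds the point
// on the segment closest to the sphere center and compares squared distances.
bool segmentHitsSphere(const Vec3& a, const Vec3& b, const Sphere& s) {
    const Vec3  ab   = sub(b, a);
    const Vec3  ac   = sub(s.center, a);
    const float len2 = dot(ab, ab);
    float t = 0.0f;
    if (len2 > 0.0f) {
        // Clamp the projection so the closest point stays on the segment.
        t = std::max(0.0f, std::min(1.0f, dot(ac, ab) / len2));
    }
    const Vec3 closest{a.x + t * ab.x, a.y + t * ab.y, a.z + t * ab.z};
    const Vec3 d = sub(s.center, closest);
    return dot(d, d) <= s.radius * s.radius;
}

// The direct path is occluded if any occluder intersects the segment.
bool isOccluded(const Vec3& emitter, const Vec3& listener,
                const std::vector<Sphere>& occluders) {
    return std::any_of(occluders.begin(), occluders.end(),
                       [&](const Sphere& s) { return segmentHitsSphere(emitter, listener, s); });
}
```

Each emitter/listener pair is tested independently of the others, so a batch of such queries could be issued as one parallel workload.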
  • 10. CONCLUSIONS
      ▪ HSA + hUMA is a great combo for high perf game audio!
          ‒ Maximized perf per W from specialized hardware (CPU + GPU + ACP)
          ‒ Our challenge is to figure out what to run where and when
      ▪ ACP is a great fit for codecs and OS services
          ‒ But not for modular synthesis and highly customized DSP pipelines
      ▪ GPU is a great fit for mid/high latency DSP and high level 3D thin compute
          ‒ Indirect (reflected) audio
          ‒ Convolution reverb
          ‒ 3D ray casting for occlusion/obstruction/diffraction
      ▪ CPU is still the best fit for everything else:
          ‒ Open modular synthesis frameworks and middleware
          ‒ Low latency audio
  • 11. AUDIO SYNTHESIZER SCHEDULING IN HSA
  • 12. DISCLAIMER & ATTRIBUTION
      The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
      The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.
      AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
      AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
      ATTRIBUTION
      © 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. SPEC is a registered trademark of the Standard Performance Evaluation Corporation (SPEC). Other names are for informational purposes only and may be trademarks of their respective owners.