/*
 * SPU Assisted Rendering.
 */
Steven Tovey & Stephen McAuley
Graphics Programmers, Bizarre Creations Ltd.
steven.tovey@bizarrecreations.com
stephen.mcauley@bizarrecreations.com
http://www.bizarrecreations.com
/* Welcome! */
/* Agenda */
Car Lighting
Part II (w/ Stephen McAuley): Fragment Shading Parallelisation
Case Study: Pre-pass Lighting on SPUs
/*
 * Part I w/ Steven Tovey
 */
SPU Acceleration of Car Rendering in Blur
/* What is SPU AR? I */
Why do this?
Free up RSX™ to do other things.
Enable otherwise unfeasible techniques.
Optimise rendering.
/* What is SPU AR? II */
Synchronisation.
Optimising SPU modules.
Memory considerations:
Local store
Resource allocation
Etc.
/* Case Study: Cars I */
Totally GPU-based.
2x VTF (vertex texture fetch: volume & 2D) for damage.
Large amount of work in the vertex shader, making cars in Blur heavily vertex-bound.
All lighting in the pixel shader.
/* Case Study: Cars II */
/* Case Study: Cars III */
/* Case Study: Cars IV */
/* Case Study: Cars V */
/* Case Study: Cars VI */
Increase rendering speed of cars.
Maintain same quality.
/* Damage: Solution */
Large parts are SPU based.
On demand.
Sync-free.
Deferred.
Work split between GPU/SPU.
/* Damage: Data I */
Read-only car vertex data.
Shared between similar cars.
SPU-modified damage vertex data.
Per instance.
One-to-one mapping of vertices.
Control points:
Crude approximation of volume preservation.
Dent/scratch blend levels.
/* Damage: Data II */ Stream0 Stream1 Position SPU_Position Normal UV0 SPU_Normal UV1 PosOffset NormalOffset ControlPoints AO
/* Damage: Data III */
If the vertex format is exactly 16 bytes, a vertex can be changed atomically from the SPU.
If you can live with the odd vertex being wrong for a frame, this could be a huge win!
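As a rough illustration of the 16-byte point, the sketch below packs a damaged position and normal into exactly one quadword and checks the size at compile time. The `DamageVertex` layout (half-float position, signed 16-bit normal) and the `publish_vertex` helper are hypothetical, not the format used in Blur; on the SPU the write-back would be a single aligned 16-byte transfer, for which `memcpy` is only a stand-in.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte SPU-written vertex; the layout is an assumption. */
typedef struct DamageVertex {
    uint16_t pos[4];   /* x, y, z as half floats, w is padding */
    int16_t  nrm[4];   /* x, y, z as signed normalised shorts, w is padding */
} DamageVertex;

/* Compile-time check that one vertex is exactly one quadword. */
_Static_assert(sizeof(DamageVertex) == 16, "vertex must be exactly 16 bytes");

/* Publish one finished vertex with a single 16-byte copy.
 * On the SPU this would be one aligned 16-byte DMA put; memcpy is a stand-in. */
static void publish_vertex(DamageVertex *dst, const DamageVertex *src)
{
    memcpy(dst, src, sizeof *src);
}
```

If the format grows past 16 bytes, the static assertion fails at build time, which is a cheap way to notice that the single-transfer assumption no longer holds.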
/* Damage: Data IV */
[Diagram labels: SPU, RSX, Local, Main, Write-only Vertices, Read-only Vertices]
/* Damage: Events */
Note: There is no link to the player health; damage is purely superficial.
[Diagram labels: Game Code, Impact (x6)]
/* Damage: Data V */
[Diagram labels: Impact (x6), Constants, SPU, GPU, Write-only Vertices*, Read-only Vertices*]
* w.r.t. the SPU
/* Damage: Data VI */
[Diagram labels: SPU, GPU, Write-only Vertices*, Read-only Vertices*]
* w.r.t. the SPU
/* Damage: Control */
Kick off SPU tasks.
[Timeline diagram, built up over several slides. Labels in final state: Flag, Vertex Work (x4), Other Work(1), Other Work(2), Other Work(1), PPU Damage, PPU Damage]
/* Damage: SPU I */
We favour si-style intrinsics for simplicity and ease.
/* de-code into IEEE754-ish 32bit float (meh): */
qword sign_bit      = si_and(result, sign_bit_mask);
sign_bit            = si_shli(sign_bit, 0x10);           /* move 16 bits into correct place. */
qword significand   = si_and(result, mant_bit_mask);
significand         = si_shli(significand, 0xd);
qword is_zero_mask  = si_cgti(significand, 0x0);         /* all bits set if non-zero. */
expo_bias           = si_and(is_zero_mask, expo_bias);
qword exponent_bias = si_a(significand, expo_bias);      /* move expo up range, 0x07800000=>0x3f800000. */
exponent_bias       = si_or(exponent_bias, sign_bit);
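For readers without an SPU toolchain, the same decode can be followed in scalar C. This is a sketch that mirrors the slide step by step for one value; the mask and bias constants (`0x8000`, `0x7fff`, `0x38000000`) are assumptions inferred from the shift amounts and the `0x07800000=>0x3f800000` comment, and the function name `half_to_float_approx` is hypothetical. Like the slide ("IEEE754-ish"), it ignores denormals, infinities and NaNs.

```c
#include <stdint.h>
#include <string.h>

/* Scalar mirror of the slide's half -> float decode (no denormal/Inf/NaN handling). */
static float half_to_float_approx(uint16_t h)
{
    uint32_t sign = ((uint32_t)h & 0x8000u) << 16;   /* sign bit 15 -> bit 31 */
    uint32_t rest = ((uint32_t)h & 0x7fffu) << 13;   /* exponent + mantissa into float position */
    uint32_t bias = rest ? 0x38000000u : 0u;         /* rebias the exponent, but keep zero as zero */
    uint32_t bits = (rest + bias) | sign;
    float f;
    memcpy(&f, &bits, sizeof f);                     /* reinterpret the bit pattern as a float */
    return f;
}
```

For example, the half value `0x3c00` becomes `0x07800000` after the shift and `0x3f800000` (1.0f) after the bias is added, matching the comment on the slide.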
/* Damage: SPU II */
The GPU version relied on bilinear filtering of the volume texture to smooth damage.
Filtering on SPU is a bit of a pain.
How do we work out which events affect which vertices?
/* Damage: SPU III */
Two-stage transform:
1. Get data in volume texture-ish format.
2. Apply the transform to all vertices.
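The two stages can be sketched in portable C as below. This is an illustration only, not the Blur implementation: the `Impact` event fields, the grid resolution `GRID_N`, the quadratic falloff and the nearest-cell lookup are all assumptions chosen to keep the example short, and positions are assumed to be normalised to the unit cube.

```c
#include <stddef.h>

#define GRID_N 4  /* hypothetical grid resolution per axis */

typedef struct { float x, y, z; } Vec3;
typedef struct { Vec3 centre; float radius; Vec3 push; } Impact;   /* hypothetical event */
typedef struct { Vec3 cell[GRID_N][GRID_N][GRID_N]; } DamageGrid;  /* "volume texture-ish" data */

/* Stage 1: accumulate impact events into a coarse grid of offsets over [0,1]^3. */
static void build_damage_grid(const Impact *ev, size_t n_ev, DamageGrid *g)
{
    for (int k = 0; k < GRID_N; ++k)
    for (int j = 0; j < GRID_N; ++j)
    for (int i = 0; i < GRID_N; ++i) {
        Vec3 c = { (i + 0.5f) / GRID_N, (j + 0.5f) / GRID_N, (k + 0.5f) / GRID_N };
        Vec3 sum = { 0.0f, 0.0f, 0.0f };
        for (size_t e = 0; e < n_ev; ++e) {
            float dx = c.x - ev[e].centre.x;
            float dy = c.y - ev[e].centre.y;
            float dz = c.z - ev[e].centre.z;
            float d2 = dx * dx + dy * dy + dz * dz;
            float r2 = ev[e].radius * ev[e].radius;
            if (d2 < r2) {
                float w = 1.0f - d2 / r2;   /* simple falloff, an assumption */
                sum.x += w * ev[e].push.x;
                sum.y += w * ev[e].push.y;
                sum.z += w * ev[e].push.z;
            }
        }
        g->cell[k][j][i] = sum;
    }
}

/* Map a normalised coordinate to a clamped cell index. */
static int cell_index(float t)
{
    int i = (int)(t * GRID_N);
    if (i < 0) i = 0;
    if (i > GRID_N - 1) i = GRID_N - 1;
    return i;
}

/* Stage 2: apply the grid to every vertex (nearest cell, unfiltered). */
static void apply_damage_grid(const DamageGrid *g, const Vec3 *in, Vec3 *out, size_t n)
{
    for (size_t v = 0; v < n; ++v) {
        Vec3 d = g->cell[cell_index(in[v].z)][cell_index(in[v].y)][cell_index(in[v].x)];
        out[v].x = in[v].x + d.x;
        out[v].y = in[v].y + d.y;
        out[v].z = in[v].z + d.z;
    }
}
```

Splitting the work this way means the per-vertex loop never looks at the event list, so its cost is independent of how many impacts occurred in the frame.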
/* Damage: SPU IV */
Software bilinear filtering.
Some interesting instructions in the ISA will help here.
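A minimal scalar sketch of software bilinear filtering is shown below, on a 2D single-channel grid for brevity (a volume would add a third interpolation axis). The function names, the clamp-to-edge addressing and the convention that texel centres sit on integer coordinates are assumptions; an SPU version would be vectorised with intrinsics rather than written like this.

```c
#include <stddef.h>

/* Clamp an integer texel coordinate to [0, n-1]. */
static size_t clamp_index(long i, size_t n)
{
    if (i < 0) return 0;
    if ((size_t)i >= n) return n - 1;
    return (size_t)i;
}

/* Bilinearly sample a w x h grid at (x, y) in texel units, clamp-to-edge. */
static float sample_bilinear(const float *tex, size_t w, size_t h, float x, float y)
{
    if (x < 0.0f) x = 0.0f;
    if (y < 0.0f) y = 0.0f;

    long x0 = (long)x;               /* floor, valid because x >= 0 */
    long y0 = (long)y;
    float fx = x - (float)x0;        /* fractional position between texels */
    float fy = y - (float)y0;

    size_t xa = clamp_index(x0, w), xb = clamp_index(x0 + 1, w);
    size_t ya = clamp_index(y0, h), yb = clamp_index(y0 + 1, h);

    /* Interpolate along x on both rows, then along y between the rows. */
    float top = tex[ya * w + xa] + fx * (tex[ya * w + xb] - tex[ya * w + xa]);
    float bot = tex[yb * w + xa] + fx * (tex[yb * w + xb] - tex[yb * w + xa]);
    return top + fy * (bot - top);
}
```

Sampling the 2x2 grid `{0, 1, 2, 3}` at `(0.5, 0.5)` gives the average of the four texels, 1.5.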
/* Damage: Lessons I */
Process in 16KB chunks.
Multi-buffer input and output.
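The chunking and multi-buffering pattern can be sketched in portable C as follows. Here `fetch_chunk` and `store_chunk` are hypothetical stand-ins built on `memcpy`; on an SPU they would be asynchronous DMA transfers, so the fetch of the next chunk would genuinely overlap the processing of the current one. The per-chunk work (`process_chunk`, a simple scale) is also an assumption.

```c
#include <stddef.h>
#include <string.h>

#define CHUNK_BYTES  (16 * 1024)                     /* 16KB per transfer */
#define CHUNK_FLOATS (CHUNK_BYTES / sizeof(float))

/* Stand-ins for DMA get/put; on an SPU these would be asynchronous. */
static void fetch_chunk(float *local, const float *main_mem, size_t count)
{
    memcpy(local, main_mem, count * sizeof(float));
}

static void store_chunk(float *main_mem, const float *local, size_t count)
{
    memcpy(main_mem, local, count * sizeof(float));
}

/* Hypothetical per-chunk work: scale every value. */
static void process_chunk(const float *in, float *out, size_t count, float scale)
{
    for (size_t i = 0; i < count; ++i)
        out[i] = in[i] * scale;
}

/* Walk the stream in 16KB chunks, alternating between two buffer pairs. */
static void process_stream(const float *src, float *dst, size_t total, float scale)
{
    static float in_buf[2][CHUNK_FLOATS];
    static float out_buf[2][CHUNK_FLOATS];
    size_t offset = 0;
    size_t cur_count = total < CHUNK_FLOATS ? total : CHUNK_FLOATS;
    int cur = 0;

    if (total == 0)
        return;
    fetch_chunk(in_buf[cur], src, cur_count);        /* prime the first buffer */

    while (cur_count > 0) {
        size_t next_offset = offset + cur_count;
        size_t remaining = total - next_offset;
        size_t next_count = remaining < CHUNK_FLOATS ? remaining : CHUNK_FLOATS;

        /* Start fetching the next chunk before working on the current one. */
        if (next_count > 0)
            fetch_chunk(in_buf[cur ^ 1], src + next_offset, next_count);

        process_chunk(in_buf[cur], out_buf[cur], cur_count, scale);
        store_chunk(dst + offset, out_buf[cur], cur_count);

        offset = next_offset;
        cur_count = next_count;
        cur ^= 1;                                    /* swap buffer pairs */
    }
}
```

With real asynchronous transfers, the loop would also wait on the relevant transfer before reusing a buffer; that synchronisation is omitted here because `memcpy` completes immediately.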
/* Damage: Lessons II */
x y z w    x x x x
x y z w    y y y y
x y z w    z z z z
x y z w    w w w w
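The diagram shows four xyzw vectors being rearranged so that each register holds one component of four vertices. A scalar sketch of that transpose is given below; the type and function names (`Vec4`, `Vec4x4`, `aos_to_soa`) are hypothetical, and on the SPU the same rearrangement would be done with shuffle instructions rather than a loop.

```c
/* One vertex per struct: array-of-structures layout. */
typedef struct { float x, y, z, w; } Vec4;

/* One component per array: structure-of-arrays layout for four vertices. */
typedef struct { float x[4], y[4], z[4], w[4]; } Vec4x4;

/* Transpose four xyzw vectors into xxxx / yyyy / zzzz / wwww form. */
static Vec4x4 aos_to_soa(const Vec4 in[4])
{
    Vec4x4 out;
    for (int i = 0; i < 4; ++i) {
        out.x[i] = in[i].x;
        out.y[i] = in[i].y;
        out.z[i] = in[i].z;
        out.w[i] = in[i].w;
    }
    return out;
}
```

Once the data is in this form, an operation such as a dot product can process four vertices at once with no wasted lanes.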
/* Damage: Lessons III */
We added some of the per-vertex lighting calculations, for brake lights for example.
/* Damage: Results */
SPU-generated cube maps.
40 in total (accounting for double buffer).
8x8 per face.
Deferred.
Work split between GPU/SPU.
Cars are lit with a mixture of things:
SH (world + dynamic)
Cube map lighting

Editor's notes

  1. Local space position of vertex
  2. Normal
  3. Couple of sets of UVs.
  4. Morph-targets.
  5. Control point index into an array of curves.
  6. Spherical Harmonic
  7. Damage data... Position offset, Normal offset, scratch and dent levels.
  8. Explain why 16KB chunks, MFC max per transfer.
  9. Don’t want lumpiness if parallel read/write.
  10. Don’t want lumpiness if parallel read/write.
  11. Rim lighting here.
  12. Tyres use low-power specular.
  13. Brake lights.
  14. Used on alloys for low-power specular.
  15. Used for the scratch lighting, again for low-power specular.
  16. We need to look at the pipeline of the graphics card to work out how we can move more of our GPU work onto the SPUs. Two main areas we can insert data – either through vertices at the top, or textures at the fragment stage. Sadly, we can’t hook into the rasteriser, which would be ace.
  17. Of course, these look-up textures end up being screen-space look-up textures, which means some sort of deferred rendering…
  18. I have a problem with forward rendering. I think most people traditionally design their engine this way, especially on 360 and PC. But all the work is done in the fragment shader, so when you port to the PS3 with a slower fragment shader unit, your whole game runs slower. Although you can use EDGE to speed up your vertex processing and your post processing, they both only step around the core of the issue that you’re fragment shader bound and there’s no easy way of solving it.
  19. We found a light pre-pass renderer suited our goals pretty well. It’s a halfway house between traditional and deferred rendering.
  20. We render a rear-view mirror, cube map reflections for the cars and planar reflections for the road and water in addition to the pre-pass and main views. Multi-threaded rendering helps a lot!
  21. Deferring by a frame isn’t ideal. Either you just use the previous frame’s lighting buffer for the next frame, with obvious artefacts (especially if you’re doing a racing game like us), or you have to add a frame of latency. I don’t think adding frames of latency is ideal, especially for cross-platform games. If you add a frame of latency on the PS3, are you going to do the same on the 360? If you’re not, then gameplay could differ between the two platforms. I’m not saying this is something I’d never do; I think in lots of circumstances you’ll have to. But avoid it where you can, and this is one instance.
  22. If we wanted to take this further for future projects, we could add shadow maps in at the start of our pipeline, then do an exponential blur on the SPUs whilst we’re rendering the pre-pass geometry…
  23. This is real multi-threaded graphics processing, with multiple processors doing different jobs at the same time. Therefore, architect your engine accordingly! Having small graphics jobs allows you to spread the workload. Obviously, not everything can be done like this. Some things, such as post-processing or MLAA, will most likely have to be deferred a frame, adding a frame of latency. But there are lots of smaller tasks that don’t have to be, from SSAO to blurring exponential shadow maps. You have to find things to parallelise with! Think about the data again! Rendering has lots of stages, each with its own inputs and outputs. What could sync with what?
  24. We combine the normals and depth into one 32-bit buffer. This is an optimisation as it halves the inputs into the SPU program, but also allows us to keep the depth buffer in local memory which is good for performance.
  25. The first step, but the biggest stumbling block!
  26. No blocking! Our jobs are optionally dependent on a label.
  27. To be accurate, we have a jump-to-self per SPU.
  28. When we load in a tile, we quickly iterate over every pixel and calculate minimum and maximum depth. No need to use a stencil buffer to cull out the sky as depth min and max will do it for us. (Remember, we don’t have the stencil buffer as we’re not using the depth buffer!) This technique is really useful for a variety of things, including depth of field (check out Matt Swoboda’s optimisation in PhyreEngine).
  29. This is actually the easiest bit. Just write the lighting equations in intrinsics! However, they really have to be fast, otherwise performance just won’t be good enough. Next are some helpful tips for optimisation.
  30. So we triple buffer. It turns out that we have plenty of local store left, as it’s a simple job and our job size was relatively small. Another reason to write in si intrinsics though, as it keeps the code size down!
  31. Just like Ste said earlier, this is a big win. Probably a good rule of thumb for most SPU jobs!
  32. Just like Ste said earlier, this is a big win. Probably a good rule of thumb for most SPU jobs!
  33. When kicking SPU jobs off on the RSX, you have to be careful as you can interfere with jobs the PPU is running. This is where sync-free systems are a win! We’re lucky as we just avoided the physics, but also, running only on 3 SPUs was a good idea so we had 3 free for other tasks. See how quick the rendering is even though we’re rendering so many views!
  34. Apologies for the shameless self-promotion!
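
The per-tile depth range described in note 28 can be sketched in scalar C as below. The names (`DepthBounds`, `tile_depth_bounds`, `tile_is_all_sky`) are hypothetical, and the sketch assumes linear float depth in [0, 1] with the sky at exactly 1.0, which the notes do not specify; the SPU version would operate on the packed normal/depth buffer and use intrinsics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { float min_depth, max_depth; } DepthBounds;

/* Scan every pixel of a tile once and record the nearest and farthest depth. */
static DepthBounds tile_depth_bounds(const float *depth, size_t count)
{
    assert(count > 0);
    DepthBounds b = { depth[0], depth[0] };
    for (size_t i = 1; i < count; ++i) {
        if (depth[i] < b.min_depth) b.min_depth = depth[i];
        if (depth[i] > b.max_depth) b.max_depth = depth[i];
    }
    return b;
}

/* A tile is all sky when even its nearest pixel sits on the far plane
 * (assumption: far plane depth is exactly 1.0). */
static bool tile_is_all_sky(DepthBounds b)
{
    return b.min_depth >= 1.0f;
}
```

A tile that reports all sky can skip lighting entirely, and the same bounds can be used to reject lights whose depth extent does not overlap the tile.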