Introduction Stable Fluids NVIDIA Compute Unified Device Architecture (CUDA) Results Conclusions
CUDA-based Linear Solvers for Stable Fluids
G. Amador and A. Gomes
Departamento de Informática
Universidade da Beira Interior
Covilhã, Portugal
m1420@ubi.pt, agomes@di.ubi.pt
April, 2010
1 Introduction
2 Stable Fluids
The Eulerian approach
Physics Model
3 NVIDIA Compute Unified Device Architecture (CUDA)
Workflow
Iterative solvers
Jacobi
Gauss-Seidel red-black
Conjugate gradient
4 Results
Jacobi performance
Gauss-Seidel performance
Conjugate gradient performance
5 Conclusions
Conclusions
Future Work
Overview
The study of fluid simulation (e.g., water) is important for two industries, one real-time (≥ 30 fps) and one off-line (≤ 30 fps).
Problems:
How to implement CUDA-based versions of the Jacobi, Gauss-Seidel, and conjugate gradient iterative solvers, specifically for 3D stable fluids?
What are the real-time performance limitations of these solver implementations?
The Eulerian approach
Space partitioning:
Variations of velocity and density are observed at the center of each cell.
Velocities and densities are updated through an implicit method (Stam, Stable Fluids, 1999), i.e., one that is unconditionally stable for any time step.
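A minimal sketch of how such a cell-centered grid might be stored, assuming a flat array with one layer of boundary cells around `n` interior cells per axis on a unit cube. The names `make_grid`, `ix`, and `cell_center` are hypothetical and are not taken from the slides.

```python
def make_grid(n, fill=0.0):
    """Flat list for an (n+2)^3 grid: n interior cells per axis plus a boundary layer."""
    side = n + 2
    return [fill] * (side * side * side)


def ix(i, j, k, n):
    """Map 3D cell coordinates to an index in the flat list (x varies fastest)."""
    side = n + 2
    return i + side * (j + side * k)


def cell_center(i, j, k, n):
    """Position of the center of interior cell (i, j, k), with i, j, k in 1..n,
    assuming the n interior cells span the unit cube."""
    h = 1.0 / n
    return ((i - 0.5) * h, (j - 0.5) * h, (k - 0.5) * h)


density = make_grid(2)
density[ix(1, 2, 2, 2)] = 1.0  # set the density sampled at one cell center
```

A flat layout like this is a common choice because it maps directly onto a single linear buffer; whether the authors used it is not stated in the slides.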
Physics Model
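Navier-Stokes equations for incompressible fluids
Mass conservation: $\nabla \cdot \vec{u} = 0$
Velocity evolution: $\frac{\partial \vec{u}}{\partial t} = -(\vec{u} \cdot \nabla)\vec{u} + v \nabla^2 \vec{u} + \vec{f}$
Density evolution: $\frac{\partial \rho}{\partial t} = -(\vec{u} \cdot \nabla)\rho + k \nabla^2 \rho + S$
$\vec{u}$: velocity field.
$v$: fluid's viscosity.
$\rho$: density of the field.
$k$: density diffusion rate.
$\vec{f}$: external forces added to the velocity field.
$S$: external sources added to the density field.
$\nabla = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$: gradient.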
Navier-Stokes equations implementation
Update velocity:
Add external forces ($\vec{f}$).
Velocity diffusion ($v \nabla^2 \vec{u}$).
Move ($-(\vec{u} \cdot \nabla)\vec{u}$ and $\nabla \cdot \vec{u} = 0$).
Update density:
Add external sources ($S$).
Density advection ($-(\vec{u} \cdot \nabla)\rho$).
Density diffusion ($k \nabla^2 \rho$).
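The density update listed above can be sketched as a sequence of independent sub-steps applied one after another. This is only an illustration of the ordering: `add_source` and `density_step` are hypothetical names, and the advection and diffusion routines are passed in as placeholders because the slides do not show them.

```python
def add_source(x, s, dt):
    """Add the external source s to field x, scaled by the time step (in place)."""
    for i in range(len(x)):
        x[i] += dt * s[i]


def density_step(rho, source, dt, advect, diffuse):
    """Apply the three density sub-steps in the order listed on the slide.

    advect and diffuse are caller-supplied callables (field, dt) -> field;
    they stand in for routines that are not shown in the slides.
    """
    add_source(rho, source, dt)
    rho = advect(rho, dt)
    rho = diffuse(rho, dt)
    return rho


# Usage with identity placeholders, so only the source term acts.
identity = lambda field, dt: field
result = density_step([0.0, 1.0], [2.0, 0.0], 0.5, identity, identity)
```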
Diffusion
Exchanges of density or velocity between neighbours (2D).
Solve a sparse linear system (Ax = b), using an iterative method (e.g., Jacobi, Gauss-Seidel, conjugate gradient, etc.).
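As a CPU reference for the idea, here is a minimal 2D Jacobi sketch. It assumes the implicit diffusion system takes the usual five-point form $(1 + 4a)\,x_{i,j} - a \sum_{\text{neighbours}} x = x^0_{i,j}$, where `a` lumps together the diffusion rate, time step, and grid spacing. That form, the function name, and the fixed boundary ring are assumptions, not details taken from the slides.

```python
def jacobi_diffuse(x0, a, iterations):
    """Approximate the solution of (1 + 4a) x[i][j] - a * (sum of 4 neighbours) = x0[i][j].

    x0 is a list of rows that includes a boundary ring; boundary cells keep
    their input values. Each sweep reads only values from the previous sweep.
    """
    rows, cols = len(x0), len(x0[0])
    x = [row[:] for row in x0]
    for _ in range(iterations):
        new = [row[:] for row in x]
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                neighbours = x[i - 1][j] + x[i + 1][j] + x[i][j - 1] + x[i][j + 1]
                new[i][j] = (x0[i][j] + a * neighbours) / (1.0 + 4.0 * a)
        x = new
    return x


# One unit of density in the middle of a 5x5 grid (3x3 interior).
grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 1.0
after_one_sweep = jacobi_diffuse(grid, 0.25, 1)
```

Because every cell update depends only on the previous sweep, all cells can be updated independently, which is what makes Jacobi a natural fit for one-thread-per-cell execution.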
Move
Ensure mass conservation and the fluid's incompressibility.
Hodge decomposition:
Mass-conserving field = our field - gradient field
Determine the gradient field with the same kind of iterative method used for diffusion (e.g., Jacobi, Gauss-Seidel, conjugate gradient, etc.).
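The decomposition above can be sketched in 2D as three passes: compute the divergence of the current field, solve a Poisson equation for a scalar field `p`, and subtract the gradient of `p`. The sketch below assumes a collocated grid with unit spacing, central differences, `p = 0` on the boundary ring, and plain Gauss-Seidel sweeps for the Poisson solve. These are illustrative choices and `project` is a hypothetical name.

```python
def project(u, v, iterations=50):
    """Return copies of (u, v) with the gradient of p subtracted.

    u[i][j] and v[i][j] are the velocity components along the first and
    second index; boundary cells are left unchanged.
    """
    rows, cols = len(u), len(u[0])
    u_new = [row[:] for row in u]
    v_new = [row[:] for row in v]
    div = [[0.0] * cols for _ in range(rows)]
    p = [[0.0] * cols for _ in range(rows)]

    # 1. Divergence of the current field (central differences, h = 1).
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            div[i][j] = 0.5 * (u[i + 1][j] - u[i - 1][j] + v[i][j + 1] - v[i][j - 1])

    # 2. Solve laplacian(p) = div with Gauss-Seidel sweeps.
    for _ in range(iterations):
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                p[i][j] = (p[i - 1][j] + p[i + 1][j] + p[i][j - 1] + p[i][j + 1] - div[i][j]) / 4.0

    # 3. Subtract the gradient of p from the velocity field.
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            u_new[i][j] = u[i][j] - 0.5 * (p[i + 1][j] - p[i - 1][j])
            v_new[i][j] = v[i][j] - 0.5 * (p[i][j + 1] - p[i][j - 1])
    return u_new, v_new


# A uniform field is already divergence-free, so projection leaves it unchanged.
u = [[1.0] * 5 for _ in range(5)]
v = [[2.0] * 5 for _ in range(5)]
u_proj, v_proj = project(u, v)
```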
Workflow
Iterative solvers
Jacobi
Gauss-Seidel red-black
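The slide's figure is not available here, so the following is only a CPU sketch of the general red-black idea, not the authors' kernel. Cells are coloured like a checkerboard; each sweep updates all cells of one colour, then all cells of the other. It assumes the same five-point system as implicit diffusion, $(1 + 4a)\,x_{i,j} - a \sum_{\text{neighbours}} x = x^0_{i,j}$, and the function name is hypothetical.

```python
def red_black_gauss_seidel(x0, a, iterations):
    """In-place Gauss-Seidel with red-black ordering on a 2D grid with a boundary ring.

    Cells with (i + j) even are updated first, then cells with (i + j) odd.
    A cell's four neighbours always have the opposite colour, so cells of
    one colour do not depend on each other within a half-sweep.
    """
    rows, cols = len(x0), len(x0[0])
    x = [row[:] for row in x0]
    for _ in range(iterations):
        for colour in (0, 1):
            for i in range(1, rows - 1):
                for j in range(1, cols - 1):
                    if (i + j) % 2 == colour:
                        neighbours = x[i - 1][j] + x[i + 1][j] + x[i][j - 1] + x[i][j + 1]
                        x[i][j] = (x0[i][j] + a * neighbours) / (1.0 + 4.0 * a)
    return x


grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 1.0
swept = red_black_gauss_seidel(grid, 0.25, 1)
```

The second half-sweep already sees the values written by the first, which is the difference from Jacobi, while each half-sweep remains free of dependencies between the cells it updates.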
Conjugate gradient
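The slide's figure is not available here. As a reference, this is a textbook, matrix-free conjugate gradient sketch on the CPU, applied to the same assumed five-point system $(1 + 4a)\,x_{i,j} - a \sum_{\text{neighbours}} x = b_{i,j}$. Unlike a grid with a boundary ring, this grid holds interior cells only and treats cells outside it as zero. All names are hypothetical.

```python
def apply_a(x, a):
    """Matrix-free product A x; cells outside the grid count as zero."""
    rows, cols = len(x), len(x[0])
    y = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            s = 0.0
            if i > 0:
                s += x[i - 1][j]
            if i < rows - 1:
                s += x[i + 1][j]
            if j > 0:
                s += x[i][j - 1]
            if j < cols - 1:
                s += x[i][j + 1]
            y[i][j] = (1.0 + 4.0 * a) * x[i][j] - a * s
    return y


def dot(p, q):
    """Inner product of two 2D grids."""
    return sum(pv * qv for pr, qr in zip(p, q) for pv, qv in zip(pr, qr))


def conjugate_gradient(b, a, max_iterations=100, tol=1e-20):
    """Solve A x = b starting from x = 0; tol bounds the squared residual norm."""
    rows, cols = len(b), len(b[0])
    x = [[0.0] * cols for _ in range(rows)]
    r = [row[:] for row in b]  # residual b - A*0
    p = [row[:] for row in r]  # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iterations):
        if rs_old < tol:
            break
        ap = apply_a(p, a)
        alpha = rs_old / dot(p, ap)
        for i in range(rows):
            for j in range(cols):
                x[i][j] += alpha * p[i][j]
                r[i][j] -= alpha * ap[i][j]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        for i in range(rows):
            for j in range(cols):
                p[i][j] = r[i][j] + beta * p[i][j]
        rs_old = rs_new
    return x


x_true = [[1.0, 2.0], [3.0, 4.0]]
b = apply_a(x_true, 0.25)
x_solved = conjugate_gradient(b, 0.25)
```

Each iteration needs two inner products over the whole grid in addition to the stencil product, so it involves more passes over the data per iteration than a Jacobi or Gauss-Seidel sweep.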
Jacobi performance
Gauss-Seidel performance
Conjugate gradient performance
Conclusions
The CUDA-based implementation of the Gauss-Seidel solver allows more iterations than the CPU-based implementation; however, it converges two times more slowly.
The CUDA-based implementations of the Jacobi and Gauss-Seidel iterative solvers achieved better performance (i.e., shorter processing time) than the CPU-based implementations.
The CUDA-based implementation of the conjugate gradient performs worse than the CPU-based version for grid sizes larger than $64^3$, due to global memory latency.
Future Work
Investigate ways, implementable using CUDA, to reduce global memory accesses (e.g., data structures, dynamic memory, etc.).
Implement CPU-based multi-core versions of the solvers and compare their performance with the CUDA-based versions.
Search for new solvers, implementable using CUDA, with a better convergence rate than relaxation techniques (Jacobi and Gauss-Seidel) and with no significant extra computational effort, such as the conjugate gradient.
Questions???

ICISA 2010 Conference Presentation
