SlideShare une entreprise Scribd logo
1  sur  16
Télécharger pour lire hors ligne
Introduction
What is GPU?
• It is a processor optimized for 2D/3D graphics, video,
visual computing, and display.
• It is highly parallel, highly multithreaded multiprocessor
optimized for visual computing.
• It provide real-time visual interaction with
computed objects via graphics images, and
video.
• It serves as both a programmable graphics
processor and a scalable parallel computing
platform.
GPU Evolution
• 1980’s – No GPU. PC used VGA controller
• 1990’s – Add more function into VGA controller
• 1997 – 3D acceleration functions:

Hardware for triangle setup and rasterization
Texture mapping
Shading
• 2000 – A single chip graphics processor ( beginning of GPU
term)
• 2005 – Massively parallel programmable processors
• 2007 – CUDA (Compute Unified Device Architecture)
GPU Graphic Trends
• OpenGL – an open standard for 3D programming
• DirectX – a series of Microsoft multimedia programming
•
•
•
•
•
•
•

interfaces
New GPU are being developed every 12 to 18 months
New idea of visual computing:
combines graphics processing and parallel computing
Heterogeneous System – CPU + GPU
GPU evolves into scalable parallel processor
GPU Computing: GPGPU and CUDA
GPU unifies graphics and computing
GPU visual computing application: OpenGL, and DirectX
GPU System Architectures
• CPU-GPU system architecture

– The Historical PC
– contemporary PC with Intel and AMD CPUs
• Graphics Logical Pipeline
• Basic Unified GPU Architecture
– Processor Array
Historical PC

FIGURE A.2.1 Historical PC. VGA controller drives graphics display from framebuffer memory. Copyright © 2009
Elsevier, Inc. All rights reserved.
Intel and AMD CPU

FIGURE A.2.2 Contemporary PCs with Intel and AMD CPUs. See Chapter 6 for an explanation of the components and
interconnects in this figure. Copyright © 2009 Elsevier
Graphics Logical Pipeline

FIGURE A.2.3 Graphics logical pipeline. Programmable graphics shader stages are blue, and fixed-function blocks are white.
Copyright © 2009 Elsevier, Inc. All rights reserved.
Basic Unified GPU Architecture

FIGURE A.2.4 Logical pipeline mapped to physical processors. The programmable shader stages execute on the
array of unified processors, and the logical graphics pipeline dataflow recirculates through the processors. Copyright ©
2009 Elsevier, Inc. All rights reserved.
Processor Array

FIGURE A.2.5 Basic unified GPU architecture. Example GPU with 112 streaming processor (SP) cores organized in
14 streaming multiprocessors (SMs); the cores are highly multithreaded. It has the basic Tesla architecture of an NVIDIA
GeForce 8800. The processors connect with four 64-bit-wide DRAM partitions via an interconnection network. Each SM
has eight SP cores, two special function units (SFUs), instruction and constant caches, a multithreaded instruction unit,
and a shared memory. Copyright © 2009 Elsevier, Inc. All rights reserved.
Compare CPU and GPU
Nemo-3D
• Written by the CalTech Jet Propulsion Laboratory
• NEMO-3D simulates quantum phenomena.
• These models require a lot of matrix operations on very large
matrices.
• We are modifying the matrix operation functions so they use
CUDA instead of that slow CPU.
Nemo-3D
Simulation

NEMO-3D
Computation
Module
CUDA
kernel

Visualization

VolQD
Testing - Matrices
• Test the multiplication of two matrices.
• Creates two matrices with random floating point values.
• We tested with matrices of various dimensions…
Results:
DimTime
64x64

CUDA

CPU
0.417465 ms

18.0876 ms

128x128

0.41691 ms

18.3007 ms

256x256

2.146367 ms

145.6302 ms

512x512

8.093004 ms

1494.7275 ms

768x768

25.97624 ms

4866.3246 ms

1024x1024

52.42811 ms

66097.1688 ms

2048x2048

407.648 ms Didn’t finish

4096x4096

3.1 seconds Didn’t finish
In visible terms:
70000

CUDA

CPU

CPU regression

CPU versus GPU

CUDA regression
0.0085x

y = 10.682e
R2 = 0.9813

Execution time (ms)

60000
50000
40000
30000
20000

0.0053x

y = 0.3526e
2
R = 0.9575

10000
0
0

200

400

600

800

Matrix side dimension

1000

1200
Test results:
Function Execute Time
500

y = 0.0228x - 0.5522
R2 = 1

450

Milliseconds

400
350

CUDA

300

CPU

250

CPU trendline

200

CUDA trendline

150
100

y = 0.0015x - 0.3449
R2 = 0.9996

50
0
0

5000

10000

15000

Number of Atoms

20000

25000

Contenu connexe

Tendances

Tendances (20)

Graphic Processing Unit (GPU)
Graphic Processing Unit (GPU)Graphic Processing Unit (GPU)
Graphic Processing Unit (GPU)
 
Graphics processing unit
Graphics processing unitGraphics processing unit
Graphics processing unit
 
GPU power consumption and performance trends
GPU power consumption and performance trendsGPU power consumption and performance trends
GPU power consumption and performance trends
 
GPU - An Introduction
GPU - An IntroductionGPU - An Introduction
GPU - An Introduction
 
Graphics Processing Unit by Saurabh
Graphics Processing Unit by SaurabhGraphics Processing Unit by Saurabh
Graphics Processing Unit by Saurabh
 
Introduction to GPU Programming
Introduction to GPU ProgrammingIntroduction to GPU Programming
Introduction to GPU Programming
 
Graphics processing unit (gpu)
Graphics processing unit (gpu)Graphics processing unit (gpu)
Graphics processing unit (gpu)
 
graphics processing unit ppt
graphics processing unit pptgraphics processing unit ppt
graphics processing unit ppt
 
Gpu databases
Gpu databasesGpu databases
Gpu databases
 
Nvidia (History, GPU Architecture and New Pascal Architecture)
Nvidia (History, GPU Architecture and New Pascal Architecture)Nvidia (History, GPU Architecture and New Pascal Architecture)
Nvidia (History, GPU Architecture and New Pascal Architecture)
 
Parallel Computing on the GPU
Parallel Computing on the GPUParallel Computing on the GPU
Parallel Computing on the GPU
 
GRAPHICS PROCESSING UNIT (GPU)
GRAPHICS PROCESSING UNIT (GPU)GRAPHICS PROCESSING UNIT (GPU)
GRAPHICS PROCESSING UNIT (GPU)
 
GPU Computing: A brief overview
GPU Computing: A brief overviewGPU Computing: A brief overview
GPU Computing: A brief overview
 
Lec04 gpu architecture
Lec04 gpu architectureLec04 gpu architecture
Lec04 gpu architecture
 
Gpu
GpuGpu
Gpu
 
Example Application of GPU
Example Application of GPUExample Application of GPU
Example Application of GPU
 
Gpu and The Brick Wall
Gpu and The Brick WallGpu and The Brick Wall
Gpu and The Brick Wall
 
Gpu Systems
Gpu SystemsGpu Systems
Gpu Systems
 
GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)
 
GPU Computing
GPU ComputingGPU Computing
GPU Computing
 

En vedette

Gpu with cuda architecture
Gpu with cuda architectureGpu with cuda architecture
Gpu with cuda architectureDhaval Kaneria
 
CPU vs. GPU presentation
CPU vs. GPU presentationCPU vs. GPU presentation
CPU vs. GPU presentationVishal Singh
 
Low power vlsi design ppt
Low power vlsi design pptLow power vlsi design ppt
Low power vlsi design pptAnil Yadav
 
LinkedIn powerpoint
LinkedIn powerpointLinkedIn powerpoint
LinkedIn powerpointguest2137df
 

En vedette (7)

Gpu
GpuGpu
Gpu
 
Gpu with cuda architecture
Gpu with cuda architectureGpu with cuda architecture
Gpu with cuda architecture
 
CPU vs. GPU presentation
CPU vs. GPU presentationCPU vs. GPU presentation
CPU vs. GPU presentation
 
Low Power VLSI Design
Low Power VLSI DesignLow Power VLSI Design
Low Power VLSI Design
 
Low power vlsi design ppt
Low power vlsi design pptLow power vlsi design ppt
Low power vlsi design ppt
 
LinkedIn powerpoint
LinkedIn powerpointLinkedIn powerpoint
LinkedIn powerpoint
 
Low Power Techniques
Low Power TechniquesLow Power Techniques
Low Power Techniques
 

Similaire à 19564926 graphics-processing-unit

The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computingbakers84
 
Stream Processing
Stream ProcessingStream Processing
Stream Processingarnamoy10
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
10. GPU - Video Card (Display, Graphics, VGA)
10. GPU - Video Card (Display, Graphics, VGA)10. GPU - Video Card (Display, Graphics, VGA)
10. GPU - Video Card (Display, Graphics, VGA)Akhila Dakshina
 
Volume 2-issue-6-2040-2045
Volume 2-issue-6-2040-2045Volume 2-issue-6-2040-2045
Volume 2-issue-6-2040-2045Editor IJARCET
 
Volume 2-issue-6-2040-2045
Volume 2-issue-6-2040-2045Volume 2-issue-6-2040-2045
Volume 2-issue-6-2040-2045Editor IJARCET
 
Throughput oriented aarchitectures
Throughput oriented aarchitecturesThroughput oriented aarchitectures
Throughput oriented aarchitecturesNomy059
 
lecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxlecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxssuser413a98
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)Kohei KaiGai
 
GPU Renderfarm with Integrated Asset Management & Production System (AMPS)
GPU Renderfarm with Integrated Asset Management & Production System (AMPS)GPU Renderfarm with Integrated Asset Management & Production System (AMPS)
GPU Renderfarm with Integrated Asset Management & Production System (AMPS)Budianto Tandianus
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Lablup Inc.
 
Gpu computing-webgl
Gpu computing-webglGpu computing-webgl
Gpu computing-webglVisCircle
 
“Is Your AI Data Pre-processing Fast Enough? Speed It Up Using rocAL,” a Pres...
“Is Your AI Data Pre-processing Fast Enough? Speed It Up Using rocAL,” a Pres...“Is Your AI Data Pre-processing Fast Enough? Speed It Up Using rocAL,” a Pres...
“Is Your AI Data Pre-processing Fast Enough? Speed It Up Using rocAL,” a Pres...Edge AI and Vision Alliance
 
GPU enablement for data science on OpenShift | DevNation Tech Talk
GPU enablement for data science on OpenShift | DevNation Tech TalkGPU enablement for data science on OpenShift | DevNation Tech Talk
GPU enablement for data science on OpenShift | DevNation Tech TalkRed Hat Developers
 
Monte Carlo on GPUs
Monte Carlo on GPUsMonte Carlo on GPUs
Monte Carlo on GPUsfcassier
 
Introduction to GPUs for Machine Learning
Introduction to GPUs for Machine LearningIntroduction to GPUs for Machine Learning
Introduction to GPUs for Machine LearningSri Ambati
 

Similaire à 19564926 graphics-processing-unit (20)

The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computing
 
Stream Processing
Stream ProcessingStream Processing
Stream Processing
 
GPU Programming with Java
GPU Programming with JavaGPU Programming with Java
GPU Programming with Java
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
10. GPU - Video Card (Display, Graphics, VGA)
10. GPU - Video Card (Display, Graphics, VGA)10. GPU - Video Card (Display, Graphics, VGA)
10. GPU - Video Card (Display, Graphics, VGA)
 
Volume 2-issue-6-2040-2045
Volume 2-issue-6-2040-2045Volume 2-issue-6-2040-2045
Volume 2-issue-6-2040-2045
 
Volume 2-issue-6-2040-2045
Volume 2-issue-6-2040-2045Volume 2-issue-6-2040-2045
Volume 2-issue-6-2040-2045
 
Throughput oriented aarchitectures
Throughput oriented aarchitecturesThroughput oriented aarchitectures
Throughput oriented aarchitectures
 
lecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxlecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptx
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)
 
GPU Renderfarm with Integrated Asset Management & Production System (AMPS)
GPU Renderfarm with Integrated Asset Management & Production System (AMPS)GPU Renderfarm with Integrated Asset Management & Production System (AMPS)
GPU Renderfarm with Integrated Asset Management & Production System (AMPS)
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
 
Gpu computing-webgl
Gpu computing-webglGpu computing-webgl
Gpu computing-webgl
 
“Is Your AI Data Pre-processing Fast Enough? Speed It Up Using rocAL,” a Pres...
“Is Your AI Data Pre-processing Fast Enough? Speed It Up Using rocAL,” a Pres...“Is Your AI Data Pre-processing Fast Enough? Speed It Up Using rocAL,” a Pres...
“Is Your AI Data Pre-processing Fast Enough? Speed It Up Using rocAL,” a Pres...
 
GPU enablement for data science on OpenShift | DevNation Tech Talk
GPU enablement for data science on OpenShift | DevNation Tech TalkGPU enablement for data science on OpenShift | DevNation Tech Talk
GPU enablement for data science on OpenShift | DevNation Tech Talk
 
Monte Carlo on GPUs
Monte Carlo on GPUsMonte Carlo on GPUs
Monte Carlo on GPUs
 
Introduction to GPUs for Machine Learning
Introduction to GPUs for Machine LearningIntroduction to GPUs for Machine Learning
Introduction to GPUs for Machine Learning
 

Dernier

Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibitjbellavia9
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxDr. Sarita Anand
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Pooja Bhuva
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - Englishneillewis46
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxRamakrishna Reddy Bijjam
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the ClassroomPooky Knightsmith
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structuredhanjurrannsibayan2
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfDr Vijay Vishwakarma
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxDr. Ravikiran H M Gowda
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxJisc
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseAnaAcapella
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentationcamerronhm
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxJisc
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Association for Project Management
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 

Dernier (20)

Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 

19564926 graphics-processing-unit

  • 1.
  • 2. Introduction What is GPU? • It is a processor optimized for 2D/3D graphics, video, visual computing, and display. • It is highly parallel, highly multithreaded multiprocessor optimized for visual computing. • It provide real-time visual interaction with computed objects via graphics images, and video. • It serves as both a programmable graphics processor and a scalable parallel computing platform.
  • 3. GPU Evolution • 1980’s – No GPU. PC used VGA controller • 1990’s – Add more function into VGA controller • 1997 – 3D acceleration functions: Hardware for triangle setup and rasterization Texture mapping Shading • 2000 – A single chip graphics processor ( beginning of GPU term) • 2005 – Massively parallel programmable processors • 2007 – CUDA (Compute Unified Device Architecture)
  • 4. GPU Graphic Trends • OpenGL – an open standard for 3D programming • DirectX – a series of Microsoft multimedia programming • • • • • • • interfaces New GPU are being developed every 12 to 18 months New idea of visual computing: combines graphics processing and parallel computing Heterogeneous System – CPU + GPU GPU evolves into scalable parallel processor GPU Computing: GPGPU and CUDA GPU unifies graphics and computing GPU visual computing application: OpenGL, and DirectX
  • 5. GPU System Architectures • CPU-GPU system architecture – The Historical PC – contemporary PC with Intel and AMD CPUs • Graphics Logical Pipeline • Basic Unified GPU Architecture – Processor Array
  • 6. Historical PC FIGURE A.2.1 Historical PC. VGA controller drives graphics display from framebuffer memory. Copyright © 2009 Elsevier, Inc. All rights reserved.
  • 7. Intel and AMD CPU FIGURE A.2.2 Contemporary PCs with Intel and AMD CPUs. See Chapter 6 for an explanation of the components and interconnects in this figure. Copyright © 2009 Elsevier
  • 8. Graphics Logical Pipeline FIGURE A.2.3 Graphics logical pipeline. Programmable graphics shader stages are blue, and fixed-function blocks are white. Copyright © 2009 Elsevier, Inc. All rights reserved.
  • 9. Basic Unified GPU Architecture FIGURE A.2.4 Logical pipeline mapped to physical processors. The programmable shader stages execute on the array of unified processors, and the logical graphics pipeline dataflow recirculates through the processors. Copyright © 2009 Elsevier, Inc. All rights reserved.
  • 10. Processor Array FIGURE A.2.5 Basic unified GPU architecture. Example GPU with 112 streaming processor (SP) cores organized in 14 streaming multiprocessors (SMs); the cores are highly multithreaded. It has the basic Tesla architecture of an NVIDIA GeForce 8800. The processors connect with four 64-bit-wide DRAM partitions via an interconnection network. Each SM has eight SP cores, two special function units (SFUs), instruction and constant caches, a multithreaded instruction unit, and a shared memory. Copyright © 2009 Elsevier, Inc. All rights reserved.
  • 11. Compare CPU and GPU Nemo-3D • Written by the CalTech Jet Propulsion Laboratory • NEMO-3D simulates quantum phenomena. • These models require a lot of matrix operations on very large matrices. • We are modifying the matrix operation functions so they use CUDA instead of that slow CPU.
  • 13. Testing - Matrices • Test the multiplication of two matrices. • Creates two matrices with random floating point values. • We tested with matrices of various dimensions…
  • 14. Results: DimTime 64x64 CUDA CPU 0.417465 ms 18.0876 ms 128x128 0.41691 ms 18.3007 ms 256x256 2.146367 ms 145.6302 ms 512x512 8.093004 ms 1494.7275 ms 768x768 25.97624 ms 4866.3246 ms 1024x1024 52.42811 ms 66097.1688 ms 2048x2048 407.648 ms Didn’t finish 4096x4096 3.1 seconds Didn’t finish
  • 15. In visible terms: 70000 CUDA CPU CPU regression CPU versus GPU CUDA regression 0.0085x y = 10.682e R2 = 0.9813 Execution time (ms) 60000 50000 40000 30000 20000 0.0053x y = 0.3526e 2 R = 0.9575 10000 0 0 200 400 600 800 Matrix side dimension 1000 1200
  • 16. Test results: Function Execute Time 500 y = 0.0228x - 0.5522 R2 = 1 450 Milliseconds 400 350 CUDA 300 CPU 250 CPU trendline 200 CUDA trendline 150 100 y = 0.0015x - 0.3449 R2 = 0.9996 50 0 0 5000 10000 15000 Number of Atoms 20000 25000