Multicore Processing
Wu, Lieh-Hao
VMI Class 2010, Computer Science
What’s Multicore?
Multiple cores on a single chip
Improves performance by adding cores
Became mainstream in recent years
    - Examples
        - Intel: Core 2 Duo, Core 2 Quad, Core i3/i5/i7
        - AMD: Athlon II X2, Phenom II X4, Opteron
        - IBM: Cell Broadband Engine
Why Multicore?
The difficulties of single-core processor development
    - Overheating
    - Energy consumption
    - Electron leakage
    - Example
        - Intel abandoned its 4 GHz processor project in fall 2004
Multicore processors address these problems and offer better performance
Research Introduction
Purpose:
    - To measure the performance difference between single-core and multicore processors
How:
    - Use the PS3 as the host machine
    - Use the PS3's CPU to execute a series of matrix multiplications
        - Execute with a single core
        - Execute with multiple cores
            - Programming tools are needed to handle the cores
    - Record the time and analyze the performance
PlayStation 3
Physical components
    - CPU: Cell Broadband Engine
    - Memory: 256 MB
    - Storage: 80 GB
Software
    - Yellow Dog Linux
    - Cell SDK
Cell Broadband Engine Processor
Developed jointly by Sony, Toshiba, and IBM.
Multicore structure
    - Power Processing Element (PPE) x 1
        - Like a traditional processor
        - Has its own L1 and L2 caches
    - Synergistic Processing Element (SPE) x 8
        - Can be used synchronously
        - Each has 256 KB of local storage
Matrix Multiplication
Simple but time-consuming
Some assumptions are made for research purposes
    - Dimension is set to N² (square N × N matrices)
    - Data type is set to double
    - Only even numbers are used
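For orientation, the single-core baseline can be sketched as the textbook triple loop below. This is only an illustrative sketch, not the code used in the research: the function name `matmul_naive` and the flat row-major layout are assumptions. It shows why the workload is simple but time-consuming: an N × N product takes N³ multiply-add steps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical baseline: C = A * B for square n x n matrices of doubles,
   stored row-major in flat arrays. Performs n^3 multiply-add steps. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}
```

Doubling N multiplies the work by eight, which is why large dimensions make a useful stress test.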
Sequoia
Mapping a tree structure as a memory hierarchy
Basic idea
    - Consists of three functions
        - task<inner>: distribute
        - task<leaf>: compute
        - task<ext>: connect
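The split between an inner task that distributes work and a leaf task that computes can be illustrated in plain C. The sketch below is not Sequoia syntax and not the matrixmult.sq program; `mm_inner`, `mm_leaf`, and `LEAF_DIM` are hypothetical names, and nothing is actually copied into SPE local storage here. It assumes the caller zeroes C first, and it recurses only while the dimension is even, reading the slide's even-number restriction as a restriction on dimensions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile edge: three 64 x 64 tiles of doubles take 96 KB,
   which would fit in a 256 KB local store. */
#define LEAF_DIM 64

/* "Leaf" role: compute on one small tile, C += A * B.
   ld is the row stride of the full matrices. */
static void mm_leaf(size_t n, size_t ld,
                    const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t k = 0; k < n; k++) {
            double aik = a[i * ld + k];
            for (size_t j = 0; j < n; j++) {
                c[i * ld + j] += aik * b[k * ld + j];
            }
        }
    }
}

/* "Inner" role: split the problem into quadrants and hand each
   sub-product down until the tile is small enough for a leaf. */
void mm_inner(size_t n, size_t ld,
              const double *a, const double *b, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    if (n <= LEAF_DIM || n % 2 != 0) {
        mm_leaf(n, ld, a, b, c);
        return;
    }
    size_t h = n / 2;
    for (size_t bi = 0; bi < 2; bi++) {
        for (size_t bj = 0; bj < 2; bj++) {
            for (size_t bk = 0; bk < 2; bk++) {
                mm_inner(h, ld,
                         a + (bi * h) * ld + bk * h,
                         b + (bk * h) * ld + bj * h,
                         c + (bi * h) * ld + bj * h);
            }
        }
    }
}
```

In a real memory-hierarchy mapping, each level of this recursion would correspond to one level of the tree, with the leaf running on data already resident in the fastest memory.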
Programming in Sequoia
To program in Sequoia, four files are required to run the matrix multiplication.
    - “Makefile”: for compiling
    - “matrixmult.sq”: Sequoia program
    - “mapping_ps3.xml”: for mapping
    - “main.cc”: for starting
During the process
    - Good documentation
    - Good adaptability for different purposes
    - Details need to be handled by programmers
Cellgen
An implicit multicore programming model
C/C++ based programming tool
OpenMP-like style
    - OpenMP API
Basic idea
    - Starts after “#pragma cell”
    - Parameters
        - public: shared by SPEs
        - private: each SPE has a copy
Author: Scott Schneider, Ph.D. candidate, Virginia Tech
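Since Cellgen is described as OpenMP-like, the shared versus per-worker distinction can be illustrated with standard OpenMP in C. This is an analogy only: it uses `#pragma omp`, not `#pragma cell`, and it does not reproduce Cellgen's actual syntax. The function name `matmul_omp` is hypothetical. OpenMP's `shared` clause plays the role the slide gives to public, and `private` gives each thread its own copy.

```c
#include <assert.h>
#include <stddef.h>

/* OpenMP analogy (not Cellgen): rows of C are divided among threads.
   Build with an OpenMP flag such as -fopenmp; without it the pragma is
   ignored and the loop simply runs serially. */
void matmul_omp(long n, const double *a, const double *b, double *c)
{
    long i, j, k;
    double sum;

    assert(n >= 0 && a != NULL && b != NULL && c != NULL);

    /* a, b, c, n are visible to every thread; j, k, sum are per-thread. */
    #pragma omp parallel for shared(a, b, c, n) private(j, k, sum)
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            sum = 0.0;
            for (k = 0; k < n; k++) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}
```

The appeal of this style is that the loop body stays ordinary C; the directive alone describes how the data is shared.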
Programming in Cellgen
Four files are needed to run the matrix multiplication.
    - Two “Makefile” files: for compiling
    - One “matrixmult.cellgen”: Cellgen code
    - One “double16b_t.h”: for padding column data
        - Suggested by the author to improve performance
During the process
    - Understandable
        - C/C++ based; easy to pick up.
    - Lack of documentation
        - Only a “Readme” file is available.
Result in Table
The following table shows the execution time for PPE only, SPE with Sequoia, and SPE with Cellgen.
    - Oversized matrices are swapped between disk and main memory.
    - Either no response or a bus error.
Result in Graph (1)
The following line chart is generated from the data in the table.
[Line chart. Series: PPE Only, Cellgen, Sequoia. Annotation: Memory size limit]
Result in Graph (2)
Result Analysis
Performance of Cellgen
    - Unexpected overhead or runtime errors may occur and set performance back.
Performance of Sequoia
    - According to the stable records, it is about 8 times faster than the PPE alone.
    - Although the memory size is 256 MB, performance starts dropping after 2048².
    - Performance becomes the same as the PPE's after reaching 4096².
        - Probably most of the data is swapped to disk, which is beyond Sequoia's control.
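A rough footprint estimate helps make the memory limit concrete. The sketch below assumes three resident N × N matrices of 8-byte doubles (A, B, and C) and ignores the operating system, program code, and any buffers the tools allocate; the function names are hypothetical. Under those assumptions, 2048² needs 96 MiB and 4096² needs 384 MiB, so the larger case cannot fit in 256 MB and the working set would cross the limit somewhere around N ≈ 3,300.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to hold A, B and C when each is n x n doubles.
   Assumes sizeof(double) == 8, as on common platforms. */
uint64_t matmul_footprint_bytes(uint64_t n)
{
    return 3u * n * n * (uint64_t)sizeof(double);
}

/* Nonzero when the three matrices alone fit in ram_bytes of memory. */
int matmul_fits(uint64_t n, uint64_t ram_bytes)
{
    assert(ram_bytes > 0);
    return matmul_footprint_bytes(n) <= ram_bytes;
}
```

Because the real system also needs memory for Linux and the program itself, the practical limit would be lower than this estimate suggests.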
Conclusion
Multicore processors perform better than single-core processors: about 8 times faster if the memory space is sufficient.
Multicore may also incur unexpected overhead or errors, which can set performance back, as I saw with Cellgen.
Multicore processing is an art.
    - In the paper “Programming Multiprocessors With Explicitly Managed Memory Hierarchies,” Cellgen performs better than Sequoia. However, Cellgen does not do as well as Sequoia in this research.

Reference
http://elhabib.at/files/2008/07/yellowdog-vorlage_p1.jpg
http://scawley.files.wordpress.com/2008/03/sony_playstation_3_60gb_game_console__brand_new.jpg
http://www.5ilight.com/dianzi/upimg/20070222/11H154H0L05A08.jpg
http://moss.csc.ncsu.edu/~mueller/cluster/ps3/cell.jpg
http://upload.wikimedia.org/wikipedia/en/thumb/e/eb/Matrix_multiplication_diagram_2.svg/313px-Matrix_multiplication_diagram_2.svg.png
http://www.stanford.edu/group/sequoia/cgi-bin/node/182
http://www.stanford.edu/group/sequoia/cgi-bin/
http://openmp.org/wp/about-openmp/
http://people.cs.vt.edu/~scschnei/pictures/scott.jpg
http://openmp.org/wp/openmp_336x120.gif
http://www.ibm.com/developerworks/power/library/pa-cellperf/
http://github.com/scotts/cellgen/
http://en.wikipedia.org/wiki/Cell_(microprocessor)
Ramanathan, R. M. “Intel® Multi-Core Processors: Making the Move to Quad-Core and Beyond.” Intel Multi-Core Processors, p. 3. 15 November 2008. http://www.intel.com/technology/architecture/downloads/quad-core-06.pdf
Sutter, Herb. “The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software.” Dr. Dobb’s Journal, 30(3), March 2005. http://lyle.smu.edu/~coyle/cse8313/handouts/Free.lunch.over.pdf
Linklater, Martin. “Optimizing Cell Code.” Game Developer Magazine, April 2007, pp. 15–18. “To increase fabrication yields, Sony ships PlayStation 3 Cell processors with only seven working SPEs. And from those seven, one SPE will be used by the operating system for various tasks. This leaves six SPEs for game programmer to use.”
Schneider, Scott, Jae-Seung Yeom, and Dimitrios S. Nikolopoulos. “Programming Multiprocessors With Explicitly Managed Memory Hierarchies.” IEEE Computer, December 2009.

Editor's notes

  1. Title page
  2. Main introduction to what multicore is. Use the old English paper as a reference!
  3. The reasons why CPU manufacturers changed from single core to multicore.
  4. Research intro. Talk about what my purpose is, how I test and justify my results, and which application, programming models, and host machine I will use. (PS. Application -> large-dimension matrix; Programming models -> Sequoia, Cellgen; Host machine -> PS3)
  5. Introduce the PS3 with its memory size, CPU, and the OS we use. (PS. Main memory -> 256MB; CPU -> Cell Broadband Engine; OS -> Yellow Dog Linux) Don’t forget to mention that the Cell SDK is necessary for developing on the Cell CPU! Picture for YDL: http://elhabib.at/files/2008/07/yellowdog-vorlage_p1.jpg Picture: http://scawley.files.wordpress.com/2008/03/sony_playstation_3_60gb_game_console__brand_new.jpg
  6. Introduction of the Cell Broadband Engine multicore processor. PPE L1 cache -> 32KB; L2 cache -> 512KB. Structure picture -> http://www.5ilight.com/dianzi/upimg/20070222/11H154H0L05A08.jpg Picture -> http://moss.csc.ncsu.edu/~mueller/cluster/ps3/cell.jpg
  7. The application is a series of matrix multiplications. Also, give the reason why I chose matrix multiplication as my application. Picture: http://upload.wikimedia.org/wikipedia/en/thumb/e/eb/Matrix_multiplication_diagram_2.svg/313px-Matrix_multiplication_diagram_2.svg.png
  8. Brief introduction to Sequoia. Major points -> explicit local storage management, mapping a tree structure as a memory hierarchy, and major programming points (inner, leaf, and ext tasks). Picture tree structure: http://www.stanford.edu/group/sequoia/cgi-bin/node/182 Sequoia logo: http://www.stanford.edu/group/sequoia/cgi-bin/
  9. Connect to the matrixmult.sq, matrixmult_ps3_mapping.xml, and main.cc files here and explain briefly. Then, talk about how I felt during the process. (Basic idea -> good documentation and adaptability for different purposes, but the programmer has to handle much more detail!) Also remind the audience that five files are required to use Sequoia: two Makefile files (for compiling), xxx.sq code (main program), xxx.xml (for mapping), main.cc (for execution)
  10. Introduction to Cellgen. Major points: C/C++ based software tool, implicit local storage management, OpenMP-like support. (PS. OpenMP needs to be explained -> brief oral explanation; use http://openmp.org/wp/about-openmp/ as a reference!!) Author photo: http://people.cs.vt.edu/~scschnei/pictures/scott.jpg OpenMP logo: http://openmp.org/wp/openmp_336x120.gif
  11. Put the matrixmult.cellgen and double16b_t.h files here and explain briefly. Also mention the lack of documentation and the problem the author described.
  12. Put the overall result table here and “explain”. Do not say too much here; the analysis will be left for later slides!
  13. Put the result graph here and “explain”. Do not say too much here.
  14. The line chart about
  15. Just leave the “important” partial data here and explain my analysis. Major points: how fast Sequoia can get, the limit that the physical main memory of only 256MB places on Sequoia and Cellgen, and the unexpectedly poor performance of Cellgen (maybe some potential overhead sets back the overall performance).
  16. Multicore processing is an art!
  17. Reference list: from the old EN paper, from the OpenMP website, from the Cellgen author's websites, from the Sequoia websites, the IEEE magazine.
  18. Be prepared for questions! http://www.cmoe.com/blog/wp-content/images/question-mark.jpg