5. CUDA Programming
MULTICORE vs MANY-CORE
CPU
• Based on a pipeline philosophy
• Many structures dedicated to cache and control
• More flexible
• MIMD – task parallelism
• Latency sensitive
GPU
• Built for large amounts of parallel data
• Fewer structures dedicated to cache and control
• Less flexible
• SIMD – data parallelism
• Latency tolerant
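The contrast between task-oriented CPU code and data-parallel GPU code can be sketched with a single element-wise operation. The example below is only an illustration, not part of the original slides: the names `scale_add_cpu`, `scale_add_gpu`, and `gpu_matches_cpu`, the block size of 256, and the input values are all assumptions. The CPU version walks the array in one thread, while the GPU version assigns one thread to each element.

```cuda
#include <cuda_runtime.h>
#include <cmath>
#include <vector>

// CPU version: a single thread visits every element in sequence.
void scale_add_cpu(const float* x, const float* y, float* out, float a, int n) {
    for (int i = 0; i < n; ++i) out[i] = a * x[i] + y[i];
}

// GPU version: every thread runs the same instruction on a different element.
__global__ void scale_add_gpu(const float* x, const float* y, float* out, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global element index
    if (i < n) out[i] = a * x[i] + y[i];            // guard the last, partial block
}

// Hypothetical helper: runs both versions and reports whether the results agree.
bool gpu_matches_cpu(int n) {
    std::vector<float> x(n), y(n), cpu(n), gpu(n);
    for (int i = 0; i < n; ++i) { x[i] = 0.5f * i; y[i] = 1.0f; }
    scale_add_cpu(x.data(), y.data(), cpu.data(), 2.0f, n);

    float *dx = nullptr, *dy = nullptr, *dout = nullptr;
    size_t bytes = n * sizeof(float);
    bool ok = cudaMalloc(&dx, bytes) == cudaSuccess
           && cudaMalloc(&dy, bytes) == cudaSuccess
           && cudaMalloc(&dout, bytes) == cudaSuccess
           && cudaMemcpy(dx, x.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(dy, y.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int threads = 256;                          // assumed block size
        int blocks = (n + threads - 1) / threads;   // enough blocks to cover n
        scale_add_gpu<<<blocks, threads>>>(dx, dy, dout, 2.0f, n);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(gpu.data(), dout, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dx); cudaFree(dy); cudaFree(dout);     // freeing a null pointer is a no-op

    for (int i = 0; ok && i < n; ++i) ok = std::fabs(cpu[i] - gpu[i]) <= 1e-3f;
    return ok;
}
```

On a machine with a CUDA-capable GPU, calling `gpu_matches_cpu(1 << 16)` from `main` should return `true`. The kernel has no loop because the grid of threads covers the whole array.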
PUCRS - PROGRAMA DE PÓS-GRADUAÇÃO EM CIÊNCIA DA COMPUTAÇÃO
12. CUDA Programming
CUDA PROGRAMMING MODEL
• Basic unit – kernel
• Synchronous or asynchronous
• Thread array
• Array / matrix / cube (1D, 2D, or 3D)
[Figure: thread array indexed by thread ID 0, 1, 2, 3, 4, …]
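The points above (a kernel as the basic unit, threads arranged as an array or matrix, and asynchronous execution) can be sketched as follows. This is a minimal illustration under stated assumptions: the names `write_thread_ids` and `ids_match_layout` and the 16 x 16 block shape are hypothetical choices, not taken from the slides. Each thread derives its own ID from its 2D position and stores it, and the host then checks the result.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Kernel: each thread computes its global 2D position and stores a linear thread ID.
__global__ void write_thread_ids(int* out, int width, int height) {
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (col < width && row < height) {
        out[row * width + col] = row * width + col;  // row-major linear ID
    }
}

// Hypothetical helper: launches the kernel over a width x height matrix of threads
// and verifies that element i holds the value i.
bool ids_match_layout(int width, int height) {
    int n = width * height;
    std::vector<int> host(n, -1);
    int* dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(int)) != cudaSuccess) return false;

    dim3 block(16, 16);                              // assumed 2D block shape
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);     // enough blocks to cover the matrix

    // The launch returns control to the host immediately (asynchronous).
    write_thread_ids<<<grid, block>>>(dev, width, height);
    bool ok = cudaGetLastError() == cudaSuccess;

    // Explicit synchronization: the host waits here until the kernel has finished.
    ok = ok && cudaDeviceSynchronize() == cudaSuccess;
    ok = ok && cudaMemcpy(host.data(), dev, n * sizeof(int),
                          cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(dev);

    for (int i = 0; ok && i < n; ++i) ok = (host[i] == i);
    return ok;
}
```

Calling `ids_match_layout(40, 25)` from `main` should return `true` when a CUDA device is available. A 1D array or 3D cube of threads follows the same pattern, using only the `x` component or adding the `z` component of `blockIdx`, `blockDim`, and `threadIdx`.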