Technology Consulting and Engineering




                                   GPGPU Programming
                                    on Android devices
                                             Alten droidconNL 2012




   ALTEN | 11/22/12
Welcome


•   Alten PTS: a leading service provider in the field of technical
    consultancy and engineering
      Eindhoven, Capelle aan de IJssel and Apeldoorn




•   ir. Arjan Somers




     Alten-droidconNL 2012                                     Slide 2
Goals




●What is GPGPU?
●How is it done on current Android devices?

●When is GPGPU programming useful?




Contents




●What
●Why

●How

●When

●Example




What




●GPGPU programming is using the GPU to
perform general-purpose calculations
 ● Data manipulation using the graphics card




What


●A little bit of history

 ● GPUs and OpenGL

 ● Parallel vector-based operations

 ● Programmable (shaders)




       Battlezone (1980)    Crysis 3 (2013)

What: parallel processing


●Parallel vector-based operations

 ● Process vertices / pixels

 ● Independent/parallel processing




What: programmable GPUs


●Displacement mapping

 ● Requires control over vertex and fragment processing




What: programmable GPUs


●OpenGL pipeline

 [Diagram: OpenGL pipeline; label: Shaders]

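What: programmable GPUs


 ●Shaders

Vertex:

void main(void)
{
    // gl_Vertex is a desktop GLSL built-in; OpenGL ES 2.0 uses a declared attribute instead
    gl_Position = gl_Vertex;
}

Fragment:

void main(void)
{
    // Every fragment is coloured opaque green
    gl_FragColor = vec4(0.0, 1.0, 0.0, 1.0);
}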




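What: programmable GPUs


●A fragment shader



    void main(void)
    {
        // Position within a 50x50 pixel tile, relative to the tile centre
        vec2 pos = mod(gl_FragCoord.xy, vec2(50.0)) - vec2(25.0);
        float dist_squared = dot(pos, pos);

        // Light grey inside a circle of radius 20 (20^2 = 400), dark blue outside
        gl_FragColor = (dist_squared < 400.0)
            ? vec4(.90, .90, .90, 1.0)
            : vec4(.20, .20, .40, 1.0);
    }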




What: General calculations


●Simple GPGPU

 ● Upload input data in textures

 ● Render a quad

 ● Perform calculations in the pixel (fragment) shader

 ● Read back the results




What: General calculations


[Diagram, built up over four slides: the Java app calls Upload() (twice) to send input to the GPU, Draw() to run the shader, and Read() to fetch the results]
What

●A CPU program vs a GPU program

 CPU:  int[] a  ->  int[] b = F(a)  ->  int[] b

 GPU:  int[] a  ->  texture  ->  draw()  ->  pixel buffer  ->  int[] b
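
The GPU path can be mimicked on the CPU to make the three stages explicit. This is only a sketch in plain Java, with no OpenGL calls: the names upload, draw, read and runOnGpu are hypothetical, and the "shader" is an ordinary function applied once per element.

```java
import java.util.function.IntUnaryOperator;

/** CPU-only emulation of the texture -> draw() -> pixel buffer flow (hypothetical names). */
final class GpgpuFlowSketch {

    /** "Upload": copy the input array into a stand-in for texture memory. */
    static int[] upload(int[] input) {
        return input.clone();
    }

    /** "Draw": run the kernel once per output texel; no invocation sees another's result. */
    static int[] draw(int[] texture, IntUnaryOperator shader) {
        int[] pixelBuffer = new int[texture.length];
        for (int i = 0; i < texture.length; i++) {
            pixelBuffer[i] = shader.applyAsInt(texture[i]);
        }
        return pixelBuffer;
    }

    /** "Read": copy the result back to application memory. */
    static int[] read(int[] pixelBuffer) {
        return pixelBuffer.clone();
    }

    /** The whole round trip: int[] a -> texture -> draw() -> pixel buffer -> int[] b. */
    static int[] runOnGpu(int[] a, IntUnaryOperator shader) {
        return read(draw(upload(a), shader));
    }

    public static void main(String[] args) {
        int[] b = runOnGpu(new int[] {1, 2, 3}, x -> x * x);
        System.out.println(java.util.Arrays.toString(b)); // prints [1, 4, 9]
    }
}
```

The two copies in upload and read stand for the CPU-GPU transfers, which on a real device are separate steps from the computation itself.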




Why



●Additional computational power
 ● Can run in parallel with CPU

●Greater computational power

 ● Galaxy S2

   ● CPU: 4 GFLOPS

   ● GPU: 10 GFLOPS




How




1) Parallelize code
2) Data packing
3) Implement OpenGL ES 2.0 Shaders
4) Drawing and Input/Output




How: Parallelize code


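Pixel-shader parallel code:

  public int[] foo(int[] a, int[] b){
      int[] c = new int[a.length];

      // Each c[i] depends only on a[i] and b[i], so every iteration
      // is independent, like one pixel in a fragment shader.
      for(int i=0; i<a.length; i++){
          c[i] = a[i] + 2*b[i];
      }
      return c;
  }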
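
One way to check that a loop like this is pixel-shader parallel is to run its iterations in arbitrary order on the CPU. The sketch below uses a parallel IntStream for that; the class name is hypothetical and nothing here runs on the GPU.

```java
import java.util.stream.IntStream;

/** Same element-wise computation, run with the iterations in no particular order. */
final class ParallelMapSketch {

    static int[] foo(int[] a, int[] b) {
        int[] c = new int[a.length];
        // Each index writes only its own c[i], so the iterations may run
        // in any order or concurrently without synchronisation.
        IntStream.range(0, a.length)
                 .parallel()
                 .forEach(i -> c[i] = a[i] + 2 * b[i]);
        return c;
    }
}
```

If the result were to change when the iterations are reordered, the loop would not map onto one independent shader invocation per output pixel.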




How: Parallelize code

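Not pixel-shader parallel code:
 (Mobile GPUs have no geometry shaders)

public int[] bar(int[] a, int[] b, int[] c){
    int[] d = new int[a.length];

    // Scatter: the output index depends on the data in b and c, and
    // several iterations may write to the same element of d.
    for(int i=0; i<a.length; i++){
        d[b[i]] += a[i];
        d[c[i]] += a[i];
    }
    return d;
}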
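
A scatter like this can sometimes be turned into a gather, where each output element collects its own contributions. The following sketch is not from the talk; it is one possible restructuring, shown in plain Java with hypothetical names, and it trades the write conflict for a full scan of the input per output element.

```java
/** Scatter form from the slide next to a gather form that writes only its own output. */
final class GatherSketch {

    /** Scatter: the write position depends on the data. */
    static int[] barScatter(int[] a, int[] b, int[] c) {
        int[] d = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            d[b[i]] += a[i];
            d[c[i]] += a[i];
        }
        return d;
    }

    /** Gather: output element j scans the inputs and sums what targets it. */
    static int[] barGather(int[] a, int[] b, int[] c) {
        int[] d = new int[a.length];
        for (int j = 0; j < d.length; j++) {   // one "pixel" per output element
            int sum = 0;
            for (int i = 0; i < a.length; i++) {
                if (b[i] == j) sum += a[i];
                if (c[i] == j) sum += a[i];
            }
            d[j] = sum;                        // the only write is to this element
        }
        return d;
    }
}
```

The gather form does quadratic work, so whether it pays off on a GPU depends on the data size; that trade-off is not measured here.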




How: Parallelize code




Not pixel-shader parallel code:
● A sequence of calculations in which each step needs the result of the previous one




How: Data packing




Current mobile GPUs only have
 ● buffers and textures with 8 bits per channel (32-bit RGBA)

 ● a single render target




How: Data packing



●Packing uses

 ● Floating-point input/output

 ● Physics simulations requiring multiple outputs
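
The slides do not say which packing scheme was used. As an illustration only, the sketch below assumes a 32-bit fixed-point encoding of a value in [0, 1) spread over the four 8-bit channels of one RGBA pixel; the class and method names are hypothetical.

```java
/** Packs a value in [0, 1) into four 8-bit channels and back (assumed fixed-point scheme). */
final class PackingSketch {

    /** Splits the value into four bytes, most significant first (R, G, B, A). */
    static int[] pack(double value) {
        long fixed = (long) (value * 4294967296.0);   // value * 2^32, truncated
        return new int[] {
            (int) ((fixed >>> 24) & 0xFF),  // R: most significant byte
            (int) ((fixed >>> 16) & 0xFF),  // G
            (int) ((fixed >>> 8) & 0xFF),   // B
            (int) (fixed & 0xFF)            // A: least significant byte
        };
    }

    /** Reassembles the four channels into the original value (up to 2^-32). */
    static double unpack(int[] rgba) {
        long fixed = ((long) rgba[0] << 24) | ((long) rgba[1] << 16)
                   | ((long) rgba[2] << 8) | (long) rgba[3];
        return fixed / 4294967296.0;
    }
}
```

Inside a shader without integer bit operations, the same split would have to be written with floor and mod arithmetic, and the available float precision limits how many of the four bytes carry useful information.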




How: Implement Shaders




●Will be shown later in detail

 ● Use OpenGL ES 2.0

 ● No CUDA, OpenCL or similar




How: Drawing and Input/Output




●Transfer is slow

 int[] a  ->  texture  ->  draw()  ->  pixel buffer  ->  int[] b

 (the CPU-GPU transfers, upload and read-back, are the slow steps)




When: what works, what does not




●Parallelism
 ● No geometry shaders

●Limited precision / Single Render Target

●Limited data transfer

●Not yet as fast as a desktop




Example


●AES encryption on the GPU




                 “Hello droiconNL!”

                            Encryption

       U2FsdGVkX18UAXwN1I7bomP0kuKNXwQ8h
       2NHb8lZ5sAG6uaLjZxzkn/ik9QPv8Pq

                            Decryption

                   “Hello droiconNL!”



Example



            Encoding:  not GPU-parallel
            Decoding:  GPU-parallel




Example


     Are all parts implementable on the GPU?



                      Parallelizable?
                     Packing required?




Example

                 Implementing shader




                                         Dec      Hex
                                         0        00
                                         25       19
                                         255      FF




Example

                    Implementing shader
●Parallelizable?
●Packing?

●Steps:

  ● Find the row/column using the hex digits

  ● Find the new value in the substitution table (S-box)
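
The two steps can be sketched on the CPU as follows. The table used here is a hypothetical stand-in, not the real AES S-box, and the class and method names are illustrative; only the row/column indexing is the point.

```java
/** Row/column lookup in a 16x16 substitution table, indexed by the two hex digits. */
final class SBoxLookupSketch {

    /** Row index: the first (high) hex digit of a byte value 0..255. */
    static int row(int value) {
        return value >> 4;
    }

    /** Column index: the last (low) hex digit. */
    static int column(int value) {
        return value & 0x0F;
    }

    /** Looks up the replacement for one byte. */
    static int substitute(int[][] table, int value) {
        return table[row(value)][column(value)];
    }

    /** Hypothetical stand-in table, NOT the AES S-box: entry = 255 - index. */
    static int[][] standInTable() {
        int[][] table = new int[16][16];
        for (int r = 0; r < 16; r++) {
            for (int c = 0; c < 16; c++) {
                table[r][c] = 255 - (16 * r + c);
            }
        }
        return table;
    }
}
```

For the value 25 (hex 19) from the table on the earlier slide, the row is 1 and the column is 9. Every byte is looked up independently of the others, which is why this step is a candidate for a fragment shader.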




Example

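precision mediump float;         // OpenGL ES 2.0 fragment shaders need a default float precision

uniform sampler2D SBox;          // substitution table, assumed to be a 16x16 texture
uniform sampler2D InputTexture;  // input data, four bytes per RGBA texel
varying vec2 vTexCoord;

void main(void)
{
    // Four values in range [0, 255]; a stored byte b is sampled as b / 255.0
    vec4 inputValues = floor(texture2D(InputTexture, vTexCoord) * 255.0 + 0.5);

    // Split each byte into its hex digits and map them to texel centres in [0, 1)
    vec4 firstHexDigits = (floor(inputValues / 16.0) + 0.5) / 16.0;
    vec4 lastHexDigits = (mod(inputValues, 16.0) + 0.5) / 16.0;

    // One S-box lookup per input byte, packed into the four output channels
    gl_FragColor = vec4(
      texture2D(SBox, vec2(firstHexDigits.x, lastHexDigits.x)).x,
      texture2D(SBox, vec2(firstHexDigits.y, lastHexDigits.y)).x,
      texture2D(SBox, vec2(firstHexDigits.z, lastHexDigits.z)).x,
      texture2D(SBox, vec2(firstHexDigits.w, lastHexDigits.w)).x);
}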




Example

Get four bytes from input: the texture2D() read of InputTexture returns one RGBA texel, i.e. four bytes at once.




Example

Convert from [0, 1] to [0, 255]: texture2D() returns normalized values, so the shader scales them back to byte values.




Example

No bit shifting, but some built-in functions:

    floor(X / 16) = floor(X / 2^4) = X >> 4

The result is scaled back to the [0, 1) range so it can be used as a texture coordinate.




Example

No masking, but some built-in functions:

    mod(X, 16) = mod(X, 2^4) = X & 0x0F
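
The float arithmetic that replaces shifting and masking can be checked on the CPU. The sketch below assumes the standard 8-bit normalized texture format, in which a stored byte b is sampled as b / 255; all names are hypothetical and Java's Math.floor stands in for the GLSL built-in.

```java
/** CPU check of the float arithmetic a shader without bit operations has to use. */
final class ShaderArithmeticSketch {

    /** A byte stored in an 8-bit normalized channel is sampled as b / 255. */
    static float sample(int b) {
        return b / 255.0f;
    }

    /** Recovers the byte from the sampled value; rounding absorbs float error. */
    static int toByte(float sampled) {
        return (int) Math.floor(sampled * 255.0f + 0.5f);
    }

    /** floor(x / 16): the float stand-in for x >> 4. */
    static int firstHexDigit(float x) {
        return (int) Math.floor(x / 16.0f);
    }

    /** mod(x, 16) = x - 16 * floor(x / 16): the float stand-in for x & 0x0F. */
    static int lastHexDigit(float x) {
        return (int) (x - 16.0f * (float) Math.floor(x / 16.0f));
    }
}
```

Note that scaling the sampled value by 256 instead of 255 would map the byte 255 to 256, whose first hex digit no longer fits a 16-entry row.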




Example

Find the new value in the S-box: the two hex digits are the coordinates of the texture2D() lookup in SBox.




Example

Pack four values in the output pixel: the four substituted bytes become the RGBA channels of gl_FragColor.




My experiences


●OpenGL ES is limited compared with desktop OpenGL
 ● No geometry shaders

 ● Limited buffer formats / no MRTs (multiple render targets)

●Sometimes difficult to debug

 ● Dithering

 ● NPOT (non-power-of-two) textures

●Complex algorithms are possible

 ● Computer vision implemented

●Large speed gains are possible




Conclusion


●How is GPGPU programming performed on
Android devices?
 ● Through the use of shaders and textures

●When is GPGPU a viable option?

 ● Calculations are consuming too much time

 ● Calculations are parallelizable

 ● Can be implemented using 32 bit buffers

 ● Only limited transfer between GPU and CPU memory is required

Conclusion




●GPGPU programming has high potential
●Mobile GPUs are becoming faster

●GPGPU programming is fun





Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 

Dernier (20)

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 

GPGPU Programming @DroidconNL 2012 by Alten

  • 1. Technology Consulting and Engineering GPGPU Programming on Android devices Alten droidconNL 2012 ALTEN | 11/22/12
  • 2. Welcome • Alten PTS; leading service provider in the field of technical consultancy and engineering (Eindhoven, Capelle aan de IJssel and Apeldoorn) • ir. Arjan Somers Alten-droidconNL 2012 Slide 2
  • 3. Goals: What is GPGPU? How is it done on current Android devices? When is GPGPU programming useful?
  • 5. What: GPGPU programming is using the GPU to perform general-purpose calculations; data manipulation using the graphics card.
  • 6. What: A little bit of history: GPUs and OpenGL; parallel vector-based operations; programmable (shaders). Images: Battlezone (1980), Crysis 3 (2013).
  • 7. What: A little bit of history: GPUs and OpenGL; parallel vector-based operations; programmable (shaders).
  • 8. What: parallel processing. Parallel vector-based operations: process vertices / pixels; independent/parallel processing.
  • 9. What: programmable GPUs. Displacement mapping: requires control over vertex and fragment processing.
  • 10. What: programmable GPUs. OpenGL pipeline.
  • 11. What: programmable GPUs. OpenGL pipeline: shaders.
  • 12. What: programmable GPUs. Shaders. Vertex: void main(void) { gl_Position = gl_Vertex; } Fragment: void main(void) { gl_FragColor = vec4(0.0, 1.0, 0.0, 1.0); }
  • 13. What: programmable GPUs. A fragment shader: void main(void) { vec2 pos = mod(gl_FragCoord.xy, vec2(50.0)) - vec2(25.0); float dist_squared = dot(pos, pos); gl_FragColor = (dist_squared < 400.0) ? vec4(.90, .90, .90, 1.0) : vec4(.20, .20, .40, 1.0); }
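To make the pattern on slide 13 concrete, the fragment shader can be emulated on the CPU. The following is a sketch in plain Java, not Android or OpenGL code: the class and method names are made up for illustration, and it assumes pixel centres sit at half-integer coordinates, as `gl_FragCoord` reports by default.

```java
class DotPatternSketch {
    /** Returns true when the pixel at (x, y) falls inside one of the light dots. */
    static boolean isLight(int x, int y) {
        // gl_FragCoord reports pixel centres, hence the 0.5 offset (assumed default).
        float fx = x + 0.5f;
        float fy = y + 0.5f;
        // mod(coord, 50.0) - 25.0: position relative to the centre of a 50x50 tile.
        float px = (fx % 50.0f) - 25.0f;
        float py = (fy % 50.0f) - 25.0f;
        // dot(pos, pos) < 400.0: inside a circle of radius 20 around the tile centre.
        float distSquared = px * px + py * py;
        return distSquared < 400.0f;
    }

    public static void main(String[] args) {
        // Print a coarse preview of a 100x100 area, sampling every 5th pixel.
        for (int y = 0; y < 100; y += 5) {
            StringBuilder row = new StringBuilder();
            for (int x = 0; x < 100; x += 5) {
                row.append(isLight(x, y) ? '#' : '.');
            }
            System.out.println(row);
        }
    }
}
```

Every pixel is decided from its own coordinates only, which is the property that lets the GPU evaluate all pixels independently.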
  • 14. What: General calculations. Simple GPGPU: upload input data in textures; render a quad; perform calculations in the pixel shader; read back the results.
  • 15. What: General calculations. Diagram: Java app.
  • 16. What: General calculations. Diagram: Java app, Upload(), Upload().
  • 17. What: General calculations. Diagram: Java app, Draw().
  • 18. What: General calculations. Diagram: Java app, Read().
  • 19. What: A CPU program vs a GPU program. CPU: int[] a; int[] b = F(a); GPU: int[] a -> texture -> draw() -> pixel buffer -> int[] b.
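The upload / draw / read cycle can be mimicked on the CPU to show the programming model. This is only a sketch under stated assumptions: `upload`, `draw`, `read` and `PixelKernel` are hypothetical stand-ins, not OpenGL ES calls, and arrays play the role of textures and the pixel buffer. The kernel is the `a[i] + 2*b[i]` computation the talk uses as its parallel example.

```java
import java.util.Arrays;

class GpgpuModelSketch {
    /** Stand-in for a pixel shader: produces one output value for one pixel index. */
    interface PixelKernel {
        int shade(int[] textureA, int[] textureB, int index);
    }

    /** "Upload": copy application data into a stand-in texture. */
    static int[] upload(int[] data) {
        return data.clone();
    }

    /** "Draw": run the kernel once per output pixel; no pixel depends on another. */
    static int[] draw(int[] textureA, int[] textureB, PixelKernel kernel) {
        int[] pixelBuffer = new int[textureA.length];
        for (int i = 0; i < pixelBuffer.length; i++) {
            pixelBuffer[i] = kernel.shade(textureA, textureB, i);
        }
        return pixelBuffer;
    }

    /** "Read": copy the rendered values back to the application. */
    static int[] read(int[] pixelBuffer) {
        return pixelBuffer.clone();
    }

    /** Full cycle for c[i] = a[i] + 2 * b[i]. */
    static int[] run(int[] a, int[] b) {
        int[] textureA = upload(a);
        int[] textureB = upload(b);
        int[] pixelBuffer = draw(textureA, textureB, (ta, tb, i) -> ta[i] + 2 * tb[i]);
        return read(pixelBuffer);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run(new int[] {1, 2, 3}, new int[] {10, 20, 30})));
    }
}
```

On a real device the two copies in `upload` and `read` correspond to the CPU-GPU transfers, which is where the cost discussed later in the talk comes from.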
  • 20. Why: Additional computational power: can run in parallel with the CPU. Greater computational power: Galaxy S2, CPU 4 GFlops, GPU 10 GFlops.
  • 21. How: 1) Parallelize code 2) Data packing 3) Implement OpenGL ES 2.0 shaders 4) Drawing and input/output
  • 22. How: Parallelize code. Pixel-shader parallel code: public int[] foo(int[] a, int[] b){ int[] c = new int[a.length]; for(int i=0; i<a.length; i++){ c[i] = a[i] + 2*b[i]; } return c; }
  • 23. How: Parallelize code. Not pixel-shader parallel code (mobile GPUs have no geometry shaders): public int[] bar(int[] a, int[] b, int[] c){ int[] d = new int[a.length]; for(int i=0; i<a.length; i++){ d[b[i]] += a[i]; d[c[i]] += a[i]; } return d; }
  • 24. How: Parallelize code. Not pixel-shader parallel code: a sequence of calculations.
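The loop in `bar` scatters: each iteration writes to output positions chosen by the data, whereas a pixel shader can only write the pixel it was invoked for. One possible workaround, which is not taken from the talk, is to rewrite the scatter as a gather in which every output element scans the inputs for its own contributions. The sketch below uses hypothetical names and costs O(n^2) work, so whether it pays off on a GPU would have to be measured.

```java
import java.util.Arrays;

class ScatterGatherSketch {
    /** The scatter loop from slide 23: output indices depend on the data. */
    static int[] scatter(int[] a, int[] b, int[] c) {
        int[] d = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            d[b[i]] += a[i];
            d[c[i]] += a[i];
        }
        return d;
    }

    /** Gather form: output j is computed on its own, as one "pixel" would be. */
    static int[] gather(int[] a, int[] b, int[] c) {
        int[] d = new int[a.length];
        for (int j = 0; j < d.length; j++) {
            int sum = 0;
            for (int i = 0; i < a.length; i++) {
                if (b[i] == j) sum += a[i];
                if (c[i] == j) sum += a[i];
            }
            d[j] = sum;
        }
        return d;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4};
        int[] b = {0, 0, 2, 3};
        int[] c = {1, 0, 2, 0};
        System.out.println(Arrays.toString(scatter(a, b, c)));
        System.out.println(Arrays.toString(gather(a, b, c)));
    }
}
```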
  • 25. How: Data packing. Current mobile GPUs only have: 8bpp buffers and textures; a single render target.
  • 26. How: Data packing. Uses of packing: floating-point input/output; physics simulation requiring multiple outputs.
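As an illustration of packing, a 32-bit value can be spread over the four 8-bit channels of one RGBA pixel and reassembled after read-back. This is a CPU-side sketch with hypothetical helper names; the byte order (most significant byte in the red channel) is an arbitrary assumption, and packing floating-point values inside a shader needs additional arithmetic that is not shown here.

```java
class RgbaPackingSketch {
    /** Splits a 32-bit integer into four bytes, ordered R, G, B, A. */
    static byte[] pack(int value) {
        return new byte[] {
            (byte) (value >>> 24),
            (byte) (value >>> 16),
            (byte) (value >>> 8),
            (byte) value
        };
    }

    /** Reassembles the integer from four channel bytes. */
    static int unpack(byte[] rgba) {
        return ((rgba[0] & 0xFF) << 24)
             | ((rgba[1] & 0xFF) << 16)
             | ((rgba[2] & 0xFF) << 8)
             |  (rgba[3] & 0xFF);
    }

    /** Packs a float through its IEEE 754 bit pattern. */
    static byte[] packFloat(float value) {
        return pack(Float.floatToIntBits(value));
    }

    /** Restores a float from four channel bytes. */
    static float unpackFloat(byte[] rgba) {
        return Float.intBitsToFloat(unpack(rgba));
    }

    public static void main(String[] args) {
        byte[] pixel = packFloat(3.5f);
        System.out.println(unpackFloat(pixel));
    }
}
```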
  • 27. How: Implement shaders. Will be shown later in detail. Use OpenGL ES 2.0; no CUDA, OpenCL or similar.
  • 28. How: Drawing and input/output. Transfer is slow: int[] a -> texture -> draw() -> pixel buffer -> int[] b.
  • 29. When: What works, what does not. Parallelism: no geometry shaders. Limited precision / single render target. Limited data transfer. Not yet as fast as desktop.
  • 30. Example: AES encryption on the GPU. “Hello droiconNL!” -> Encryption -> U2FsdGVkX18UAXwN1I7bomP0kuKNXwQ8h2NHb8lZ5sAG6uaLjZxzkn/ik9QPv8Pq -> Decryption -> “Hello droiconNL!”
  • 31. Example: Encoding; Decoding.
  • 32. Example: Encoding; Decoding. Labels: Not GPU-parallel; GPU-parallel.
  • 34. Example: Are all parts implementable on the GPU?
  • 35. Example: Are all parts implementable on the GPU? Parallelizable? Packing required?
  • 36. Example: Implementing shader. Dec -> Hex: 0 -> 00; 25 -> 19; 255 -> FF.
  • 37. Example: Implementing shader. Dec -> Hex: 0 -> 00; 25 -> 19; 255 -> FF.
  • 38. Example: Implementing shader. Parallelizable? Packing? Steps: find row/column using the hex digits; find the new value in the substitution table.
  • 39. Example: precision mediump float; uniform sampler2D SBox; uniform sampler2D InputTexture; varying vec2 vTexCoord; void main(void) { /* recover four byte values in [0, 255]; texels arrive normalized as byte / 255 */ vec4 inputValues = floor(texture2D(InputTexture, vTexCoord) * 255.0 + 0.5); /* high and low hex digit of each byte, scaled to [0, 1) for the S-box lookup */ vec4 firstHexDigits = floor(inputValues / 16.0) / 16.0; vec4 lastHexDigits = mod(inputValues, 16.0) / 16.0; gl_FragColor = vec4( texture2D(SBox, vec2(firstHexDigits.x, lastHexDigits.x)).x, texture2D(SBox, vec2(firstHexDigits.y, lastHexDigits.y)).x, texture2D(SBox, vec2(firstHexDigits.z, lastHexDigits.z)).x, texture2D(SBox, vec2(firstHexDigits.w, lastHexDigits.w)).x); }
  • 40. Example (shader from slide 39, annotated): Get four bytes from input: texture2D(InputTexture, vTexCoord).
  • 41. Example (shader from slide 39, annotated): Convert from [0, 1] to [0, 255].
  • 42. Example (shader from slide 39, annotated): No bit shifting, but some built-in functions: floor(X / 16) = floor(X / 2^4) = X >> 4. Scale back to the [0, 1) range.
  • 43. Example (shader from slide 39, annotated): No masking, but some built-in functions: mod(X, 16) = mod(X, 2^4) = X & 0x0F.
  • 44. Example (shader from slide 39, annotated): Find the new value in the S-box.
  • 45. Example (shader from slide 39, annotated): Pack four values in the output pixel.
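The arithmetic replacements for shifting and masking can be checked on the CPU. The sketch below emulates, in plain Java with float arithmetic, how a byte stored in a texture (normalized as byte / 255) is turned back into its two hex digits. The class and method names are hypothetical, and the rounding step is an assumption about how to recover the byte robustly rather than something stated in the talk.

```java
class HexDigitSketch {
    /** Emulates sampling a byte texel: the shader sees byte / 255 as a float. */
    static float sample(int b) {
        return b / 255.0f;
    }

    /** Recovers the byte value as a float, rounding to the nearest integer. */
    static float toByteValue(float texel) {
        return (float) Math.floor(texel * 255.0f + 0.5f);
    }

    /** floor(X / 16): the high hex digit, replacing X >> 4. */
    static int highDigit(float value) {
        return (int) Math.floor(value / 16.0f);
    }

    /** mod(X, 16): the low hex digit, replacing X & 0x0F (X is non-negative here). */
    static int lowDigit(float value) {
        return (int) (value % 16.0f);
    }

    /** Checks every byte value against the integer bit operations. */
    static boolean matchesBitOps() {
        for (int b = 0; b <= 255; b++) {
            float value = toByteValue(sample(b));
            if (highDigit(value) != (b >> 4) || lowDigit(value) != (b & 0x0F)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        float value = toByteValue(sample(25)); // 25 decimal is 19 hex
        System.out.println(highDigit(value) + " " + lowDigit(value));
        System.out.println(matchesBitOps());
    }
}
```

The two digits then act as the row and column of the 16x16 substitution table, which the shader addresses by dividing each digit by 16 to obtain a texture coordinate.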
  • 46. My experiences: OpenGL ES is limited compared with desktop: geometry shaders; buffer formats / no MRTs. Sometimes difficult to debug: dithering; NPOT. Complex algorithms are possible: computer vision implemented. Large speed gains are possible.
  • 47. Conclusion: How is GPGPU programming performed on Android devices? Through the use of shaders and textures. When is GPGPU a viable option? Calculations are consuming too much time; calculations are parallelizable; can be implemented using 32-bit buffers; only limited transfer between GPU and CPU memory is required.
  • 48. Conclusion: GPGPU programming has high potential. Mobile GPUs are becoming faster. GPGPU programming is fun.