Introduction to SPIR for Application and Compiler Developers

Yaxun Sam Liu
OUTLINE
• What is SPIR and why it is useful
‒ Why do we need SPIR when we already have LLVM IR?

• SPIR for Application Developers
‒ How to generate SPIR
‒ How to load SPIR
‒ Portability considerations using SPIR

• SPIR for Compiler Developers
‒ Introduction to SPIR spec
‒ How to implement a SPIR loader

• References

2 | INTRODUCTION TO SPIR FOR APPLICATION DEVELOPERS AND COMPILER DEVELOPERS | November 5, 2013
WHAT IS SPIR
• A binary format
‒ SPIR stands for Standard Portable Intermediate Representation
‒ A portable binary format for OpenCL™ programs
‒ Defined by the SPIR spec
‒ Based on LLVM IR
‒ Supports most OpenCL™ core features
‒ Current version is 1.2, corresponding to OpenCL™ 1.2
‒ Developed by the SPIR subgroup of the Khronos Group OpenCL™ working group
‒ A SPIR binary is bitness-aware, which means:
  ‒ The pointer size in a SPIR binary is either 32 or 64 bits, depending on the target device
  ‒ Two sets of SPIR binaries are needed to ship a product in SPIR for both 32-bit and 64-bit devices

• An extension for OpenCL™
‒ Defined by the SPIR host API
‒ Denoted by cl_khr_spir
‒ OpenCL™ devices supporting cl_khr_spir are able to load and run SPIR binaries

WHY IS SPIR USEFUL
• Why is SPIR useful
‒ For game/application developers
  ‒ Can ship an OpenCL™ program as a binary instead of source code
  ‒ Can ship just a few binaries per OpenCL™ program instead of many binaries for different platforms and devices

‒ For compiler developers
  ‒ Can compile other programming languages to SPIR, which can then run on OpenCL™ devices

HOW TO GENERATE SPIR
• SPIR generation is optional for devices supporting cl_khr_spir
‒ A device supporting cl_khr_spir is only required to be able to consume SPIR
‒ Whether to support SPIR generation is the vendor’s choice

• Generating SPIR in a host program
‒ The SPIR spec and host API do not define how to generate SPIR
‒ If SPIR generation is supported, it is likely to be done as follows:
  ‒ Load the OpenCL™ source code with clCreateProgramWithSource
  ‒ Compile the OpenCL™ source code with clCompileProgram, passing a vendor-specific option for generating SPIR
  ‒ Get the SPIR binary with clGetProgramInfo and CL_PROGRAM_BINARIES
  ‒ Save the SPIR binary to a file

• Generating SPIR with an offline compiler
‒ Clang 3.3/3.4 can compile OpenCL™ source code to SPIR-like LLVM bitcode
‒ A patch for Clang 3.2 that compiles OpenCL™ source code to SPIR 1.2 is available to Khronos members
‒ Clang options for generating SPIR: -cl-std=CL1.2 -emit-llvm -triple spir-unknown-unknown (32-bit) or -triple spir64-unknown-unknown (64-bit)

HOW TO LOAD SPIR
LOAD A SINGLE SPIR BINARY
SPIR Binary -> clCreateProgramWithBinary -> cl_program -> clBuildProgram -> cl_program -> clCreateKernel -> cl_kernel

HOW TO LOAD SPIR
MULTIPLE SPIR BINARIES, OPENCL™ SOURCE CODES AND VENDOR-SPECIFIC BINARIES
OpenCL™ Source -> clCreateProgramWithSource -> cl_program -> clCompileProgram -> cl_program
SPIR Binary -> clCreateProgramWithBinary -> cl_program -> clCompileProgram -> cl_program
Vendor-specific Binary -> clCreateProgramWithBinary -> cl_program
The three resulting cl_program objects -> clLinkProgram -> cl_program -> clCreateKernel -> cl_kernel

PORTABILITY CONSIDERATIONS USING SPIR
• Check whether a device supports SPIR
‒ Get all supported extensions with clGetDeviceInfo and CL_DEVICE_EXTENSIONS
‒ Check whether cl_khr_spir is included

• Supporting both 32-bit and 64-bit devices
‒ Two sets of SPIR binaries are needed: one for 32-bit devices, the other for 64-bit devices
‒ Check the bitness of a device with clGetDeviceInfo and CL_DEVICE_ADDRESS_BITS
‒ Load the 32-bit or 64-bit SPIR binaries accordingly

• Supporting optional extensions
‒ Get all supported extensions with clGetDeviceInfo and CL_DEVICE_EXTENSIONS
‒ Check whether the required extension is supported
  ‒ If yes, load the SPIR binary
  ‒ If no, either fall back to a SPIR binary or OpenCL™ source that uses only core features, or fail gracefully
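
The selection logic on this slide can be sketched in a few lines. This is only an illustration, assuming the host program has already queried CL_DEVICE_EXTENSIONS (a space-separated string) and CL_DEVICE_ADDRESS_BITS with clGetDeviceInfo; the function names and the file names are hypothetical and not part of any OpenCL API.

```python
def supports_extension(extensions, name):
    """Return True if `name` is one of the space-separated extension names."""
    # Split on whitespace so that a longer name cannot match by substring.
    return name in extensions.split()


def pick_program_image(extensions, address_bits, required_extensions=()):
    """Choose which program image to load for one device.

    `extensions` is the CL_DEVICE_EXTENSIONS string and `address_bits`
    the CL_DEVICE_ADDRESS_BITS value. All file names are placeholders.
    """
    if not supports_extension(extensions, "cl_khr_spir"):
        return "kernels.cl"  # no SPIR support: fall back to OpenCL C source
    if address_bits not in (32, 64):
        raise ValueError(f"unexpected address width: {address_bits}")
    uses_optional = bool(required_extensions) and all(
        supports_extension(extensions, ext) for ext in required_extensions
    )
    if uses_optional:
        return f"kernels_ext_spir{address_bits}.bc"
    # A required extension is missing: use the core-only SPIR binary.
    return f"kernels_core_spir{address_bits}.bc"
```

Splitting the extension string before comparing avoids a false match against a longer extension name that merely starts with cl_khr_spir.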

PORTABILITY CONSIDERATIONS USING SPIR
• SPIR binaries generated from non-portable OpenCL™ source are not portable
‒ Source is non-portable when it does not follow the restrictions in section 6.9 of the OpenCL™ 1.2 spec, for example by:
  ‒ Casting a pointer from one address space to a different address space
  ‒ Casting an OpenCL™ opaque structure to a different type
  ‒ Performing arithmetic operations or comparisons on a sampler
  ‒ Applying sizeof to OpenCL™ opaque structures
  ‒ etc.

SPIR FOR COMPILER DEVELOPERS
• Introduction to SPIR spec
‒ Relation between SPIR 1.2 and LLVM 3.2
‒ Mapping of OpenCL™ to SPIR
  ‒ Data types
  ‒ Enumeration values
  ‒ Calling conventions
  ‒ Address spaces
  ‒ Name mangling
  ‒ Used extensions
  ‒ Kernel argument info

• How to implement a SPIR loader
‒ Overall structure
‒ Transforming data types
‒ Transforming metadata
‒ Demangling and mapping builtin function names

RELATION BETWEEN SPIR AND LLVM BITCODE
• SPIR binaries are a subset of LLVM bitcode
‒ A valid SPIR 1.2 binary is valid LLVM 3.2 bitcode
‒ SPIR is defined by mapping OpenCL™ C entities to LLVM and by imposing restrictions on the LLVM 3.2 bitcode format:
  ‒ Specific target triple and data layout for 32-bit and 64-bit devices
  ‒ Specific ABI
  ‒ Specific calling conventions
  ‒ Restrictions on allowed instructions, intrinsic functions, linkage types, parameter attributes, visibility styles, function attributes, etc.

• The ideas behind SPIR
‒ Be expressive enough to represent OpenCL™ C programs
‒ Carry enough information for the OpenCL™ runtime to execute and query the kernels
‒ Do not introduce unnecessary entities
  ‒ This may limit SPIR’s expressiveness for other languages, but it facilitates development of SPIR loaders
‒ Balance the burden between SPIR producer and loader

MAPPING OF OPENCL™ TO SPIR
DATA TYPES

• OpenCL™ builtin scalar types are mapped to LLVM primitive types
‒ bool -> i1
‒ char -> i8
‒ unsigned char, uchar -> i8
‒ short -> i16
‒ unsigned short, ushort -> i16
‒ int -> i32
‒ unsigned int, uint -> i32
‒ long -> i64
‒ unsigned long, ulong -> i64
‒ float -> float
‒ double -> double
‒ half -> half
‒ void -> void

• OpenCL™ builtin vector types are mapped to LLVM vector types
‒ charn -> < n x i8 >
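
As a rough illustration of the table above, the mapping can be written as a lookup plus one rule for vectors. This is a sketch, not part of the SPIR spec: the function name is hypothetical, only the short type spellings (uchar, not unsigned char) are handled, and restricting vectors to the OpenCL C widths 2, 3, 4, 8 and 16 is an assumption of this example.

```python
import re

# Scalar mapping copied from the slide; signed and unsigned share one LLVM type.
SCALAR_TO_LLVM = {
    "bool": "i1", "char": "i8", "uchar": "i8",
    "short": "i16", "ushort": "i16", "int": "i32", "uint": "i32",
    "long": "i64", "ulong": "i64",
    "float": "float", "double": "double", "half": "half", "void": "void",
}
VECTOR_WIDTHS = {2, 3, 4, 8, 16}      # vector widths defined by OpenCL C
NON_VECTOR_BASES = {"bool", "void"}   # OpenCL C has no bool or void vectors


def llvm_type(cl_type):
    """Map an OpenCL C scalar or vector type name to an LLVM type string."""
    if cl_type in SCALAR_TO_LLVM:
        return SCALAR_TO_LLVM[cl_type]
    match = re.fullmatch(r"([a-z]+)(\d+)", cl_type)
    if match:
        base, width = match.group(1), int(match.group(2))
        if (base in SCALAR_TO_LLVM and base not in NON_VECTOR_BASES
                and width in VECTOR_WIDTHS):
            return f"<{width} x {SCALAR_TO_LLVM[base]}>"
    raise ValueError(f"unsupported OpenCL type: {cl_type}")
```

For example, llvm_type("char4") yields the LLVM vector type <4 x i8>, matching the charn rule.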

MAPPING OF OPENCL™ TO SPIR
DATA TYPES

• Image and event types are mapped to LLVM opaque structures
‒ image1d_t -> %opencl.image1d_t
‒ image1d_array_t -> %opencl.image1d_array_t
‒ image1d_buffer_t -> %opencl.image1d_buffer_t
‒ image2d_t -> %opencl.image2d_t
‒ image2d_array_t -> %opencl.image2d_array_t
‒ image3d_t -> %opencl.image3d_t
‒ image2d_msaa_t -> %opencl.image2d_msaa_t
‒ image2d_array_msaa_t -> %opencl.image2d_array_msaa_t
‒ image2d_msaa_depth_t -> %opencl.image2d_msaa_depth_t
‒ image2d_array_msaa_depth_t -> %opencl.image2d_array_msaa_depth_t
‒ image2d_depth_t -> %opencl.image2d_depth_t
‒ image2d_array_depth_t -> %opencl.image2d_array_depth_t
‒ event_t -> %opencl.event_t

MAPPING OF OPENCL™ TO SPIR
DATA TYPES

• The sampler type is mapped to the LLVM i32 type
‒ Although a sampler is represented by an integer in SPIR, arithmetic operations and comparisons with other values are not allowed

• size_t, ptrdiff_t, intptr_t and uintptr_t are mapped to LLVM i32 or i64, depending on the bitness of the SPIR binary
• Signedness of integer types
‒ LLVM does not have unsigned integer types
‒ OpenCL™ unsigned and signed integer types of the same bit width are mapped to the same type in SPIR
‒ If the signedness of an integer type is needed, the information can usually be obtained through:
  ‒ Mangled function names
  ‒ Sign extension of function arguments and return type
  ‒ Kernel argument metadata

MAPPING OF OPENCL™ TO SPIR
CALLING CONVENTIONS

• SPIR uses the calling convention to indicate whether a function is a kernel function
‒ Kernel functions use spir_kernel calling convention
‒ Non-kernel functions use spir_func calling convention
‒ No other calling conventions are allowed in SPIR

MAPPING OF OPENCL™ TO SPIR
ADDRESS SPACES

• OpenCL™ C address spaces are mapped to LLVM address spaces
‒ Private -> 0
‒ Global -> 1
‒ Constant -> 2
‒ Local -> 3

• Casting a pointer to a different address space is not allowed
• OpenCL™ C function-level local variables are mapped to LLVM module-scope global variables
‒ The variable name is mapped as <function name>.<variable name>
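
The address space numbers and the naming rule for function-level local variables can be illustrated with a short sketch. The helper names are hypothetical; the pointer spelling follows the typed-pointer syntax of LLVM 3.2 textual IR, where address space 0 is left implicit.

```python
# Address space numbering from the slide.
ADDRESS_SPACE = {"private": 0, "global": 1, "constant": 2, "local": 3}


def pointer_type(pointee, space):
    """Spell an LLVM 3.2 pointer type for an OpenCL C address space."""
    number = ADDRESS_SPACE[space]
    if number == 0:
        return f"{pointee}*"  # address space 0 is the LLVM default
    return f"{pointee} addrspace({number})*"


def local_variable_symbol(function, variable):
    """Name of the module-scope global that holds a function-level local."""
    return f"@{function}.{variable}"
```

Under this rule, a __local array named scratch declared inside a kernel named reduce would appear in the module as @reduce.scratch, with a pointer type in address space 3.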

MAPPING OF OPENCL™ TO SPIR
ENUMERATION VALUES

• SPIR defines enumeration values used by OpenCL™ C programs
‒ Image channel order -> same as cl.h
‒ Image data type -> same as cl.h
‒ Sampler enumeration values (based on cl.h but not exactly the same)
  ‒ Addressing mode
    ‒ CLK_ADDRESS_NONE=0x0000
    ‒ CLK_ADDRESS_CLAMP_TO_EDGE=0x0002
    ‒ CLK_ADDRESS_CLAMP=0x0004
    ‒ CLK_ADDRESS_REPEAT=0x0006
    ‒ CLK_ADDRESS_MIRRORED_REPEAT=0x0008

  ‒ Normalized coords
    ‒ CLK_NORMALIZED_COORDS_FALSE=0x0000
    ‒ CLK_NORMALIZED_COORDS_TRUE=0x0001

  ‒ Filter mode
    ‒ CLK_FILTER_NEAREST=0x0010
    ‒ CLK_FILTER_LINEAR=0x0020
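
Because a sampler is an i32 in SPIR, a sampler initializer is the bitwise OR of one value from each group above. The sketch below shows that composition and its inverse. The function names are hypothetical, and the three bit masks used for decoding are an assumption inferred from the listed values rather than quoted from the spec.

```python
ADDRESSING = {
    "CLK_ADDRESS_NONE": 0x0000,
    "CLK_ADDRESS_CLAMP_TO_EDGE": 0x0002,
    "CLK_ADDRESS_CLAMP": 0x0004,
    "CLK_ADDRESS_REPEAT": 0x0006,
    "CLK_ADDRESS_MIRRORED_REPEAT": 0x0008,
}
NORMALIZED = {
    "CLK_NORMALIZED_COORDS_FALSE": 0x0000,
    "CLK_NORMALIZED_COORDS_TRUE": 0x0001,
}
FILTER = {"CLK_FILTER_NEAREST": 0x0010, "CLK_FILTER_LINEAR": 0x0020}

# Assumed masks: each group occupies its own, non-overlapping bits.
ADDRESSING_MASK, NORMALIZED_MASK, FILTER_MASK = 0x000E, 0x0001, 0x0030


def encode_sampler(addressing, normalized, filter_mode):
    """Combine one flag from each group into the sampler's i32 value."""
    return ADDRESSING[addressing] | NORMALIZED[normalized] | FILTER[filter_mode]


def decode_sampler(value):
    """Split a sampler value back into its three flag names."""
    def name_of(table, bits):
        for name, flag in table.items():
            if flag == bits:
                return name
        raise ValueError(f"unknown sampler bits: {bits:#06x}")

    return (
        name_of(ADDRESSING, value & ADDRESSING_MASK),
        name_of(NORMALIZED, value & NORMALIZED_MASK),
        name_of(FILTER, value & FILTER_MASK),
    )
```

A loader whose backend expects the cl.h encoding could decode with these tables and re-encode with its own.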

MAPPING OF OPENCL™ TO SPIR
NAME MANGLING

• OpenCL™ C builtin functions are mangled
• OpenCL™ C kernel functions and non-kernel user functions are not mangled
• Other languages may choose to mangle non-kernel user functions
• SPIR adopts the name mangling scheme of Itanium C++ ABI section 5.1, with extended rules for OpenCL™ C data types, address spaces and access qualifiers
‒ Unsigned and signed integer types of the same bit width are mangled to different names
‒ Pointers of non-private address space N -> PU3ASN<mangled element type>
‒ Vector type of N elements -> DvN_<mangled element type>
‒ OpenCL™ C opaque types (image, sampler, event) -> <string length>ocl_<type name>, e.g.
  ‒ sampler_t -> 11ocl_sampler

‒ Access qualifiers: read only -> U1R, write only -> U1W, read write -> U1B
‒ size_t and uintptr_t are treated as uint or ulong
‒ ptrdiff_t and intptr_t are treated as int or long
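
A minimal sketch of these rules follows. It is deliberately incomplete: it ignores Itanium substitutions (S_ and friends), cv-qualifiers and access qualifiers, so it is only reliable for simple signatures. The function names and the string format used to describe argument types (for example "global float*") are conventions of this example, not of SPIR.

```python
import re

# Itanium builtin type codes; signed and unsigned types differ here.
SCALAR_CODES = {
    "bool": "b", "char": "c", "uchar": "h", "short": "s", "ushort": "t",
    "int": "i", "uint": "j", "long": "l", "ulong": "m",
    "half": "Dh", "float": "f", "double": "d", "void": "v",
}
ADDRESS_SPACE = {"private": 0, "global": 1, "constant": 2, "local": 3}


def mangle_type(cl_type, bits=32):
    """Mangle one type such as 'uint', 'float4', 'global float*', 'sampler_t'."""
    cl_type = cl_type.strip()
    if cl_type.endswith("*"):
        space, _, pointee = cl_type[:-1].strip().rpartition(" ")
        number = ADDRESS_SPACE[space or "private"]
        prefix = "P" if number == 0 else f"PU3AS{number}"
        return prefix + mangle_type(pointee, bits)
    if cl_type in ("size_t", "uintptr_t"):
        cl_type = "uint" if bits == 32 else "ulong"
    elif cl_type in ("ptrdiff_t", "intptr_t"):
        cl_type = "int" if bits == 32 else "long"
    if cl_type in SCALAR_CODES:
        return SCALAR_CODES[cl_type]
    vector = re.fullmatch(r"([a-z]+)(\d+)", cl_type)
    if vector and vector.group(1) in SCALAR_CODES:
        return f"Dv{vector.group(2)}_{SCALAR_CODES[vector.group(1)]}"
    if cl_type in ("sampler_t", "event_t") or cl_type.startswith("image"):
        name = "ocl_" + cl_type[:-2]  # drop the trailing "_t"
        return f"{len(name)}{name}"
    raise ValueError(f"cannot mangle type: {cl_type}")


def mangle_builtin(name, arg_types, bits=32):
    """Mangle a builtin function name with its argument types."""
    args = "".join(mangle_type(t, bits) for t in arg_types) or "v"
    return f"_Z{len(name)}{name}{args}"
```

For instance, get_global_id(uint) mangles to _Z13get_global_idj, while a hypothetical int overload would end in i instead of j, which is how the signedness lost in the LLVM type can be recovered.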

MAPPING OF OPENCL™ TO SPIR
USE OF OPTIONAL CORE FEATURES AND EXTENSIONS

• SPIR contains information about used optional features and extensions
‒ The runtime can reject SPIR binaries that use unsupported optional features
‒ An application can select SPIR binaries based on the optional features and extensions they use

• Metadata for used core features
‒ opencl.used.optional.core.features
‒ Two core features are allowed:
  ‒ cl_image: indicates images are used
  ‒ cl_double: indicates doubles are used

• Metadata for used extensions
‒ opencl.used.extensions
  ‒ cl_khr_int64_base_atomics
  ‒ cl_khr_int64_extended_atomics
  ‒ cl_khr_fp16
  ‒ cl_khr_gl_sharing
  ‒ cl_khr_gl_event
  ‒ etc.

MAPPING OF OPENCL™ TO SPIR
KERNEL ATTRIBUTES

• SPIR contains information about optional kernel attributes
‒ reqd_work_group_size
‒ work_group_size_hint
‒ vec_type_hint

• For each kernel, there is a metadata node for optional kernel attributes
‒ !opencl.kernels = {!0, !1, ..., !N}
  ‒ !0 = metadata { < function signature >, !01, !02, ..., !0i }
  ‒ !1 = metadata { < function signature >, !11, !12, ..., !1j }
  ‒ ...
  ‒ !N = metadata { < function signature >, !N1, !N2, ..., !Nk }

MAPPING OF OPENCL™ TO SPIR
KERNEL ARGUMENT INFO

• SPIR contains the kernel argument information required by the OpenCL™ runtime for executing kernels
• For each kernel argument, there is metadata
‒ kernel_arg_addr_space
‒ kernel_arg_access_qual
‒ kernel_arg_type
‒ kernel_arg_base_type
‒ kernel_arg_type_qual
‒ kernel_arg_name : optional, only exists if -cl-kernel-arg-info is used when producing SPIR

SPIR ABI
• SPIR uses the default ABI of Clang 3.2
‒ Any aggregate type is passed as a pointer. Memory allocation (if needed) is the responsibility of the caller function.
‒ Enumeration types are handled as the underlying integer type.
‒ If the argument type is a promotable integer type, it will be extended according to the C99 integer promotion rules.
‒ Any other type, including floating point types, vectors, etc., will be passed directly as the corresponding LLVM type.

HOW TO IMPLEMENT SPIR LOADER
OVERALL STRUCTURE – IDEAL CASE
User’s OpenCL™ Source -> compile -> SPIR Binary
User’s SPIR Binary
Builtin Library Source -> compile -> SPIR Binary
All SPIR binaries -> Optimize, link -> Linked SPIR Binary -> Optimize, codegen -> Executable Kernels

• Backend consumes SPIR directly without transforming to vendor’s LLVM format

HOW TO IMPLEMENT SPIR LOADER
OVERALL STRUCTURE – ACTUAL CASE
User’s OpenCL™ Source -> compile -> Vendor’s LLVM Binary
User’s SPIR Binary -> SPIR loader -> Vendor’s LLVM Binary
Builtin Library Source -> compile -> Vendor’s LLVM Binary
All vendor LLVM binaries -> Optimize, link -> Vendor’s Linked Binary -> Optimize, codegen -> Executable Kernels

• Backend transforms SPIR to vendor’s LLVM format

WHY IS SPIR LOADER NEEDED
• The vendor uses different LLVM entities or formats to convey the information required by the OpenCL™ runtime for querying and executing kernels
• The vendor’s frontend does special transformations which are not done by the SPIR producer
• The vendor’s backend is shared by different frontends, some of which do not generate SPIR
• The vendor’s builtin library uses a different name mangling scheme

HOW TO IMPLEMENT SPIR LOADER
• Verify that the SPIR target triple and data layout are compatible with the target device
• Set the target triple for the target device
• Demangle builtin functions and re-mangle them using the vendor’s name mangling scheme
• Transform data types
• Transform metadata
• Transform calling conventions
• Perform the special transformations done by the frontend
‒ If possible, consider moving the transformations from frontend to backend
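
The first two loader steps can be sketched on textual LLVM IR. This is a simplification: a real loader would work on the bitcode through the LLVM API and would also check the data layout. The function names and the device triple used in the usage note are hypothetical, and the two SPIR triples are assumed to be the ones given in the SPIR 1.2 spec.

```python
import re

# Assumed SPIR 1.2 target triples and their pointer widths.
SPIR_TRIPLES = {"spir-unknown-unknown": 32, "spir64-unknown-unknown": 64}
_TRIPLE_RE = re.compile(r'^target triple = "([^"]*)"$', re.MULTILINE)


def spir_pointer_bits(ir_text):
    """Return 32 or 64 for a SPIR module, based on its target triple."""
    match = _TRIPLE_RE.search(ir_text)
    if match is None or match.group(1) not in SPIR_TRIPLES:
        raise ValueError("module does not declare a SPIR target triple")
    return SPIR_TRIPLES[match.group(1)]


def retarget(ir_text, device_address_bits, device_triple):
    """Verify bitness against the device, then set the device's triple."""
    bits = spir_pointer_bits(ir_text)
    if bits != device_address_bits:
        raise ValueError(
            f"{bits}-bit SPIR cannot run on a {device_address_bits}-bit device"
        )
    # Use a function as replacement so the triple is inserted literally.
    return _TRIPLE_RE.sub(
        lambda _m: f'target triple = "{device_triple}"', ir_text, count=1
    )
```

Calling retarget(ir, 64, "examplegpu-unknown-unknown") on a 64-bit SPIR module returns the same text with only the triple line changed; the remaining steps (re-mangling builtins, transforming types, metadata and calling conventions) would follow on the retargeted module.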

SPIR CONFORMANCE TEST
• SPIR is a Khronos extension
• To claim SPIR support, a vendor’s OpenCL™ implementation needs to pass the SPIR conformance test
• The SPIR 1.2 conformance test is going to be part of the OpenCL™ 1.2 conformance test

REFERENCES
• Khronos OpenCL™ Working Group, SPIR subgroup. SPIR provisional spec, version 1.2. http://www.khronos.org/files/opencl-spir-12-provisional.pdf
• LLVM Team. LLVM Bitcode File Format, version 3.2. http://www.llvm.org/releases/3.2/docs/BitCodeFormat.html, 2012.
• CodeSourcery, Compaq, EDG, HP, IBM, Intel, Red Hat, SGI, and others. Itanium C++ ABI. http://mentorembedded.github.com/cxx-abi/abi.html
• Khronos OpenCL™ Working Group. The OpenCL™ Specification, version 1.2. http://www.khronos.org/registry/cl/specs/opencl-1.2.pdf, November 2012.
• LLVM Team. LLVM Language Reference Manual, version 3.2. http://www.llvm.org/releases/3.2/docs/LangRef.html, 2012.

DISCLAIMER & ATTRIBUTION
The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap
changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers,
software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information.
However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to
notify any person of such revisions or changes.
AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY
INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD
BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION
CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

ATTRIBUTION
© 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices,
Inc. in the United States and/or other jurisdictions. OpenCL™ is a trademark of Apple Inc. Other names are for informational purposes only and may be
trademarks of their respective owners.
 
FIWARE Tech Summit - Stream Processing with Kurento Media Server
FIWARE Tech Summit - Stream Processing with Kurento Media ServerFIWARE Tech Summit - Stream Processing with Kurento Media Server
FIWARE Tech Summit - Stream Processing with Kurento Media ServerFIWARE
 
OpenCR tutorial_icra2017
OpenCR tutorial_icra2017 OpenCR tutorial_icra2017
OpenCR tutorial_icra2017 chcbaram
 

Similaire à PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compiler Developers, by Yaxun Liu (20)

HKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopHKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP Workshop
 
PHP QA Tools
PHP QA ToolsPHP QA Tools
PHP QA Tools
 
Cloud Native APIs: The API Operator for Kubernetes
Cloud Native APIs: The API Operator for KubernetesCloud Native APIs: The API Operator for Kubernetes
Cloud Native APIs: The API Operator for Kubernetes
 
OpenDataPlane - Bill Fischofer
OpenDataPlane - Bill FischoferOpenDataPlane - Bill Fischofer
OpenDataPlane - Bill Fischofer
 
Summit 16: ARM Mini-Summit - OpenDataPlane Monarch Release - Linaro
Summit 16: ARM Mini-Summit -   OpenDataPlane Monarch Release - LinaroSummit 16: ARM Mini-Summit -   OpenDataPlane Monarch Release - Linaro
Summit 16: ARM Mini-Summit - OpenDataPlane Monarch Release - Linaro
 
TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...
TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...
TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...
 
[Srijan Wednesday Webinar] How to Run Stateless and Stateful Services on K8S ...
[Srijan Wednesday Webinar] How to Run Stateless and Stateful Services on K8S ...[Srijan Wednesday Webinar] How to Run Stateless and Stateful Services on K8S ...
[Srijan Wednesday Webinar] How to Run Stateless and Stateful Services on K8S ...
 
OpenDDR and Jakarta MVC - JavaLand 2021
OpenDDR and Jakarta MVC - JavaLand 2021OpenDDR and Jakarta MVC - JavaLand 2021
OpenDDR and Jakarta MVC - JavaLand 2021
 
Build and deploy scientific Python Applications
Build and deploy scientific Python Applications  Build and deploy scientific Python Applications
Build and deploy scientific Python Applications
 
Princeton Dec 2022 Meetup_ NiFi + Flink + Pulsar
Princeton Dec 2022 Meetup_ NiFi + Flink + PulsarPrinceton Dec 2022 Meetup_ NiFi + Flink + Pulsar
Princeton Dec 2022 Meetup_ NiFi + Flink + Pulsar
 
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
 
HKG15-110: ODP Project Update
HKG15-110: ODP Project UpdateHKG15-110: ODP Project Update
HKG15-110: ODP Project Update
 
Using IO Visor to Secure Microservices Running on CloudFoundry [OpenStack Sum...
Using IO Visor to Secure Microservices Running on CloudFoundry [OpenStack Sum...Using IO Visor to Secure Microservices Running on CloudFoundry [OpenStack Sum...
Using IO Visor to Secure Microservices Running on CloudFoundry [OpenStack Sum...
 
OpenDDR
OpenDDROpenDDR
OpenDDR
 
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in A...
 
Presentation 4 rifidi emulator lab
Presentation 4 rifidi emulator labPresentation 4 rifidi emulator lab
Presentation 4 rifidi emulator lab
 
2022 APIsecure_Securing APIs with Open Standards
2022 APIsecure_Securing APIs with Open Standards2022 APIsecure_Securing APIs with Open Standards
2022 APIsecure_Securing APIs with Open Standards
 
TFI2014 Session II - Requirements for SDN - Brian Field
TFI2014 Session II - Requirements for SDN - Brian FieldTFI2014 Session II - Requirements for SDN - Brian Field
TFI2014 Session II - Requirements for SDN - Brian Field
 
FIWARE Tech Summit - Stream Processing with Kurento Media Server
FIWARE Tech Summit - Stream Processing with Kurento Media ServerFIWARE Tech Summit - Stream Processing with Kurento Media Server
FIWARE Tech Summit - Stream Processing with Kurento Media Server
 
OpenCR tutorial_icra2017
OpenCR tutorial_icra2017 OpenCR tutorial_icra2017
OpenCR tutorial_icra2017
 

Plus de AMD Developer Central

DX12 & Vulkan: Dawn of a New Generation of Graphics APIs
DX12 & Vulkan: Dawn of a New Generation of Graphics APIsDX12 & Vulkan: Dawn of a New Generation of Graphics APIs
DX12 & Vulkan: Dawn of a New Generation of Graphics APIsAMD Developer Central
 
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAn Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAMD Developer Central
 
Webinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop IntelligenceWebinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop IntelligenceAMD Developer Central
 
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...AMD Developer Central
 
TressFX The Fast and The Furry by Nicolas Thibieroz
TressFX The Fast and The Furry by Nicolas ThibierozTressFX The Fast and The Furry by Nicolas Thibieroz
TressFX The Fast and The Furry by Nicolas ThibierozAMD Developer Central
 
Rendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnellRendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnellAMD Developer Central
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonAMD Developer Central
 
Direct3D12 and the Future of Graphics APIs by Dave Oldcorn
Direct3D12 and the Future of Graphics APIs by Dave OldcornDirect3D12 and the Future of Graphics APIs by Dave Oldcorn
Direct3D12 and the Future of Graphics APIs by Dave OldcornAMD Developer Central
 
Introduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan NevraevIntroduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan NevraevAMD Developer Central
 
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth ThomasHoly smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth ThomasAMD Developer Central
 
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...AMD Developer Central
 
Productive OpenCL Programming An Introduction to OpenCL Libraries with Array...
Productive OpenCL Programming An Introduction to OpenCL Libraries  with Array...Productive OpenCL Programming An Introduction to OpenCL Libraries  with Array...
Productive OpenCL Programming An Introduction to OpenCL Libraries with Array...AMD Developer Central
 
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14AMD Developer Central
 
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14AMD Developer Central
 

Plus de AMD Developer Central (20)

DX12 & Vulkan: Dawn of a New Generation of Graphics APIs
DX12 & Vulkan: Dawn of a New Generation of Graphics APIsDX12 & Vulkan: Dawn of a New Generation of Graphics APIs
DX12 & Vulkan: Dawn of a New Generation of Graphics APIs
 
Introduction to Node.js
Introduction to Node.jsIntroduction to Node.js
Introduction to Node.js
 
Media SDK Webinar 2014
Media SDK Webinar 2014Media SDK Webinar 2014
Media SDK Webinar 2014
 
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAn Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
 
DirectGMA on AMD’S FirePro™ GPUS
DirectGMA on AMD’S  FirePro™ GPUSDirectGMA on AMD’S  FirePro™ GPUS
DirectGMA on AMD’S FirePro™ GPUS
 
Webinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop IntelligenceWebinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop Intelligence
 
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
 
Inside XBox- One, by Martin Fuller
Inside XBox- One, by Martin FullerInside XBox- One, by Martin Fuller
Inside XBox- One, by Martin Fuller
 
TressFX The Fast and The Furry by Nicolas Thibieroz
TressFX The Fast and The Furry by Nicolas ThibierozTressFX The Fast and The Furry by Nicolas Thibieroz
TressFX The Fast and The Furry by Nicolas Thibieroz
 
Rendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnellRendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnell
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
 
Gcn performance ftw by stephan hodes
Gcn performance ftw by stephan hodesGcn performance ftw by stephan hodes
Gcn performance ftw by stephan hodes
 
Inside XBOX ONE by Martin Fuller
Inside XBOX ONE by Martin FullerInside XBOX ONE by Martin Fuller
Inside XBOX ONE by Martin Fuller
 
Direct3D12 and the Future of Graphics APIs by Dave Oldcorn
Direct3D12 and the Future of Graphics APIs by Dave OldcornDirect3D12 and the Future of Graphics APIs by Dave Oldcorn
Direct3D12 and the Future of Graphics APIs by Dave Oldcorn
 
Introduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan NevraevIntroduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan Nevraev
 
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth ThomasHoly smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
 
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...
 
Productive OpenCL Programming An Introduction to OpenCL Libraries with Array...
Productive OpenCL Programming An Introduction to OpenCL Libraries  with Array...Productive OpenCL Programming An Introduction to OpenCL Libraries  with Array...
Productive OpenCL Programming An Introduction to OpenCL Libraries with Array...
 
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14
 
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
 

Dernier

Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 

Dernier (20)

Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 

PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compiler Developers, by Yaxun Liu

  • 1. Introduction to SPIR for Application and Compiler Developers
      Yaxun Sam Liu
  • 2. OUTLINE
      • What is SPIR and why it is useful
        ‒ Why do we need SPIR since we already have LLVM IR
      • SPIR for Application Developers
        ‒ How to generate SPIR
        ‒ How to load SPIR
        ‒ Portability considerations using SPIR
      • SPIR for Compiler Developers
        ‒ Introduction to SPIR spec
        ‒ How to implement a SPIR loader
      • References
  • 3. WHAT IS SPIR
      • A binary format
        ‒ SPIR means Standard Portable Intermediate Representation
        ‒ A portable binary format for OpenCL™ programs
        ‒ Defined by the SPIR spec
        ‒ Based on LLVM IR
        ‒ Supports most OpenCL™ core features
        ‒ Current version is 1.2, corresponding to OpenCL™ 1.2
        ‒ Developed by the SPIR subgroup of the Khronos Group OpenCL™ working group
        ‒ A SPIR binary is bitness-aware, meaning:
          ‒ The pointer size in a SPIR binary is either 32 bit or 64 bit, depending on the target devices
          ‒ Two sets of SPIR binaries are needed to ship a product in SPIR to both 32 and 64 bit devices
      • An extension for OpenCL™
        ‒ Defined by the SPIR host API
        ‒ Denoted by cl_khr_spir
        ‒ OpenCL™ devices supporting cl_khr_spir are able to load a SPIR binary and run it
  • 4. WHY IS SPIR USEFUL
      • Why is SPIR useful
        ‒ For game/application developers
          ‒ Can ship an OpenCL™ program as a binary instead of source code
          ‒ Can ship just a few binaries for one OpenCL™ program instead of a separate binary for every platform/device
        ‒ For compiler developers
          ‒ Can compile other programming languages to SPIR, which can then run on OpenCL™ devices
  • 5. HOW TO GENERATE SPIR
      • SPIR generation is optional for devices supporting cl_khr_spir
        ‒ A device supporting cl_khr_spir is only required to be able to consume SPIR
        ‒ Whether to support SPIR generation is the vendor's choice
      • Generating SPIR in the host program
        ‒ The SPIR spec and host API do not define how to generate SPIR
        ‒ If SPIR generation is supported, it is likely to be done as follows:
          ‒ Load OpenCL™ source code by clCreateProgramWithSource
          ‒ Compile OpenCL™ source code by clCompileProgram with a vendor-specific option for generating SPIR
          ‒ Get the SPIR binary by clGetProgramInfo with CL_PROGRAM_BINARIES
          ‒ Save the SPIR binary to a file
      • Generating SPIR with an offline compiler
        ‒ Clang 3.3/3.4 can compile OpenCL™ source code to SPIR-like LLVM bitcode
        ‒ A patch for Clang 3.2 that can compile OpenCL™ source code to SPIR 1.2 is available to Khronos members
        ‒ Clang options for generating SPIR: -cl-std=CL1.2 -emit-llvm -triple spir[32|64]-unknown-unknown
  • 6. HOW TO LOAD SPIR: LOAD A SINGLE SPIR BINARY
      [Flow diagram] SPIR Binary -> clCreateProgramWithBinary -> cl_program -> clBuildProgram -> cl_program -> clCreateKernel -> cl_kernel
  • 7. HOW TO LOAD SPIR: MULTIPLE SPIR BINARIES, OPENCL™ SOURCE CODES AND VENDOR-SPECIFIC BINARIES
      [Flow diagram]
      OpenCL™ Source -> clCreateProgramWithSource -> cl_program -> clCompileProgram -> cl_program
      SPIR Binary -> clCreateProgramWithBinary -> cl_program -> clCompileProgram -> cl_program
      Vendor-specific Binary -> clCreateProgramWithBinary -> cl_program
      All resulting cl_program objects -> clLinkProgram -> cl_program -> clCreateKernel -> cl_kernel
  • 8. PORTABILITY CONSIDERATIONS USING SPIR
      • Check whether a device supports SPIR
        ‒ Get all supported extensions by clGetDeviceInfo with CL_DEVICE_EXTENSIONS
        ‒ Check whether cl_khr_spir is included
      • Supporting both 32 and 64 bit devices
        ‒ Two sets of SPIR binaries are needed, one for 32 bit devices and one for 64 bit devices
        ‒ Check the bitness of a device by clGetDeviceInfo with CL_DEVICE_ADDRESS_BITS
        ‒ Load the 32 or 64 bit SPIR binaries accordingly
      • Supporting optional extensions
        ‒ Get all supported extensions by clGetDeviceInfo with CL_DEVICE_EXTENSIONS
        ‒ Check whether the required extension is supported
        ‒ If yes, load the SPIR binary
        ‒ If no, either fall back to a SPIR binary or OpenCL™ source that uses only core features, or fail gracefully
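The checks on this slide can be sketched in plain C once the host program has the values returned by clGetDeviceInfo. This is a minimal sketch, not the presenter's code: the helper names `has_extension` and `pick_spir_binary` and the two file names are hypothetical, and the extension string is assumed to be the space-separated list reported for CL_DEVICE_EXTENSIONS.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `name` occurs as a whole space-separated token in
 * `ext_list` (the string reported for CL_DEVICE_EXTENSIONS), else 0.
 * A plain substring search is not enough: "cl_khr_spir" must not
 * match a longer name that merely starts with it. */
int has_extension(const char *ext_list, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = ext_list;

    if (name_len == 0)
        return 0;

    while ((p = strstr(p, name)) != NULL) {
        int starts_token = (p == ext_list) || (p[-1] == ' ');
        int ends_token = (p[name_len] == '\0') || (p[name_len] == ' ');
        if (starts_token && ends_token)
            return 1;
        p++; /* keep searching after this partial match */
    }
    return 0;
}

/* Chooses which SPIR binary to load from the value reported for
 * CL_DEVICE_ADDRESS_BITS. The file names are hypothetical. Returns
 * NULL when no matching binary was shipped, so the caller can fall
 * back to OpenCL source or fail gracefully. */
const char *pick_spir_binary(unsigned address_bits)
{
    switch (address_bits) {
    case 32: return "kernels_spir32.bc";
    case 64: return "kernels_spir64.bc";
    default: return NULL;
    }
}
```

A host program would typically call `has_extension(extensions, "cl_khr_spir")` first and only then call `pick_spir_binary` with the device's address bits.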
  • 9. PORTABILITY CONSIDERATIONS USING SPIR
      • SPIR binaries generated from non-portable OpenCL™ source are not portable
        ‒ That is, source that does not follow the restrictions specified in OpenCL™ spec 1.2, section 6.9, for example:
          ‒ Casting a pointer from one address space to a different address space
          ‒ Casting an OpenCL™ opaque structure to a different type
          ‒ Performing arithmetic operations or comparisons on a sampler
          ‒ Performing sizeof on OpenCL™ opaque structures
          ‒ etc.
  • 10. SPIR FOR COMPILER DEVELOPERS
      • Introduction to SPIR spec
        ‒ Relation between SPIR 1.2 and LLVM 3.2
        ‒ Mapping of OpenCL™ to SPIR
          ‒ Data types
          ‒ Enumeration values
          ‒ Calling conventions
          ‒ Address spaces
          ‒ Name mangling
          ‒ Used extensions
          ‒ Kernel argument info
      • How to implement a SPIR loader
        ‒ Overall structure
        ‒ Transforming data types
        ‒ Transforming metadata
        ‒ Demangling and mapping builtin function names
  • 11. RELATION BETWEEN SPIR AND LLVM BITCODE
      • A SPIR binary is a subset of LLVM bitcode
        ‒ A valid SPIR 1.2 binary is valid LLVM 3.2 bitcode
        ‒ SPIR is defined by mapping OpenCL™ C entities to LLVM and by imposing restrictions on the LLVM 3.2 bitcode format
          ‒ Specific target triple and data layout for 32 and 64 bit devices
          ‒ Specific ABI
          ‒ Specific calling conventions
          ‒ Restrictions on allowed instructions, intrinsic functions, linkage types, parameter attributes, visibility styles, function attributes, etc.
      • The ideas behind SPIR
        ‒ To be expressive enough to represent OpenCL™ C programs
        ‒ To carry enough information for the OpenCL™ runtime to execute and query the kernels
        ‒ Not to introduce unnecessary entities
          ‒ This may limit SPIR's expressiveness for other languages, but it makes a SPIR loader easier to develop
        ‒ To balance the burden between SPIR producer and loader
  • 12. MAPPING OF OPENCL™ TO SPIR: DATA TYPES
      • OpenCL™ builtin scalar types are mapped to LLVM primitive types
        ‒ bool -> i1
        ‒ char -> i8
        ‒ unsigned char, uchar -> i8
        ‒ short -> i16
        ‒ unsigned short, ushort -> i16
        ‒ int -> i32
        ‒ unsigned int, uint -> i32
        ‒ long -> i64
        ‒ unsigned long, ulong -> i64
        ‒ float -> float
        ‒ double -> double
        ‒ half -> half
        ‒ void -> void
      • OpenCL™ builtin vector types are mapped to LLVM vector types
        ‒ charn -> <n x i8>
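The scalar and vector mappings listed on this slide can be expressed as a small lookup table. The following is an illustrative sketch only: the names `spir_scalar_type` and `spir_vector_type` are hypothetical, and the accepted vector widths (2, 3, 4, 8, 16) are the ones OpenCL C defines rather than something stated on the slide.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct type_map {
    const char *opencl; /* OpenCL C type name */
    const char *llvm;   /* LLVM type it maps to in SPIR */
};

/* Table copied from the slide: signed and unsigned integer types of
 * the same width share one LLVM type. */
static const struct type_map scalar_types[] = {
    {"bool", "i1"},
    {"char", "i8"},   {"unsigned char", "i8"},   {"uchar", "i8"},
    {"short", "i16"}, {"unsigned short", "i16"}, {"ushort", "i16"},
    {"int", "i32"},   {"unsigned int", "i32"},   {"uint", "i32"},
    {"long", "i64"},  {"unsigned long", "i64"},  {"ulong", "i64"},
    {"float", "float"}, {"double", "double"}, {"half", "half"},
    {"void", "void"},
};

/* Returns the LLVM type for an OpenCL scalar type, or NULL if unknown. */
const char *spir_scalar_type(const char *opencl_name)
{
    size_t i;
    for (i = 0; i < sizeof scalar_types / sizeof scalar_types[0]; i++) {
        if (strcmp(scalar_types[i].opencl, opencl_name) == 0)
            return scalar_types[i].llvm;
    }
    return NULL;
}

/* Writes the LLVM vector type "<n x elem>" for an OpenCL vector type
 * such as char4 (element "char", n = 4). Returns 0 on success, -1 on
 * an unknown element type, an invalid width, or a too-small buffer. */
int spir_vector_type(const char *opencl_elem, unsigned n,
                     char *out, size_t out_size)
{
    const char *elem = spir_scalar_type(opencl_elem);
    int written;

    if (elem == NULL)
        return -1;
    if (n != 2 && n != 3 && n != 4 && n != 8 && n != 16)
        return -1;

    written = snprintf(out, out_size, "<%u x %s>", n, elem);
    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```

Because `uint` and `int` both resolve to `i32`, a loader built on such a table has to recover signedness from elsewhere, as a later slide discusses.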
  • 13. MAPPING OF OPENCL™ TO SPIR: DATA TYPES
    ‒ Image and event types are mapped to LLVM opaque structure types
      ‒ image1d_t -> %opencl.image1d_t
      ‒ image1d_array_t -> %opencl.image1d_array_t
      ‒ image1d_buffer_t -> %opencl.image1d_buffer_t
      ‒ image2d_t -> %opencl.image2d_t
      ‒ image2d_array_t -> %opencl.image2d_array_t
      ‒ image3d_t -> %opencl.image3d_t
      ‒ image2d_msaa_t -> %opencl.image2d_msaa_t
      ‒ image2d_array_msaa_t -> %opencl.image2d_array_msaa_t
      ‒ image2d_msaa_depth_t -> %opencl.image2d_msaa_depth_t
      ‒ image2d_array_msaa_depth_t -> %opencl.image2d_array_msaa_depth_t
      ‒ image2d_depth_t -> %opencl.image2d_depth_t
      ‒ image2d_array_depth_t -> %opencl.image2d_array_depth_t
      ‒ event_t -> %opencl.event_t
  • 14. MAPPING OF OPENCL™ TO SPIR: DATA TYPES
    ‒ The sampler type is mapped to the LLVM i32 type
      ‒ Although a sampler is represented by an integer in SPIR, arithmetic operations on it and comparisons with other values are not allowed
    ‒ size_t, ptrdiff_t, intptr_t and uintptr_t are mapped to LLVM i32 or i64, depending on the bitness of the SPIR binary
    ‒ Signedness of integer types
      ‒ LLVM does not have unsigned integer types
      ‒ OpenCL™ unsigned and signed integer types of the same bit width are mapped to the same type in SPIR
      ‒ If the signedness of an integer type is needed, it can usually be obtained from
        ‒ Mangled function names
        ‒ Sign extension of function arguments and return type
        ‒ Kernel argument metadata
  • 15. MAPPING OF OPENCL™ TO SPIR: CALLING CONVENTIONS
    ‒ SPIR uses the calling convention to indicate whether a function is a kernel function
      ‒ Kernel functions use the spir_kernel calling convention
      ‒ Non-kernel functions use the spir_func calling convention
      ‒ No other calling conventions are allowed in SPIR
  • 16. MAPPING OF OPENCL™ TO SPIR: ADDRESS SPACES
    ‒ OpenCL™ C address spaces are mapped to LLVM address spaces
      ‒ Private -> 0
      ‒ Global -> 1
      ‒ Constant -> 2
      ‒ Local -> 3
    ‒ Casting a pointer to a different address space is not allowed
    ‒ OpenCL™ C function-level local variables are mapped to LLVM module-scope global variables
      ‒ The variable name is mapped as <function name>.<variable name>
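A minimal sketch of the address-space rules on slide 16, assuming LLVM 3.2 typed-pointer syntax. The helper names `pointer_type` and `local_variable_name` are hypothetical and exist only for this illustration.

```python
# Address-space numbers from slide 16.
ADDRESS_SPACES = {"private": 0, "global": 1, "constant": 2, "local": 3}


def pointer_type(pointee: str, address_space: str) -> str:
    """Hypothetical helper: LLVM 3.2 pointer type for a pointee in an OpenCL address space."""
    n = ADDRESS_SPACES[address_space]
    # LLVM prints no addrspace qualifier for the default address space 0.
    return f"{pointee}*" if n == 0 else f"{pointee} addrspace({n})*"


def local_variable_name(function_name: str, variable_name: str) -> str:
    """Module-scope name of a function-level local variable: <function name>.<variable name>."""
    return f"@{function_name}.{variable_name}"
```

With these helpers, a `global float *` kernel argument becomes `float addrspace(1)*`, and a `local` variable `tmp` declared inside function `foo` becomes the module-scope global `@foo.tmp`.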
  • 17. MAPPING OF OPENCL™ TO SPIR: ENUMERATION VALUES
    ‒ SPIR defines the enumeration values used by OpenCL™ C programs
      ‒ Image channel order -> same as cl.h
      ‒ Image data type -> same as cl.h
      ‒ Sampler enumeration values (based on cl.h but not exactly the same)
        ‒ Addressing mode
          ‒ CLK_ADDRESS_NONE = 0x0000
          ‒ CLK_ADDRESS_CLAMP_TO_EDGE = 0x0002
          ‒ CLK_ADDRESS_CLAMP = 0x0004
          ‒ CLK_ADDRESS_REPEAT = 0x0006
          ‒ CLK_ADDRESS_MIRRORED_REPEAT = 0x0008
        ‒ Normalized coords
          ‒ CLK_NORMALIZED_COORDS_FALSE = 0x0000
          ‒ CLK_NORMALIZED_COORDS_TRUE = 0x0001
        ‒ Filter mode
          ‒ CLK_FILTER_NEAREST = 0x0010
          ‒ CLK_FILTER_LINEAR = 0x0020
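Since a sampler is a single i32 in SPIR, the three groups of values on slide 17 combine with bitwise OR, as they do in OpenCL C source. The sketch below packs and unpacks such a value. The bit masks are an assumption derived here from the listed constants (they do not appear on the slide), and the function names are hypothetical.

```python
# Enumeration values copied from slide 17.
ADDRESSING = {
    "CLK_ADDRESS_NONE": 0x0000,
    "CLK_ADDRESS_CLAMP_TO_EDGE": 0x0002,
    "CLK_ADDRESS_CLAMP": 0x0004,
    "CLK_ADDRESS_REPEAT": 0x0006,
    "CLK_ADDRESS_MIRRORED_REPEAT": 0x0008,
}
NORMALIZED = {"CLK_NORMALIZED_COORDS_FALSE": 0x0000, "CLK_NORMALIZED_COORDS_TRUE": 0x0001}
FILTER = {"CLK_FILTER_NEAREST": 0x0010, "CLK_FILTER_LINEAR": 0x0020}

# Assumed masks: the smallest bit ranges that cover each group of values above.
ADDRESSING_MASK, NORMALIZED_MASK, FILTER_MASK = 0x000E, 0x0001, 0x0030


def make_sampler(addressing: str, normalized: str, filter_mode: str) -> int:
    """Pack three sampler properties into the i32 value used by SPIR."""
    return ADDRESSING[addressing] | NORMALIZED[normalized] | FILTER[filter_mode]


def decode_sampler(value: int) -> tuple:
    """Recover the three property names from a packed sampler value."""
    def lookup(table, bits):
        return next(name for name, v in table.items() if v == bits)
    return (
        lookup(ADDRESSING, value & ADDRESSING_MASK),
        lookup(NORMALIZED, value & NORMALIZED_MASK),
        lookup(FILTER, value & FILTER_MASK),
    )
```

A loader whose backend uses the cl.h-style values internally would decode the SPIR value this way and re-encode it with its own constants.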
  • 18. MAPPING OF OPENCL™ TO SPIR: NAME MANGLING
    ‒ OpenCL™ C builtin functions are mangled
    ‒ OpenCL™ C kernel functions and non-kernel user functions are not mangled
    ‒ Other languages may choose to mangle non-kernel user functions
    ‒ SPIR adopts the name mangling scheme of Itanium C++ ABI section 5.1, with extended rules for OpenCL™ C data types, address spaces and access qualifiers
      ‒ Unsigned and signed integer types of the same bit width are mangled to different names
      ‒ Pointers to non-private address space N -> PU3ASN<mangled element type>
      ‒ Vector type of N elements -> DvN_<mangled element type>
      ‒ OpenCL™ C opaque types (image, sampler, event) -> <string length>ocl_<type name>, e.g.
        ‒ sampler_t -> 11ocl_sampler
      ‒ Access qualifiers: read only -> U1R, write only -> U1W, read write -> U1B
      ‒ size_t and uintptr_t are treated as uint or ulong
      ‒ ptrdiff_t and intptr_t are treated as int or long
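The mangling rules on slide 18 can be illustrated with a deliberately incomplete sketch. It covers scalar, vector, pointer and opaque types only. It omits access qualifiers, cv-qualifiers and the Itanium substitution rule (`S_`, `S0_`, ...), and it refuses any signature where a substitution would be required rather than emit a wrong name. The type encoding (strings and `(kind, n, element)` tuples) and the function names are assumptions made for this example.

```python
# Itanium builtin type codes for the OpenCL scalar types.
SCALAR_CODES = {
    "bool": "b", "char": "c", "uchar": "h", "short": "s", "ushort": "t",
    "int": "i", "uint": "j", "long": "l", "ulong": "m",
    "float": "f", "double": "d", "half": "Dh", "void": "v",
}
OPAQUE_TYPES = {"sampler_t", "event_t", "image1d_t", "image2d_t", "image3d_t"}


def mangle_type(t, bits, seen):
    """Mangle one parameter type; `seen` tracks compound types already emitted."""
    if isinstance(t, str):
        # Pointer-sized integer types follow the bitness of the SPIR binary.
        if t in ("size_t", "uintptr_t"):
            t = "uint" if bits == 32 else "ulong"
        elif t in ("ptrdiff_t", "intptr_t"):
            t = "int" if bits == 32 else "long"
        if t in SCALAR_CODES:
            return SCALAR_CODES[t]  # builtin codes are never substituted
        if t not in OPAQUE_TYPES:
            raise ValueError(f"unsupported type: {t!r}")
        name = "ocl_" + t[:-2]  # sampler_t -> ocl_sampler
        code = f"{len(name)}{name}"
    else:
        kind, n, element = t
        inner = mangle_type(element, bits, seen)
        if kind == "vector":
            code = f"Dv{n}_{inner}"
        elif kind == "pointer":  # n is the address-space number; 0 is private
            code = f"P{inner}" if n == 0 else f"PU3AS{n}{inner}"
        else:
            raise ValueError(f"unsupported kind: {kind!r}")
    if code in seen:
        raise NotImplementedError("repeated compound type needs Itanium substitution")
    seen.add(code)
    return code


def mangle(name, params, bits=32):
    """Mangle a builtin function name for a 32-bit or 64-bit SPIR binary."""
    seen = set()
    encoded = "".join(mangle_type(p, bits, seen) for p in params) or "v"
    return f"_Z{len(name)}{name}{encoded}"
```

For instance, `mangle("get_global_id", ["uint"])` yields `_Z13get_global_idj`, while `mangle("dot", [("vector", 4, "float")] * 2)` raises `NotImplementedError`, because the real scheme encodes the second `float4` as a substitution.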
  • 19. MAPPING OF OPENCL™ TO SPIR: USE OF OPTIONAL CORE FEATURES AND EXTENSIONS
    ‒ SPIR contains information about the optional features and extensions it uses
      ‒ The runtime can reject SPIR binaries that use unsupported optional features
      ‒ The application can select SPIR binaries based on the optional features and extensions they use
    ‒ Metadata for used core features
      ‒ opencl.used.optional.core.features
      ‒ Two core features are allowed:
        ‒ cl_image: indicates images are used
        ‒ cl_double: indicates doubles are used
    ‒ Metadata for used extensions
      ‒ opencl.used.extensions
        ‒ cl_khr_int64_base_atomics
        ‒ cl_khr_int64_extended_atomics
        ‒ cl_khr_fp16
        ‒ cl_khr_gl_sharing
        ‒ cl_khr_gl_event
        ‒ etc.
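How a runtime or application might act on this metadata can be sketched as a set comparison. This assumes the feature and extension names have already been read out of the SPIR module's metadata, and that the device's extensions come as the space-separated string returned by the OpenCL `CL_DEVICE_EXTENSIONS` query. The function names and the way device core features are represented are hypothetical.

```python
def unsupported_requirements(used_core_features, used_extensions,
                             device_core_features, device_extension_string):
    """Return the features and extensions a SPIR binary uses that the device lacks."""
    # CL_DEVICE_EXTENSIONS is a space-separated list of extension names.
    device_extensions = set(device_extension_string.split())
    missing = [f for f in used_core_features if f not in device_core_features]
    missing += [e for e in used_extensions if e not in device_extensions]
    return missing


def can_load(used_core_features, used_extensions,
             device_core_features, device_extension_string):
    """True if the device supports everything the SPIR binary declares it uses."""
    return not unsupported_requirements(
        used_core_features, used_extensions,
        device_core_features, device_extension_string,
    )
```

An application shipping several SPIR binaries could run the same check over each candidate and pick the first one for which `can_load` returns true.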
  • 20. MAPPING OF OPENCL™ TO SPIR: KERNEL ATTRIBUTES
    ‒ SPIR contains information about optional kernel attributes
      ‒ reqd_work_group_size
      ‒ work_group_size_hint
      ‒ vec_type_hint
    ‒ For each kernel, there is a metadata node carrying its optional kernel attributes
      ‒ !opencl.kernels = !{!0, !1, ..., !N}
        ‒ !0 = metadata !{ <function signature>, !01, !02, ..., !0i }
        ‒ !1 = metadata !{ <function signature>, !11, !12, ..., !1j }
        ‒ ...
        ‒ !N = metadata !{ <function signature>, !N1, !N2, ..., !Nk }
  • 21. MAPPING OF OPENCL™ TO SPIR: KERNEL ARGUMENT INFO
    ‒ SPIR contains the kernel argument information required by the OpenCL™ runtime for executing kernels
    ‒ For each kernel argument, there is metadata
      ‒ kernel_arg_addr_space
      ‒ kernel_arg_access_qual
      ‒ kernel_arg_type
      ‒ kernel_arg_base_type
      ‒ kernel_arg_type_qual
      ‒ kernel_arg_name: optional, only exists if -cl-kernel-arg-info is used when producing SPIR
  • 22. SPIR ABI
    ‒ SPIR uses the default ABI of Clang 3.2
      ‒ Any aggregate type is passed as a pointer. Memory allocation (if needed) is the responsibility of the caller.
      ‒ Enumeration types are handled as the underlying integer type.
      ‒ If the argument type is a promotable integer type, it is extended according to the C99 integer promotion rules.
      ‒ Any other type, including floating point types, vectors, etc., is passed directly as the corresponding LLVM type.
  • 23. HOW TO IMPLEMENT SPIR LOADER: OVERALL STRUCTURE – IDEAL CASE
    [Diagram: the user's OpenCL™ source and the builtin library source are each compiled to a SPIR binary; together with the user's SPIR binary they go through "optimize, link" into a linked SPIR binary, then through "optimize, codegen" into executable kernels.]
    ‒ The backend consumes SPIR directly, without transforming it to the vendor's LLVM format
  • 24. HOW TO IMPLEMENT SPIR LOADER: OVERALL STRUCTURE – ACTUAL CASE
    [Diagram: the user's OpenCL™ source and the builtin library source are each compiled to a vendor's LLVM binary, and the user's SPIR binary passes through the SPIR loader to become a vendor's LLVM binary; the three go through "optimize, link" into the vendor's linked binary, then through "optimize, codegen" into executable kernels.]
    ‒ The backend transforms SPIR to the vendor's LLVM format
  • 25. WHY IS SPIR LOADER NEEDED
    ‒ The vendor uses different LLVM entities or formats to convey the information required by the OpenCL™ runtime for querying and executing kernels
    ‒ The vendor's frontend does special transformations which are not done by the SPIR producer
    ‒ The vendor's backend is shared by different frontends, some of which do not generate SPIR
    ‒ The vendor's builtin library uses a different name mangling scheme
  • 26. HOW TO IMPLEMENT SPIR LOADER
    ‒ Verify that the SPIR target triple and data layout are compatible with the target device
    ‒ Set the target triple for the target device
    ‒ Demangle builtin functions and re-mangle them using the vendor's name mangling scheme
    ‒ Transform data types
    ‒ Transform metadata
    ‒ Transform calling conventions
    ‒ Perform the special transformations done by the frontend
      ‒ If possible, consider moving these transformations from the frontend to the backend
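The first loader step, checking the target triple against the device, can be sketched as follows. The two triples are the ones SPIR uses for 32-bit and 64-bit binaries. The function names are hypothetical, the device bitness is assumed to come from the OpenCL `CL_DEVICE_ADDRESS_BITS` query, and the data layout check is left out of this sketch.

```python
# SPIR target triples and the pointer size each one implies.
SPIR_TRIPLES = {
    "spir-unknown-unknown": 32,
    "spir64-unknown-unknown": 64,
}


def spir_bitness(triple: str) -> int:
    """Pointer size in bits implied by a SPIR target triple."""
    try:
        return SPIR_TRIPLES[triple]
    except KeyError:
        raise ValueError(f"not a SPIR target triple: {triple!r}") from None


def is_compatible(triple: str, device_address_bits: int) -> bool:
    """True if a module with this triple can be loaded for the device."""
    return spir_bitness(triple) == device_address_bits
```

A loader would run this check before any transformation and reject the binary early, for example when a 64-bit SPIR binary is handed to a device reporting 32 address bits.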
  • 27. SPIR CONFORMANCE TEST
    ‒ SPIR is a Khronos extension
    ‒ To claim SPIR support, a vendor's OpenCL™ implementation needs to pass the SPIR conformance test
    ‒ The SPIR 1.2 conformance test is going to be part of the OpenCL™ 1.2 conformance test
  • 28. REFERENCES
    ‒ Khronos OpenCL™ Working Group, SPIR subgroup. SPIR provisional spec, version 1.2. http://www.khronos.org/files/opencl-spir-12-provisional.pdf
    ‒ LLVM Team. LLVM Bitcode File Format, version 3.2. http://www.llvm.org/releases/3.2/docs/BitCodeFormat.html, 2012.
    ‒ CodeSourcery, Compaq, EDG, HP, IBM, Intel, Red Hat, SGI, and others. Itanium C++ ABI. http://mentorembedded.github.com/cxx-abi/abi.html
    ‒ Khronos OpenCL™ Working Group. The OpenCL™ Specification, version 1.2. http://www.khronos.org/registry/cl/specs/opencl-1.2.pdf, November 2012.
    ‒ LLVM Team. LLVM Language Reference Manual, version 3.2. http://www.llvm.org/releases/3.2/docs/LangRef.html, 2012.
  • 29. DISCLAIMER & ATTRIBUTION
    The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.
    AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
    ATTRIBUTION
    © 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. OpenCL™ is a trademark of Apple Inc. Other names are for informational purposes only and may be trademarks of their respective owners.