SlideShare une entreprise Scribd logo
1  sur  7
Télécharger pour lire hors ligne
scikit-cuda
A Python Interface to GPU-Powered Libraries
Lev Givon
https://www.columbia.edu/~lev
Bionet Group
Department of Electrical Engineering
Columbia University
September 24, 2015
Lev Givon scikit-cuda
Motivation
NVIDIA CUDA contains a (growing) range of great GPU-based
libraries (CUBLAS, CUFFT, CURAND, CUSPARSE, CUSOLVER,
NPP) in addition to the core driver and runtime libraries.
Other libraries (both commercial and free) also available:
CULA - http://www.culatools.com/
MAGMA - http://icl.cs.utk.edu/magma/
PyCUDA (http://mathema.tician.de/software/pycuda/)
enables Python programmers to take advantage of CUDA, but
provides few interfaces to the above libraries.
Lev Givon scikit-cuda
Package Description
Exposes contents of various GPU-based libraries via both low-level
and high-level Python functions.
Project origin: using CULA’s GPU-based SVD routines to accelerate
time decoding algorithms implemented in Python. 1
Low-level functions
wrap C functions using ctypes
(https://docs.python.org/2/library/ctypes.html);
replicate C function signatures (with a few slight differences);
catch errors and map error codes to Python exceptions.
High-level functions
provide a similar interface to functions in scipy (http://scipy.org);
use PyCUDA’s GPUArray class to encapsulate matrices/vectors in
GPU memory.
1http://www.bionet.ee.columbia.edu/publications
Lev Givon scikit-cuda
Example
1 import pycuda.gpuarray as gpuarray
2 import pycuda.autoinit
3 import numpy as np
4 import skcuda.linalg as linalg
5
6 linalg.init()
7 a = np.random.randn(9, 6) + 1j*np.random.randn(9, 6)
8 a = np.asarray(a, np.complex64)
9 a_gpu = gpuarray.to_gpu(a)
10 u_gpu, s_gpu, vh_gpu = linalg.svd(a_gpu, ’S’, ’S’)
11 np.allclose(a, np.dot(u_gpu.get(), np.dot(np.diag(s_gpu.get()),
12 vh_gpu.get())), 1e-4)
Lev Givon scikit-cuda
Available High-Level Functions
FFT/IFFT
Numerical integration (trapezoidal 1D and 2D)
Linear algebra (dot, eig, qr, svd, transpose, etc.)
Randomized linear algebra (rdmd, rsvd)
Special element-wise functions (expi, sici, etc.)
Numpy-like helper routines not provided by PyCUDA (cumsum,
zeros, etc.)
Lev Givon scikit-cuda
Areas for Future Improvement
Improve maintainability and coverage of exposed functions in
supported libraries by automatically generating ctypes wrappers by
parsing C header files.
Alternately, use cffi (https://bitbucket.org/cffi/cffi) to
generate wrappers.
Provide full support for functions in MAGMA (possibly combined
with a Python-based MAGMA library installer) to enable folks
without access to CULA to run GPU-based LPAPACK.
Enable selection of library backend for functions that are provided by
more than one library (e.g., CULA, MAGMA).
Less naive automated block/grid selection mechanism for functions
that require it.
Lev Givon scikit-cuda
Resources
Code: https://github.com/lebedov/scikit-cuda
Docs: https://scikit-cuda.rtfd.org
Thanks to our long (and growing) list of contributors!
http://scikit-cuda.readthedocs.org/en/latest/authors.html
Lev Givon scikit-cuda

Contenu connexe

Tendances

Daniel Sikar: Hadoop MapReduce - 06/09/2010
Daniel Sikar: Hadoop MapReduce - 06/09/2010 Daniel Sikar: Hadoop MapReduce - 06/09/2010
Daniel Sikar: Hadoop MapReduce - 06/09/2010 Skills Matter
 
Nodester Architecture overview & roadmap
Nodester Architecture overview & roadmapNodester Architecture overview & roadmap
Nodester Architecture overview & roadmapcmatthieu
 
nodester Architecture overview & roadmap
nodester Architecture overview & roadmapnodester Architecture overview & roadmap
nodester Architecture overview & roadmapwearefractal
 
Parallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDAParallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDAprithan
 
Parallel K means clustering using CUDA
Parallel K means clustering using CUDAParallel K means clustering using CUDA
Parallel K means clustering using CUDAprithan
 
Nvidia in bioinformatics
Nvidia in bioinformaticsNvidia in bioinformatics
Nvidia in bioinformaticsShanker Trivedi
 
Gpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cudaGpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cudaFerdinand Jamitzky
 
Report on GPGPU at FCA (Lyon, France, 11-15 October, 2010)
Report on GPGPU at FCA  (Lyon, France, 11-15 October, 2010)Report on GPGPU at FCA  (Lyon, France, 11-15 October, 2010)
Report on GPGPU at FCA (Lyon, France, 11-15 October, 2010)PhtRaveller
 
Código para Latch físico: Touch_calibrate.py
Código para Latch físico: Touch_calibrate.pyCódigo para Latch físico: Touch_calibrate.py
Código para Latch físico: Touch_calibrate.pyChema Alonso
 
Home sensor prototype on Arduino & Raspberry Pi with Node.JS
Home sensor prototype on Arduino & Raspberry Pi with Node.JSHome sensor prototype on Arduino & Raspberry Pi with Node.JS
Home sensor prototype on Arduino & Raspberry Pi with Node.JSHyunghun Cho
 
How to burn your GPU with CUDA9.1
How to burn your GPU with CUDA9.1How to burn your GPU with CUDA9.1
How to burn your GPU with CUDA9.1Naoto MATSUMOTO
 
S1170143 2
S1170143 2S1170143 2
S1170143 2s1170143
 
Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.
Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.
Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.Hakky St
 
Gstreamer Basics
Gstreamer BasicsGstreamer Basics
Gstreamer Basicsidrajeev
 

Tendances (20)

Daniel Sikar: Hadoop MapReduce - 06/09/2010
Daniel Sikar: Hadoop MapReduce - 06/09/2010 Daniel Sikar: Hadoop MapReduce - 06/09/2010
Daniel Sikar: Hadoop MapReduce - 06/09/2010
 
Cuda
CudaCuda
Cuda
 
Nodester Architecture overview & roadmap
Nodester Architecture overview & roadmapNodester Architecture overview & roadmap
Nodester Architecture overview & roadmap
 
nodester Architecture overview & roadmap
nodester Architecture overview & roadmapnodester Architecture overview & roadmap
nodester Architecture overview & roadmap
 
Parallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDAParallel Implementation of K Means Clustering on CUDA
Parallel Implementation of K Means Clustering on CUDA
 
Debugging CUDA applications
Debugging CUDA applicationsDebugging CUDA applications
Debugging CUDA applications
 
Workshop actualización SVG CESGA 2012
Workshop actualización SVG CESGA 2012 Workshop actualización SVG CESGA 2012
Workshop actualización SVG CESGA 2012
 
Parallel K means clustering using CUDA
Parallel K means clustering using CUDAParallel K means clustering using CUDA
Parallel K means clustering using CUDA
 
Nvidia in bioinformatics
Nvidia in bioinformaticsNvidia in bioinformatics
Nvidia in bioinformatics
 
Gpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cudaGpu workshop cluster universe: scripting cuda
Gpu workshop cluster universe: scripting cuda
 
Stack switching for fun and profit
Stack switching for fun and profitStack switching for fun and profit
Stack switching for fun and profit
 
Report on GPGPU at FCA (Lyon, France, 11-15 October, 2010)
Report on GPGPU at FCA  (Lyon, France, 11-15 October, 2010)Report on GPGPU at FCA  (Lyon, France, 11-15 October, 2010)
Report on GPGPU at FCA (Lyon, France, 11-15 October, 2010)
 
Código para Latch físico: Touch_calibrate.py
Código para Latch físico: Touch_calibrate.pyCódigo para Latch físico: Touch_calibrate.py
Código para Latch físico: Touch_calibrate.py
 
Home sensor prototype on Arduino & Raspberry Pi with Node.JS
Home sensor prototype on Arduino & Raspberry Pi with Node.JSHome sensor prototype on Arduino & Raspberry Pi with Node.JS
Home sensor prototype on Arduino & Raspberry Pi with Node.JS
 
How to burn your GPU with CUDA9.1
How to burn your GPU with CUDA9.1How to burn your GPU with CUDA9.1
How to burn your GPU with CUDA9.1
 
S1170143 2
S1170143 2S1170143 2
S1170143 2
 
IoT Chess
IoT ChessIoT Chess
IoT Chess
 
Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.
Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.
Creating basic workflows as Jupyter Notebooks to use Cytoscape programmatically.
 
Gstreamer Basics
Gstreamer BasicsGstreamer Basics
Gstreamer Basics
 
Illustrator_Sample
Illustrator_SampleIllustrator_Sample
Illustrator_Sample
 

En vedette

Strategies and Tools for Parallel Machine Learning in Python
Strategies and Tools for Parallel Machine Learning in PythonStrategies and Tools for Parallel Machine Learning in Python
Strategies and Tools for Parallel Machine Learning in PythonOlivier Grisel
 
Python for High Performance and Scientific Computing
Python for High Performance and Scientific ComputingPython for High Performance and Scientific Computing
Python for High Performance and Scientific ComputingAndreas Schreiber
 
GPGPU Seminar (PyCUDA)
GPGPU Seminar (PyCUDA)GPGPU Seminar (PyCUDA)
GPGPU Seminar (PyCUDA)智啓 出川
 
Anaconda & NumbaPro 使ってみた
Anaconda & NumbaPro 使ってみたAnaconda & NumbaPro 使ってみた
Anaconda & NumbaPro 使ってみたYosuke Onoue
 

En vedette (6)

Strategies and Tools for Parallel Machine Learning in Python
Strategies and Tools for Parallel Machine Learning in PythonStrategies and Tools for Parallel Machine Learning in Python
Strategies and Tools for Parallel Machine Learning in Python
 
Python for High Performance and Scientific Computing
Python for High Performance and Scientific ComputingPython for High Performance and Scientific Computing
Python for High Performance and Scientific Computing
 
GPGPU Seminar (PyCUDA)
GPGPU Seminar (PyCUDA)GPGPU Seminar (PyCUDA)
GPGPU Seminar (PyCUDA)
 
Anaconda & NumbaPro 使ってみた
Anaconda & NumbaPro 使ってみたAnaconda & NumbaPro 使ってみた
Anaconda & NumbaPro 使ってみた
 
PyCUDAの紹介
PyCUDAの紹介PyCUDAの紹介
PyCUDAの紹介
 
More modern gpu
More modern gpuMore modern gpu
More modern gpu
 

Similaire à scikit-cuda

Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSDistributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSPeterAndreasEntschev
 
QGATE 0.3: QUANTUM CIRCUIT SIMULATOR
QGATE 0.3: QUANTUM CIRCUIT SIMULATORQGATE 0.3: QUANTUM CIRCUIT SIMULATOR
QGATE 0.3: QUANTUM CIRCUIT SIMULATORNVIDIA Japan
 
CuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPUCuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPUShohei Hido
 
Overview of Chainer and Its Features
Overview of Chainer and Its FeaturesOverview of Chainer and Its Features
Overview of Chainer and Its FeaturesSeiya Tokui
 
NVIDIA HPC ソフトウエア斜め読み
NVIDIA HPC ソフトウエア斜め読みNVIDIA HPC ソフトウエア斜め読み
NVIDIA HPC ソフトウエア斜め読みNVIDIA Japan
 
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdfS51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdfDLow6
 
Introduction to Chainer 11 may,2018
Introduction to Chainer 11 may,2018Introduction to Chainer 11 may,2018
Introduction to Chainer 11 may,2018Preferred Networks
 
LibOS as a regression test framework for Linux networking #netdev1.1
LibOS as a regression test framework for Linux networking #netdev1.1LibOS as a regression test framework for Linux networking #netdev1.1
LibOS as a regression test framework for Linux networking #netdev1.1Hajime Tazaki
 
Challenges in GPU compilers
Challenges in GPU compilersChallenges in GPU compilers
Challenges in GPU compilersAnastasiaStulova
 
Xdp and ebpf_maps
Xdp and ebpf_mapsXdp and ebpf_maps
Xdp and ebpf_mapslcplcp1
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsKohei KaiGai
 
pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda enKohei KaiGai
 
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~Kohei KaiGai
 
Kubernetes internals (Kubernetes 해부하기)
Kubernetes internals (Kubernetes 해부하기)Kubernetes internals (Kubernetes 해부하기)
Kubernetes internals (Kubernetes 해부하기)DongHyeon Kim
 
Lessons learned with kubernetes in production at PlayPass
Lessons learned with kubernetes in productionat PlayPassLessons learned with kubernetes in productionat PlayPass
Lessons learned with kubernetes in production at PlayPassPeter Vandenabeele
 
Transparent GPU Exploitation for Java
Transparent GPU Exploitation for JavaTransparent GPU Exploitation for Java
Transparent GPU Exploitation for JavaKazuaki Ishizaki
 

Similaire à scikit-cuda (20)

Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDSDistributed Multi-GPU Computing with Dask, CuPy and RAPIDS
Distributed Multi-GPU Computing with Dask, CuPy and RAPIDS
 
QGATE 0.3: QUANTUM CIRCUIT SIMULATOR
QGATE 0.3: QUANTUM CIRCUIT SIMULATORQGATE 0.3: QUANTUM CIRCUIT SIMULATOR
QGATE 0.3: QUANTUM CIRCUIT SIMULATOR
 
Exploiting GPUs in Spark
Exploiting GPUs in SparkExploiting GPUs in Spark
Exploiting GPUs in Spark
 
CuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPUCuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPU
 
Arvindsujeeth scaladays12
Arvindsujeeth scaladays12Arvindsujeeth scaladays12
Arvindsujeeth scaladays12
 
Overview of Chainer and Its Features
Overview of Chainer and Its FeaturesOverview of Chainer and Its Features
Overview of Chainer and Its Features
 
NVIDIA HPC ソフトウエア斜め読み
NVIDIA HPC ソフトウエア斜め読みNVIDIA HPC ソフトウエア斜め読み
NVIDIA HPC ソフトウエア斜め読み
 
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdfS51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
 
Introduction to Chainer 11 may,2018
Introduction to Chainer 11 may,2018Introduction to Chainer 11 may,2018
Introduction to Chainer 11 may,2018
 
LibOS as a regression test framework for Linux networking #netdev1.1
LibOS as a regression test framework for Linux networking #netdev1.1LibOS as a regression test framework for Linux networking #netdev1.1
LibOS as a regression test framework for Linux networking #netdev1.1
 
Challenges in GPU compilers
Challenges in GPU compilersChallenges in GPU compilers
Challenges in GPU compilers
 
Xdp and ebpf_maps
Xdp and ebpf_mapsXdp and ebpf_maps
Xdp and ebpf_maps
 
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database AnalyticsPL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
PL/CUDA - Fusion of HPC Grade Power with In-Database Analytics
 
pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda en
 
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
 
Gpu Cuda
Gpu CudaGpu Cuda
Gpu Cuda
 
Kubernetes internals (Kubernetes 해부하기)
Kubernetes internals (Kubernetes 해부하기)Kubernetes internals (Kubernetes 해부하기)
Kubernetes internals (Kubernetes 해부하기)
 
Lessons learned with kubernetes in production at PlayPass
Lessons learned with kubernetes in productionat PlayPassLessons learned with kubernetes in productionat PlayPass
Lessons learned with kubernetes in production at PlayPass
 
Transparent GPU Exploitation for Java
Transparent GPU Exploitation for JavaTransparent GPU Exploitation for Java
Transparent GPU Exploitation for Java
 
Can FPGAs Compete with GPUs?
Can FPGAs Compete with GPUs?Can FPGAs Compete with GPUs?
Can FPGAs Compete with GPUs?
 

Dernier

Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvLewisJB
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitterShivangiSharma879191
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncssuser2ae721
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfAsst.prof M.Gokilavani
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...Chandu841456
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHC Sai Kiran
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 
computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction managementMariconPadriquez1
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfROCENODodongVILLACER
 

Dernier (20)

Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter8251 universal synchronous asynchronous receiver transmitter
8251 universal synchronous asynchronous receiver transmitter
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECH
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction management
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 

scikit-cuda

  • 1. scikit-cuda A Python Interface to GPU-Powered Libraries Lev Givon https://www.columbia.edu/~lev Bionet Group Department of Electrical Engineering Columbia University September 24, 2015 Lev Givon scikit-cuda
  • 2. Motivation NVIDIA CUDA contains a (growing) range of great GPU-based libraries (CUBLAS, CUFFT, CURAND, CUSPARSE, CUSOLVER, NPP) in addition to the core driver and runtime libraries. Other libraries (both commercial and free) also available: CULA - http://www.culatools.com/ MAGMA - http://icl.cs.utk.edu/magma/ PyCUDA (http://mathema.tician.de/software/pycuda/) enables Python programmers to take advantage of CUDA, but provides few interfaces to the above libraries. Lev Givon scikit-cuda
  • 3. Package Description Exposes contents of various GPU-based libraries via both low-level and high-level Python functions. Project origin: using CULA’s GPU-based SVD routines to accelerate time decoding algorithms implemented in Python. 1 Low-level functions wrap C functions using ctypes (https://docs.python.org/2/library/ctypes.html); replicate C function signatures (with a few slight differences); catch errors and map error codes to Python exceptions. High-level functions provide a similar interface to functions in scipy (http://scipy.org); use PyCUDA’s GPUArray class to encapsulate matrices/vectors in GPU memory. 1http://www.bionet.ee.columbia.edu/publications Lev Givon scikit-cuda
  • 4. Example 1 import pycuda.gpuarray as gpuarray 2 import pycuda.autoinit 3 import numpy as np 4 import skcuda.linalg as linalg 5 6 linalg.init() 7 a = np.random.randn(9, 6) + 1j*np.random.randn(9, 6) 8 a = np.asarray(a, np.complex64) 9 a_gpu = gpuarray.to_gpu(a) 10 u_gpu, s_gpu, vh_gpu = linalg.svd(a_gpu, ’S’, ’S’) 11 np.allclose(a, np.dot(u_gpu.get(), np.dot(np.diag(s_gpu.get()), 12 vh_gpu.get())), 1e-4) Lev Givon scikit-cuda
  • 5. Available High-Level Functions FFT/IFFT Numerical integration (trapezoidal 1D and 2D) Linear algebra (dot, eig, qr, svd, transpose, etc.) Randomized linear algebra (rdmd, rsvd) Special element-wise functions (expi, sici, etc.) Numpy-like helper routines not provided by PyCUDA (cumsum, zeros, etc.) Lev Givon scikit-cuda
  • 6. Areas for Future Improvement Improve maintainability and coverage of exposed functions in supported libraries by automatically generating ctypes wrappers by parsing C header files. Alternately, use cffi (https://bitbucket.org/cffi/cffi) to generate wrappers. Provide full support for functions in MAGMA (possibly combined with a Python-based MAGMA library installer) to enable folks without access to CULA to run GPU-based LPAPACK. Enable selection of library backend for functions that are provided by more than one library (e.g., CULA, MAGMA). Less naive automated block/grid selection mechanism for functions that require it. Lev Givon scikit-cuda
  • 7. Resources Code: https://github.com/lebedov/scikit-cuda Docs: https://scikit-cuda.rtfd.org Thanks to our long (and growing) list of contributors! http://scikit-cuda.readthedocs.org/en/latest/authors.html Lev Givon scikit-cuda