High Performance GPU Computing with Ruby
Prasun Anand
About me
● Modak Analytics
● Genenetwork project
● SciRuby Contributor
● Google Summer of Code 2016, 2017
● Ruby Grant 2017
● Fukuoka Ruby Award 2018
● Projects:
○ JRuby port of NMatrix
○ ArrayFire gem
○ RbCUDA
Data is the new Oil!
Highlights
Modak Analytics is helping implement one of the largest life sciences
platforms in the world.
Platform Details
● 2,100 structured data sources
● 500k tables
● 1,350 unstructured sources
● 1.3 billion files
● 1,200 data nodes
● 6 petabytes of usable information
• 1000+ clinical trials are being standardized to the CDISC (SDTM) model for cross-study analysis, placebo baselines, etc.
• A single integrated data platform comprising compound, activity results, assay protocol, and project information
• “Like-minded” data has been grouped into Data Domains by business area, e.g. Clinical, Assay, Gene, Regulatory
• 17+ solutions have been developed and deployed for the business
Awarded at the prestigious ‘Strata
Data Conference 2017’ for building
this platform in record time
Governed Data Lake approach
• Modak provides end-to-end service for the platform, including automated ingestion, curation, and innovative solutions
• Modak also provides 24x7 support for the massive platform
[Architecture diagram; labels recovered from the slide:]
• Automated data discovery and automated data ingestion: Data Spider, bots
• Source data: Postgres, SQL Server, Oracle, MySQL, SAS data sets, file shares, SharePoint, Documentum (structured and unstructured data). Originators of data that serve as “authoring” systems to support business processes
• Foundation layer: ingested raw data, data tagging, data masking, data cleansing, data lineage, data profiling, data fingerprinting, augmented data, mapping/standardization
• Integration layer: a replica of the data is ingested into the integration layer
• Solutions layer: data analytics, semantic layer, visualization, dashboards and reports
• Metadata catalog (KOSH), flow controller; StreamSets pipelines are generated automatically
• Data governance, data security, system/application management
• Layer captions: optimized for computing and distribution of data; optimized for strategic BI product development; optimized for business users; optimized for analysts and data scientists
Genome-Wide Association Studies (GWAS)
Matrix Multiplication?
Arrays / Matrices
BLAS and LAPACK
GPU computing is not easy!
CUDA and OpenCL
Af_Array
[1] pry(main)> a = ArrayFire::Af_Array.new 2, [2,2],[1,2,3,4]
No Name Array
[2 2 1 1]
Offsets: [0 0 0 0]
Strides: [1 2 4 4]
1.0000 3.0000
2.0000 4.0000
=> #<ArrayFire::Af_Array:0x000000020aeab8>
[2] pry(main)> b = a + a
No Name Array
[2 2 1 1]
Offsets: [0 0 0 0]
Strides: [1 2 4 4]
2.0000 6.0000
4.0000 8.0000
=> #<ArrayFire::Af_Array:0x000000020625c8>
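The printed shape, strides, and value layout above follow from column-major storage. The sketch below is plain Ruby and does not use the ArrayFire gem; `pad_dims`, `column_major_strides`, and `rows_for_display` are hypothetical helper names introduced only for illustration. It assumes the shape is padded to four dimensions, as the `[2 2 1 1]` line in the output suggests.

```ruby
# Plain-Ruby illustration of the layout shown in the pry session.
# These helpers are hypothetical and are not part of the ArrayFire gem.

# Pad a shape with trailing 1s up to four dimensions.
def pad_dims(dims)
  dims + [1] * (4 - dims.size)
end

# In column-major order the first dimension varies fastest, so each
# stride is the product of all preceding dimension sizes.
def column_major_strides(dims)
  strides = [1]
  dims[0...-1].each { |d| strides << strides.last * d }
  strides
end

# Arrange a flat column-major buffer as printable rows (2-D case only).
def rows_for_display(dims, data)
  rows, cols = dims
  Array.new(rows) { |r| Array.new(cols) { |c| data[c * rows + r] } }
end

dims    = pad_dims([2, 2])
data    = [1.0, 2.0, 3.0, 4.0]
strides = column_major_strides(dims)

a = rows_for_display(dims, data)

# Element-wise addition works on the flat buffers directly.
sum = data.zip(data).map { |x, y| x + y }
b   = rows_for_display(dims, sum)

p dims, strides, a, b
```

The flat list `[1, 2, 3, 4]` fills the first column before the second, which is why the session prints `1 3` on the first row and `2 4` on the second.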
[1] pry(main)> left = ArrayFire::Af_Array.new 2 , [3,3] , [1, 4, 6, 4, 11 , 2 ,-5, 8, 10]
No Name Array
[3 3 1 1]
1.0000 4.0000 -5.0000
4.0000 11.0000 8.0000
6.0000 2.0000 10.0000
=> #<ArrayFire::Af_Array:0x000000014e56c8>
[2] pry(main)> right = ArrayFire::Af_Array.new 2 , [3,2] , [1, 0, 8, 10, -11, 8]
No Name Array
[3 2 1 1]
1.0000 10.0000
0.0000 -11.0000
8.0000 8.0000
=> #<ArrayFire::Af_Array:0x00000001591db0>
[3] pry(main)> result = ArrayFire::BLAS.matmul(left, right, :AF_MAT_NONE, :AF_MAT_NONE)
No Name Array
[3 2 1 1]
-39.0000 -74.0000
68.0000 -17.0000
86.0000 118.0000
=> #<ArrayFire::Af_Array:0x000000016136f8>
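The matmul result above can be checked on the CPU. The following is a minimal plain-Ruby sketch, not the gem's implementation: `from_column_major` and `matmul` are hypothetical helpers, and the only assumption carried over from the session is that the flat element lists are in column-major order.

```ruby
# CPU cross-check of the matmul example, using nested Ruby arrays.
# Helper names are hypothetical and not part of the ArrayFire gem.

# Build a rows x cols nested array from a flat column-major list.
def from_column_major(rows, cols, elements)
  Array.new(rows) { |r| Array.new(cols) { |c| elements[c * rows + r] } }
end

# Textbook triple-loop matrix product.
def matmul(left, right)
  inner = right.size
  raise ArgumentError, 'shape mismatch' unless left.first.size == inner

  left.map do |row|
    right.first.each_index.map do |j|
      (0...inner).sum { |k| row[k] * right[k][j] }
    end
  end
end

left   = from_column_major(3, 3, [1, 4, 6, 4, 11, 2, -5, 8, 10])
right  = from_column_major(3, 2, [1, 0, 8, 10, -11, 8])
result = matmul(left, right)

result.each { |row| puts row.join(' ') }
```

The three printed rows should match the three rows of the GPU result shown in the session.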
VALUE arf_init(int argc, VALUE* argv, VALUE self)
{
  afstruct* afarray;
  Data_Get_Struct(self, afstruct, afarray);

  /* argv[0]: number of dimensions, argv[1]: shape, argv[2]: flat element list */
  dim_t ndims = (dim_t)NUM2LONG(argv[0]);
  dim_t* dimensions = (dim_t*)malloc(ndims * sizeof(dim_t));
  dim_t count = 1;
  for (dim_t index = 0; index < ndims; index++) {
    dimensions[index] = (dim_t)NUM2LONG(RARRAY_AREF(argv[1], index));
    count *= dimensions[index];
  }

  /* Copy the Ruby numbers into a contiguous host buffer of doubles. */
  double* host_array = (double*)malloc(count * sizeof(double));
  for (dim_t index = 0; index < count; index++) {
    host_array[index] = NUM2DBL(RARRAY_AREF(argv[2], index));
  }

  /* af_create_array copies the host data, so the host buffers can be freed. */
  af_create_array(&afarray->carray, host_array, ndims, dimensions, f64);
  free(host_array);
  free(dimensions);
  return self;
}
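The initializer above takes three arguments: the number of dimensions, the shape, and a flat element list whose length is the product of the dimensions. The Ruby sketch below mirrors that contract on the Ruby side and adds the argument checks the C code does not perform. `prepare_host_array` is a hypothetical name used only for illustration and is not part of the gem.

```ruby
# Hypothetical Ruby-side mirror of the arf_init argument contract.
# It validates the arguments and returns the flat list of Floats that
# would be copied into the host buffer.
def prepare_host_array(ndims, dimensions, elements)
  unless dimensions.size == ndims
    raise ArgumentError, "expected #{ndims} dimensions, got #{dimensions.size}"
  end

  # The element count is the product of all dimension sizes.
  count = dimensions.inject(1) { |product, d| product * Integer(d) }
  unless elements.size == count
    raise ArgumentError, "expected #{count} elements, got #{elements.size}"
  end

  elements.map { |e| Float(e) }
end

host = prepare_host_array(2, [2, 2], [1, 2, 3, 4])
p host
```

A guard like this turns a short element list into an exception instead of an out-of-range read inside the extension.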
static VALUE arf_matmul(VALUE self, VALUE left_val, VALUE right_val,
                        VALUE left_prop_val, VALUE right_prop_val)
{
  afstruct* left;
  afstruct* right;
  afstruct* result = ALLOC(afstruct);

  Data_Get_Struct(left_val, afstruct, left);
  Data_Get_Struct(right_val, afstruct, right);

  /* Map Ruby symbols such as :AF_MAT_NONE to af_mat_prop values. */
  af_mat_prop left_mat_prop = arf_mat_type_from_rbsymbol(left_prop_val);
  af_mat_prop right_mat_prop = arf_mat_type_from_rbsymbol(right_prop_val);

  af_matmul(&result->carray, left->carray, right->carray, left_mat_prop, right_mat_prop);

  /* Wrap the result in a Ruby object of the same class as the left operand. */
  return Data_Wrap_Struct(CLASS_OF(left_val), NULL, arf_free, result);
}
BLAS functionalities
● Matmult
● Transpose
LAPACK functionalities
● Det
● Inverse
● Norm
● QR
● Cholesky
● SVD
● LU
Statistics
● Mean
● Median
● Variance
Benchmarks
● AMD FX-8350 octa-core processor
● NVIDIA GTX 750 Ti GPU
● Double dtype
10x faster than NMatrix-Ruby-Lapack
10,000x faster than NMatrix-Ruby
100,000x faster than NMatrix-Ruby-BLAS
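The plots behind these figures are not reproduced here. For readers who want a CPU baseline of their own, the following is a minimal timing sketch in plain Ruby. It is not the author's benchmark harness: `random_matrix`, `naive_matmul`, and `time_seconds` are hypothetical helpers, and the matrix size is an arbitrary small value.

```ruby
# Minimal timing sketch for a pure-Ruby matrix multiplication baseline.
# Helper names and the matrix size are illustrative assumptions.

def random_matrix(n)
  Array.new(n) { Array.new(n) { rand } }
end

# Row-by-column product; transposing b once lets each dot product walk
# two plain Ruby arrays.
def naive_matmul(a, b)
  bt = b.transpose
  a.map { |row| bt.map { |col| row.zip(col).sum { |x, y| x * y } } }
end

# Wall-clock time of a block, using the monotonic clock.
def time_seconds
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

n = 64
a = random_matrix(n)
b = random_matrix(n)
product = nil
elapsed = time_seconds { product = naive_matmul(a, b) }
puts format('%dx%d naive matmul: %.4f s', n, n, elapsed)
```

A fair comparison against a GPU library would also include the host-to-device copy time and several warm-up runs, which this sketch leaves out.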
RbCUDA
GPU Array
● Generic pointer used to handle an array of elements on the GPU.
● Memory copying from CPU to GPU and vice-versa.
● Interfaced with NMatrix and NArray
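Copying between CPU and GPU works on contiguous buffers of raw bytes rather than on Ruby objects. The sketch below shows only the host-side half of that idea in plain Ruby, using `Array#pack` and `String#unpack`. It does not call RbCUDA, and `to_host_buffer` and `from_host_buffer` are hypothetical helpers introduced for illustration.

```ruby
# Host-side illustration only: marshal Ruby Floats to and from a
# contiguous buffer of native doubles. Not part of RbCUDA.

def to_host_buffer(values)
  values.pack('d*') # 'd' packs each value as a native-endian double
end

def from_host_buffer(buffer)
  buffer.unpack('d*')
end

values = [1.0, 2.0, 3.0, 4.0]
buffer = to_host_buffer(values)

puts buffer.bytesize           # 8 bytes per double
p from_host_buffer(buffer)     # round trip back to Ruby Floats
```

A buffer like this is the kind of contiguous memory a host-to-device copy would read from, and a device-to-host copy would fill.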
vadd_kernel_src = <<-EOS
  extern "C" {
    __global__ void matSum(int *a, int *b, int *c)
    {
      int tid = blockIdx.x;
      if (tid < 100)
        c[tid] = a[tid] + b[tid];
    }
  }
EOS
f = compile(vadd_kernel_src)
RbCUDA::Driver.run_kernel(f.path)
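The kernel above indexes by `blockIdx.x` alone, so it presumably expects a launch with one thread per block and at least 100 blocks. The plain-Ruby sketch below is a CPU reference for what the kernel computes, useful for validating GPU output. It does not use RbCUDA; `mat_sum_reference` and the block count of 128 are assumptions made for illustration.

```ruby
# CPU reference for the matSum kernel. Each loop iteration stands in
# for one CUDA block; tid plays the role of blockIdx.x.

LIMIT = 100 # mirrors the `tid < 100` guard in the kernel

def mat_sum_reference(a, b, num_blocks)
  c = Array.new(a.size, 0)
  num_blocks.times do |tid|
    # Blocks beyond the guard do nothing, exactly as in the kernel.
    c[tid] = a[tid] + b[tid] if tid < LIMIT
  end
  c
end

a = (0...LIMIT).to_a
b = a.map { |x| x * 2 }
c = mat_sum_reference(a, b, 128)

p c.first(5)
```

Launching more blocks than elements is harmless here because of the guard, which is the usual reason such a bounds check appears in a kernel.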
● cuBLAS
● cuSOLVER
● cuRAND
Benchmarks
● AMD FX-8350 octa-core processor
● NVIDIA GTX 750 Ti GPU
● Double dtype
1,000,000x faster than NMatrix-Ruby-BLAS
Fastest matrix multiplication library in Ruby!
Future Work
● Image Processing APIs and Indexers
● Multiple dtypes
● RbCUDA is under development.
● https://github.com/arrayfire/arrayfire-rb
● https://github.com/prasunanand/rbcuda
Contributions are Welcome!
Acknowledgements
1. Pjotr Prins
2. Pradeep Garigipati
3. Kenta Murata
4. Ruby Science Foundation
5. Ruby Association
6. Modak Analytics
Thanks!
Github: prasunanand
Twitter: @prasun_anand
Blog: prasunanand.com
Questions

High Performance GPU Computing with Ruby, RubyKaigi 2018
