2. ACCELERATE THE DEEP LEARNING STACK
GPU-Acceleration by NVIDIA + Fast Storage from Alluxio, powered by Apache Arrow
3. DATA PROCESSING EVOLUTION: Faster Data Access, Less Data Movement
[Diagram] Hadoop processing, reading from disk:
HDFS Read → Query → HDFS Write → HDFS Read → ETL → HDFS Write → HDFS Read → ML Train
4. DATA PROCESSING EVOLUTION: Faster Data Access, Less Data Movement
[Diagram] Hadoop processing, reading from disk:
HDFS Read → Query → HDFS Write → HDFS Read → ETL → HDFS Write → HDFS Read → ML Train
Spark in-memory processing:
HDFS Read → Query → ETL → ML Train
Spark vs. Hadoop: 25-100x improvement, less code, language flexible, primarily in-memory; storage is the bottleneck.
5. DATA PROCESSING EVOLUTION: Faster Data Access, Less Data Movement
[Diagram] Hadoop processing, reading from disk:
HDFS Read → Query → HDFS Write → HDFS Read → ETL → HDFS Write → HDFS Read → ML Train
Spark in-memory processing (25-100x improvement, less code, language flexible, primarily in-memory):
HDFS Read → Query → ETL → ML Train
GPU/Spark in-memory processing (5-10x improvement over Spark, more code, language rigid, substantially on GPU; storage still a bottleneck):
HDFS Read → GPU Read → Query → CPU Write → GPU Read → ETL → CPU Write → GPU Read → ML Train
7. ACCELERATE THE DEEP LEARNING STACK
GPU-Acceleration by NVIDIA + Fast Storage from Alluxio, powered by Apache Arrow
8. GPU-ACCELERATED ARCHITECTURE, THEN: Too much data movement and too many different data formats
[Diagram] Each application (H2O.ai, Anaconda, Gunrock, Graphistry, BlazingDB, MapD) reads and loads data via the CPU, then copies and converts it into its own private format in GPU memory; moving results between App A and App B requires further copy-and-convert steps, leaving each app with its own copy of the data on the GPU.
9. GPU-ACCELERATED ARCHITECTURE, NOW: Single data format and shared access to data on the GPU
[Diagram] The same applications (H2O.ai, Anaconda, Gunrock, Graphistry, BlazingDB, MapD) load data once and share a single GPU Data Frame in GPU memory, powered by Apache Arrow.
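To make the contrast between the two architectures concrete, here is a small conceptual sketch in plain Python. It models only the copy behavior, not real GPU memory; the class and app counts are hypothetical stand-ins, and in the real stack the shared buffer is an Apache Arrow columnar data frame on the GPU.

```python
# Conceptual sketch only: models data-copy behavior, not real GPU memory.
copies = {"then": 0, "now": 0}

class SharedArrowBuffer:
    """Stand-in for an Apache Arrow GPU Data Frame: one format, shared by reference."""
    def __init__(self, values):
        self.values = values

def run_then(data, n_apps):
    # THEN: each app copies and converts the data into its own private format.
    per_app = []
    for _ in range(n_apps):
        copies["then"] += 1
        per_app.append(list(data))       # copy & convert per application
    return per_app

def run_now(data, n_apps):
    # NOW: data is loaded once into a shared buffer; apps share a reference.
    buf = SharedArrowBuffer(data)
    copies["now"] += 1                   # single load, no per-app conversion
    return [buf for _ in range(n_apps)]

data = [1, 2, 3]
then_views = run_then(data, n_apps=4)    # 4 private copies
now_views = run_now(data, n_apps=4)      # 1 shared buffer
print(copies)  # → {'then': 4, 'now': 1}
```

With four apps, the "then" model pays four copy-and-convert steps while the "now" model pays one load, which is the whole point of the shared format.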
12. DATA PROCESSING EVOLUTION: Faster Data Access, Less Data Movement
[Diagram] Hadoop processing, reading from disk:
HDFS Read → Query → HDFS Write → HDFS Read → ETL → HDFS Write → HDFS Read → ML Train
Spark in-memory processing (25-100x improvement, less code, language flexible, primarily in-memory):
HDFS Read → Query → ETL → ML Train
GPU/Spark in-memory processing (5-10x improvement, more code, language rigid, substantially on GPU):
HDFS Read → GPU Read → Query → CPU Write → GPU Read → ETL → CPU Write → GPU Read → ML Train
End-to-end GPU processing (GoAi) with Alluxio fast and flexible storage (25-100x improvement, same code, language flexible, primarily on GPU):
Alluxio Read → Query → ETL → ML Train
13. ACCELERATE THE ENTIRE ANALYTICS STACK
GPU-Acceleration by NVIDIA + Fast Storage from Alluxio, powered by Apache Arrow
14. DATA & AI ECOSYSTEM EXPLODES
• Many compute frameworks
• Many storage systems
• Most not co-located
15. Data & AI Ecosystem Issues
• Each app manages multiple data sources
• Data source changes require global updates
• Storage optimizations require app changes
• Poor performance due to lack of locality
16. Data & AI Ecosystem with Alluxio
• Apps only talk to Alluxio
• Simple to add/remove storage systems
• No app changes
• Highest performance, in memory
[Diagram] Northbound interfaces to applications: Java File API, HDFS interface, Amazon S3 interface, REST web service. Southbound interfaces to under stores: HDFS, Amazon S3, Swift, NFS.
17. Storage Challenges for DL & ML
1. Speed & Complexity: integration and interoperability issues (on-prem, hybrid, cloud); many departments & groups
2. Data Freshness: cross-network movement is slow; copies create lag; data quality suffers with copies
3. Cost: cloud storage is cheap and reliable, but slow; data duplication
4. Security & Governance: data security & governance is increasingly complex
Heavy integrations create painful organizational drag.
18. Alluxio Design Principles
1. Big Data & Machine Learning: interoperability with leading projects; large-scale data sets; high I/O
2. Data Sharing: don't own the data; multiple apps share common data; data stored in multiple, hybrid systems
3. High-Speed Data Access: remote data; hot/warm/cold data; temporary data; read/write support
4. Enterprise Class: distributed architecture; commodity hardware; service-oriented; high availability; security
22. Alluxio Innovation: Unified Namespace
Create a catalog of available data sources for data scientists, all under a single alluxio:// namespace:
/finance/customer-transactions/
/finance/vendor-transactions/
/operations/device-logs/
/operations/phone-call-recordings/
/operations/check-images/
/research/us-economic-data/
/research/intl-economic-data/
/marketing/advertising-dataset/
/marketing/marketing-funnel-dataset/
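A minimal sketch of the unified-namespace idea: one logical alluxio:// tree, where each top-level directory is a mount point backed by a different under store. The mount table and under-store URIs below are hypothetical examples; a real deployment would create these mappings with Alluxio's mount facility rather than a Python dict.

```python
# Sketch of a unified namespace: one logical alluxio:// tree backed by
# several under stores. The mount table and URIs here are hypothetical.
MOUNT_TABLE = {
    "/finance": "hdfs://nn1:9000/finance",
    "/operations": "s3a://corp-ops-bucket",
    "/research": "nfs://filer/research",
    "/marketing": "s3a://corp-marketing-bucket",
}

def resolve(logical_path):
    """Map a logical alluxio:// path to its under-store location."""
    # Longest-prefix match, so a nested mount would win over a parent mount.
    for mount_point in sorted(MOUNT_TABLE, key=len, reverse=True):
        if logical_path == mount_point or logical_path.startswith(mount_point + "/"):
            return MOUNT_TABLE[mount_point] + logical_path[len(mount_point):]
    raise FileNotFoundError(logical_path)

print(resolve("/finance/customer-transactions/"))
# → hdfs://nn1:9000/finance/customer-transactions/
print(resolve("/operations/device-logs/"))
# → s3a://corp-ops-bucket/device-logs/
```

Data scientists browse one catalog of logical paths; which department's data lives in HDFS, S3, or NFS is a mount-table detail they never see.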
24. Deep Learning Input Pipeline
Deep learning training involves three stages, each utilizing a different resource:
• Data reads (I/O): e.g. choose and read image files from the source.
• Data preprocessing (CPU): e.g. decode image records into images, preprocess them, and organize them into mini-batches.
• Model training (GPU): calculate and update the parameters of the convolutional layers.
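The three stages above can be sketched as a small threaded pipeline: the reader (I/O), preprocessor (CPU), and trainer (GPU) run concurrently, connected by queues, so each resource stays busy. The file names and the "training step" here are stand-ins, not a real framework API.

```python
# Sketch of a three-stage input pipeline: I/O reads, CPU preprocessing,
# and (simulated) GPU training run concurrently, connected by queues.
import queue
import threading

files = [f"img_{i}.jpg" for i in range(8)]   # hypothetical source files
read_q, batch_q = queue.Queue(), queue.Queue()
STOP = object()                               # end-of-stream sentinel

def reader():                      # Stage 1 (I/O): read raw records
    for f in files:
        read_q.put(f"raw({f})")
    read_q.put(STOP)

def preprocessor(batch_size=4):    # Stage 2 (CPU): decode + mini-batch
    batch = []
    while True:
        item = read_q.get()
        if item is STOP:
            break
        batch.append(f"decoded({item})")
        if len(batch) == batch_size:
            batch_q.put(batch)
            batch = []
    if batch:                      # flush any partial final batch
        batch_q.put(batch)
    batch_q.put(STOP)

steps = []
def trainer():                     # Stage 3 (GPU): consume mini-batches
    while True:
        batch = batch_q.get()
        if batch is STOP:
            break
        steps.append(len(batch))   # one training step per mini-batch

threads = [threading.Thread(target=f) for f in (reader, preprocessor, trainer)]
for t in threads: t.start()
for t in threads: t.join()
print(steps)  # → [4, 4]
```

In a real framework the same overlap is handled by the input-pipeline machinery (prefetching, parallel decode); the point of the sketch is that slow reads at stage 1 stall everything downstream, which is where fast Alluxio storage helps.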
27. Read data in Alluxio, on the same node as the client
[Diagram] The application's Alluxio client consults the Alluxio master, then reads from the co-located Alluxio worker's storage (RAM / SSD / HDD): a memory-speed read.
28. Read data not in Alluxio, with caching
[Diagram] The client consults the master; the Alluxio worker fetches the data from the under store at network/disk speed and caches it in its local storage (RAM / SSD / HDD) for subsequent reads.
29. Read data in Alluxio, not on the same node as the client, with caching
[Diagram] The client consults the master and reads from a remote Alluxio worker at network speed; the local worker caches a copy in its own storage (RAM / SSD / HDD).
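The three read cases above reduce to one client-side decision: local worker hit, remote worker hit, or under-store fetch, with caching on the misses. Here is a simplified sketch of that logic; the class and method names are illustrative, not Alluxio's actual client API, and the real client negotiates block locations with the master rather than checking dicts.

```python
# Simplified sketch of the three Alluxio read paths. Names are illustrative;
# the real client asks the master for block locations instead of dict lookups.
class AlluxioClientSketch:
    def __init__(self, local_worker, remote_workers, under_store):
        self.local = local_worker      # data cached on this node (RAM/SSD/HDD)
        self.remote = remote_workers   # data cached on other nodes
        self.under = under_store       # data only in the under store

    def read(self, path):
        if path in self.local:         # Case 1: local hit, memory speed
            return self.local[path], "memory-speed local read"
        if path in self.remote:        # Case 3: remote worker, network speed
            data = self.remote[path]
            self.local[path] = data    # local worker caches a copy
            return data, "network-speed remote read (now cached locally)"
        data = self.under[path]        # Case 2: under store, network/disk speed
        self.local[path] = data        # worker caches it for next time
        return data, "network/disk-speed under-store read (now cached)"

c = AlluxioClientSketch(
    local_worker={"/a": b"A"},
    remote_workers={"/b": b"B"},
    under_store={"/c": b"C"},
)
print(c.read("/a")[1])   # memory-speed local read
print(c.read("/c")[1])   # under-store read, then cached
print(c.read("/c")[1])   # second read of /c is now a local hit
```

The caching on the two miss paths is what turns a cold first epoch of training into memory-speed reads for every epoch after it.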
30. Write data only to Alluxio, on the same node as the client
[Diagram] The client consults the master and writes to the co-located Alluxio worker's storage (RAM / SSD / HDD): a memory-speed write.
31. Write data to Alluxio and the under store synchronously
[Diagram] The client consults the master and writes through the Alluxio worker to the under store: a network/disk-speed write.
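The two write paths trade speed for durability: Alluxio-only writes run at memory speed but are not yet persisted, while synchronous write-through pays network/disk latency for durability in the under store. A minimal sketch, with illustrative names (Alluxio exposes this choice through its write-type configuration; the class here is not a real client API):

```python
# Sketch of the two write paths: Alluxio-only (memory speed, not yet durable)
# vs. synchronous write-through to the under store (slower, durable).
# Class and method names are illustrative, not Alluxio's client API.
class AlluxioWriterSketch:
    def __init__(self):
        self.worker = {}   # co-located worker storage (RAM/SSD/HDD)
        self.under = {}    # under store (e.g. HDFS, S3)

    def write(self, path, data, through=False):
        self.worker[path] = data       # memory-speed write to local worker
        if through:
            self.under[path] = data    # synchronous network/disk-speed write
        return "cache+through" if through else "cache-only"

w = AlluxioWriterSketch()
w.write("/tmp/scratch", b"x")                  # fast, Alluxio only
w.write("/reports/q1", b"y", through=True)     # durable in the under store
print("/tmp/scratch" in w.under)   # → False
print("/reports/q1" in w.under)    # → True
```

Temporary or intermediate data (stage outputs between ETL steps) fits the fast Alluxio-only path; results that must survive fit the synchronous path.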