2. ACCELERATE THE DEEP LEARNING STACK
GPU-Acceleration by NVIDIA + Fast Storage from Alluxio, powered by Apache Arrow
3. DATA PROCESSING EVOLUTION: Faster Data Access, Less Data Movement
[Diagram] Hadoop processing, reading from disk:
HDFS Read → Query → HDFS Write → HDFS Read → ETL → HDFS Write → HDFS Read → ML Train
4. DATA PROCESSING EVOLUTION: Faster Data Access, Less Data Movement
[Diagram] Hadoop processing, reading from disk:
HDFS Read → Query → HDFS Write → HDFS Read → ETL → HDFS Write → HDFS Read → ML Train
Spark in-memory processing:
HDFS Read → Query → ETL → ML Train
Spark vs. Hadoop: 25-100x improvement, less code, language flexible, primarily in-memory; storage is the bottleneck.
5. DATA PROCESSING EVOLUTION: Faster Data Access, Less Data Movement
[Diagram] Hadoop processing, reading from disk:
HDFS Read → Query → HDFS Write → HDFS Read → ETL → HDFS Write → HDFS Read → ML Train
Spark in-memory processing (25-100x improvement, less code, language flexible, primarily in-memory):
HDFS Read → Query → ETL → ML Train
GPU/Spark in-memory processing (5-10x improvement over Spark, more code, language rigid, substantially on GPU; storage still a bottleneck):
HDFS Read → GPU Read → Query → CPU Write → GPU Read → ETL → CPU Write → GPU Read → ML Train
7. ACCELERATE THE DEEP LEARNING STACK
GPU-Acceleration by NVIDIA + Fast Storage from Alluxio, powered by Apache Arrow
8. GPU-ACCELERATED ARCHITECTURE, THEN: Too much data movement and too many different data formats
[Diagram] Each application (H2O.ai, Anaconda, Gunrock, Graphistry, BlazingDB, MapD) reads and loads data via the CPU, then copies and converts it into its own private format in GPU memory; moving results between App A and App B requires further copy-and-convert steps, leaving each app with its own copy of the data on the GPU.
9. GPU-ACCELERATED ARCHITECTURE, NOW: Single data format and shared access to data on the GPU
[Diagram] The same applications (H2O.ai, Anaconda, Gunrock, Graphistry, BlazingDB, MapD) load data once and share a single GPU Data Frame in GPU memory, powered by Apache Arrow.
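To make the contrast between the two architectures concrete, here is a small conceptual sketch in plain Python. It models only the copy behavior, not real GPU memory; the class and app counts are hypothetical stand-ins, and in the real stack the shared buffer is an Apache Arrow columnar data frame on the GPU.

```python
# Conceptual sketch only: models data-copy behavior, not real GPU memory.
copies = {"then": 0, "now": 0}

class SharedArrowBuffer:
    """Stand-in for an Apache Arrow GPU Data Frame: one format, shared by reference."""
    def __init__(self, values):
        self.values = values

def run_then(data, n_apps):
    # THEN: each app copies and converts the data into its own private format.
    per_app = []
    for _ in range(n_apps):
        copies["then"] += 1
        per_app.append(list(data))       # copy & convert per application
    return per_app

def run_now(data, n_apps):
    # NOW: data is loaded once into a shared buffer; apps share a reference.
    buf = SharedArrowBuffer(data)
    copies["now"] += 1                   # single load, no per-app conversion
    return [buf for _ in range(n_apps)]

data = [1, 2, 3]
then_views = run_then(data, n_apps=4)    # 4 private copies
now_views = run_now(data, n_apps=4)      # 1 shared buffer
print(copies)  # → {'then': 4, 'now': 1}
```

With four apps, the "then" model pays four copy-and-convert steps while the "now" model pays one load, which is the whole point of the shared format.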
12. DATA PROCESSING EVOLUTION: Faster Data Access, Less Data Movement
[Diagram] Hadoop processing, reading from disk:
HDFS Read → Query → HDFS Write → HDFS Read → ETL → HDFS Write → HDFS Read → ML Train
Spark in-memory processing (25-100x improvement, less code, language flexible, primarily in-memory):
HDFS Read → Query → ETL → ML Train
GPU/Spark in-memory processing (5-10x improvement, more code, language rigid, substantially on GPU):
HDFS Read → GPU Read → Query → CPU Write → GPU Read → ETL → CPU Write → GPU Read → ML Train
End-to-end GPU processing (GoAi) with Alluxio fast and flexible storage (25-100x improvement, same code, language flexible, primarily on GPU):
Alluxio Read → Query → ETL → ML Train
13. ACCELERATE THE ENTIRE ANALYTICS STACK
GPU-Acceleration by NVIDIA + Fast Storage from Alluxio, powered by Apache Arrow
14. DATA & AI ECOSYSTEM EXPLODES
• Many compute frameworks
• Many storage systems
• Most not co-located
15. Data & AI Ecosystem Issues
• Each app manages multiple data sources
• Data source changes require global updates
• Storage optimizations require app changes
• Poor performance due to lack of locality
16. Data & AI Ecosystem with Alluxio
• Apps only talk to Alluxio
• Simple to add/remove storage systems
• No app changes
• Highest performance, in memory
[Diagram] Northbound interfaces to applications: Java File API, HDFS interface, Amazon S3 interface, REST web service. Southbound interfaces to under stores: HDFS, Amazon S3, Swift, NFS.
17. Storage Challenges for DL & ML
1. Speed & Complexity: integration and interoperability issues (on-prem, hybrid, cloud); many departments & groups
2. Data Freshness: cross-network movement is slow; copies create lag; data quality suffers with copies
3. Cost: cloud storage is cheap and reliable, but slow; data duplication
4. Security & Governance: data security & governance is increasingly complex
Heavy integrations create painful organizational drag.
18. Alluxio Design Principles
1. Big Data & Machine Learning: interoperability with leading projects; large-scale data sets; high I/O
2. Data Sharing: don't own the data; multiple apps share common data; data stored in multiple, hybrid systems
3. High-Speed Data Access: remote data; hot/warm/cold data; temporary data; read/write support
4. Enterprise Class: distributed architecture; commodity hardware; service-oriented; high availability; security
22. Alluxio Innovation: Unified Namespace
Create a catalog of available data sources for data scientists, all under a single alluxio:// namespace:
/finance/customer-transactions/
/finance/vendor-transactions/
/operations/device-logs/
/operations/phone-call-recordings/
/operations/check-images/
/research/us-economic-data/
/research/intl-economic-data/
/marketing/advertising-dataset/
/marketing/marketing-funnel-dataset/
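A minimal sketch of the unified-namespace idea: one logical alluxio:// tree, where each top-level directory is a mount point backed by a different under store. The mount table and under-store URIs below are hypothetical examples; a real deployment would create these mappings with Alluxio's mount facility rather than a Python dict.

```python
# Sketch of a unified namespace: one logical alluxio:// tree backed by
# several under stores. The mount table and URIs here are hypothetical.
MOUNT_TABLE = {
    "/finance": "hdfs://nn1:9000/finance",
    "/operations": "s3a://corp-ops-bucket",
    "/research": "nfs://filer/research",
    "/marketing": "s3a://corp-marketing-bucket",
}

def resolve(logical_path):
    """Map a logical alluxio:// path to its under-store location."""
    # Longest-prefix match, so a nested mount would win over a parent mount.
    for mount_point in sorted(MOUNT_TABLE, key=len, reverse=True):
        if logical_path == mount_point or logical_path.startswith(mount_point + "/"):
            return MOUNT_TABLE[mount_point] + logical_path[len(mount_point):]
    raise FileNotFoundError(logical_path)

print(resolve("/finance/customer-transactions/"))
# → hdfs://nn1:9000/finance/customer-transactions/
print(resolve("/operations/device-logs/"))
# → s3a://corp-ops-bucket/device-logs/
```

Data scientists browse one catalog of logical paths; which department's data lives in HDFS, S3, or NFS is a mount-table detail they never see.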
24. Deep Learning Input Pipeline
Deep learning training involves three stages, each utilizing a different resource:
• Data reads (I/O): e.g. choose and read image files from the source.
• Data preprocessing (CPU): e.g. decode image records into images, preprocess them, and organize them into mini-batches.
• Model training (GPU): calculate and update the parameters of the convolutional layers.
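The three stages above can be sketched as a small threaded pipeline: the reader (I/O), preprocessor (CPU), and trainer (GPU) run concurrently, connected by queues, so each resource stays busy. The file names and the "training step" here are stand-ins, not a real framework API.

```python
# Sketch of a three-stage input pipeline: I/O reads, CPU preprocessing,
# and (simulated) GPU training run concurrently, connected by queues.
import queue
import threading

files = [f"img_{i}.jpg" for i in range(8)]   # hypothetical source files
read_q, batch_q = queue.Queue(), queue.Queue()
STOP = object()                               # end-of-stream sentinel

def reader():                      # Stage 1 (I/O): read raw records
    for f in files:
        read_q.put(f"raw({f})")
    read_q.put(STOP)

def preprocessor(batch_size=4):    # Stage 2 (CPU): decode + mini-batch
    batch = []
    while True:
        item = read_q.get()
        if item is STOP:
            break
        batch.append(f"decoded({item})")
        if len(batch) == batch_size:
            batch_q.put(batch)
            batch = []
    if batch:                      # flush any partial final batch
        batch_q.put(batch)
    batch_q.put(STOP)

steps = []
def trainer():                     # Stage 3 (GPU): consume mini-batches
    while True:
        batch = batch_q.get()
        if batch is STOP:
            break
        steps.append(len(batch))   # one training step per mini-batch

threads = [threading.Thread(target=f) for f in (reader, preprocessor, trainer)]
for t in threads: t.start()
for t in threads: t.join()
print(steps)  # → [4, 4]
```

In a real framework the same overlap is handled by the input-pipeline machinery (prefetching, parallel decode); the point of the sketch is that slow reads at stage 1 stall everything downstream, which is where fast Alluxio storage helps.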
27. Read data in Alluxio, on the same node as the client
[Diagram] The application's Alluxio client consults the Alluxio master, then reads from the co-located Alluxio worker's storage (RAM / SSD / HDD): a memory-speed read.
28. Read data not in Alluxio, with caching
[Diagram] The client consults the master; the Alluxio worker fetches the data from the under store at network/disk speed and caches it in its local storage (RAM / SSD / HDD) for subsequent reads.
29. Read data in Alluxio, not on the same node as the client, with caching
[Diagram] The client consults the master and reads from a remote Alluxio worker at network speed; the local worker caches a copy in its own storage (RAM / SSD / HDD).
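The three read cases above reduce to one client-side decision: local worker hit, remote worker hit, or under-store fetch, with caching on the misses. Here is a simplified sketch of that logic; the class and method names are illustrative, not Alluxio's actual client API, and the real client negotiates block locations with the master rather than checking dicts.

```python
# Simplified sketch of the three Alluxio read paths. Names are illustrative;
# the real client asks the master for block locations instead of dict lookups.
class AlluxioClientSketch:
    def __init__(self, local_worker, remote_workers, under_store):
        self.local = local_worker      # data cached on this node (RAM/SSD/HDD)
        self.remote = remote_workers   # data cached on other nodes
        self.under = under_store       # data only in the under store

    def read(self, path):
        if path in self.local:         # Case 1: local hit, memory speed
            return self.local[path], "memory-speed local read"
        if path in self.remote:        # Case 3: remote worker, network speed
            data = self.remote[path]
            self.local[path] = data    # local worker caches a copy
            return data, "network-speed remote read (now cached locally)"
        data = self.under[path]        # Case 2: under store, network/disk speed
        self.local[path] = data        # worker caches it for next time
        return data, "network/disk-speed under-store read (now cached)"

c = AlluxioClientSketch(
    local_worker={"/a": b"A"},
    remote_workers={"/b": b"B"},
    under_store={"/c": b"C"},
)
print(c.read("/a")[1])   # memory-speed local read
print(c.read("/c")[1])   # under-store read, then cached
print(c.read("/c")[1])   # second read of /c is now a local hit
```

The caching on the two miss paths is what turns a cold first epoch of training into memory-speed reads for every epoch after it.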
30. Write data only to Alluxio, on the same node as the client
[Diagram] The client consults the master and writes to the co-located Alluxio worker's storage (RAM / SSD / HDD): a memory-speed write.
31. Write data to Alluxio and the under store synchronously
[Diagram] The client consults the master and writes through the Alluxio worker to the under store: a network/disk-speed write.
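The two write paths trade speed for durability: Alluxio-only writes run at memory speed but are not yet persisted, while synchronous write-through pays network/disk latency for durability in the under store. A minimal sketch, with illustrative names (Alluxio exposes this choice through its write-type configuration; the class here is not a real client API):

```python
# Sketch of the two write paths: Alluxio-only (memory speed, not yet durable)
# vs. synchronous write-through to the under store (slower, durable).
# Class and method names are illustrative, not Alluxio's client API.
class AlluxioWriterSketch:
    def __init__(self):
        self.worker = {}   # co-located worker storage (RAM/SSD/HDD)
        self.under = {}    # under store (e.g. HDFS, S3)

    def write(self, path, data, through=False):
        self.worker[path] = data       # memory-speed write to local worker
        if through:
            self.under[path] = data    # synchronous network/disk-speed write
        return "cache+through" if through else "cache-only"

w = AlluxioWriterSketch()
w.write("/tmp/scratch", b"x")                  # fast, Alluxio only
w.write("/reports/q1", b"y", through=True)     # durable in the under store
print("/tmp/scratch" in w.under)   # → False
print("/reports/q1" in w.under)    # → True
```

Temporary or intermediate data (stage outputs between ETL steps) fits the fast Alluxio-only path; results that must survive fit the synchronous path.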