Alluxio Tech Talk
Jul 17, 2019
Speakers:
Brien Porter, Intel
Alex Ma, Alluxio
The ever-increasing challenge of processing and extracting value from exploding data with AI and analytics workloads makes a memory-centric architecture with disaggregated storage and compute more attractive. This decoupled architecture enables users to innovate faster and scale on demand. Enterprises are also increasingly looking towards object stores to power their big data & machine learning workloads in a cost-effective way. However, object stores provide neither big-data-compatible APIs nor the required performance.
In this webinar, the Intel and Alluxio teams will present a proposed reference architecture using Alluxio as the in-memory accelerator for object stores to enable modern analytical workloads such as Spark, Presto, TensorFlow, and Hive. We will also present a technical overview of Alluxio.
23. Use Cases Alluxio Enables
Burst big data workloads in hybrid cloud environments (same instance / container)
Accelerate big data frameworks on the public cloud (same instance / container)
Dramatically speed up big data on object stores on premise (same container / machine)
[Diagram: in each use case, Alluxio runs alongside Presto, Hive, or Spark]
24. Alluxio – Key innovations
Data Elasticity with a unified namespace: Abstract data silos & storage systems to independently scale data on-demand with compute
Data Accessibility for popular APIs & API translation: Run Spark, Hive, Presto, ML workloads on your data located anywhere
Data Locality with Intelligent Multi-tiering: Accelerate big data workloads with transparent tiered local data
25. Data Locality with Intelligent Multi-tiering
Local performance from remote data using multi-tier storage
Hot (RAM), Warm (SSD), Cold (HDD)
Read & Write Buffering
Transparent to App
Policies for pinning, promotion/demotion, TTL
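The tiering policies on this slide can be illustrated with a small, self-contained sketch. This is a toy model and not Alluxio's implementation: the class `TieredCache`, its methods, and the block-count capacities are all hypothetical names chosen for illustration. It assumes a simple least-recently-used policy, promotes a block to the hot tier on every read, demotes the oldest unpinned block when a tier overflows, and drops blocks whose TTL has passed.

```python
import time
from collections import OrderedDict


class TieredCache:
    """Toy block cache: tier 0 = hot (RAM), 1 = warm (SSD), 2 = cold (HDD)."""

    def __init__(self, capacities, clock=time.monotonic):
        # One LRU-ordered dict per tier; capacities are counted in blocks.
        self.capacities = list(capacities)
        self.tiers = [OrderedDict() for _ in self.capacities]
        self.pinned = set()   # keys that are never demoted (simplified pinning)
        self.expiry = {}      # key -> absolute expiry time (TTL)
        self.clock = clock

    def put(self, key, value, ttl=None, pin=False):
        self._forget(key)
        if pin:
            self.pinned.add(key)
        if ttl is not None:
            self.expiry[key] = self.clock() + ttl
        self._insert(0, key, value)  # new writes land in the hot tier

    def get(self, key):
        if key in self.expiry and self.clock() >= self.expiry[key]:
            self._forget(key)        # TTL elapsed: drop the block
            return None
        for tier in self.tiers:
            if key in tier:
                value = tier.pop(key)
                self._insert(0, key, value)  # promote to the hot tier
                return value
        return None

    def tier_of(self, key):
        for level, tier in enumerate(self.tiers):
            if key in tier:
                return level
        return None

    def _forget(self, key):
        for tier in self.tiers:
            tier.pop(key, None)
        self.pinned.discard(key)
        self.expiry.pop(key, None)

    def _insert(self, level, key, value):
        tier = self.tiers[level]
        tier[key] = value
        tier.move_to_end(key)        # mark as most recently used
        while len(tier) > self.capacities[level]:
            victim = next((k for k in tier if k not in self.pinned), None)
            if victim is None:
                break                # everything pinned: tolerate overflow in this toy
            victim_value = tier.pop(victim)
            if level + 1 < len(self.tiers):
                self._insert(level + 1, victim, victim_value)  # demote one tier down
            else:
                self.expiry.pop(victim, None)  # evicted from the coldest tier
```

With capacities of one block per tier, writing blocks `a`, `b`, `c` in order pushes `a` down to the cold tier, and a later read of `a` brings it back to the hot tier while the others shift down. The application only ever calls `put` and `get`, which is the sense in which tiering stays transparent to it.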
26. Data Accessibility via popular APIs and API Translation
Convert from Client-side Interface to native Storage Interface
Client-side interfaces: Java File API, HDFS Interface, S3 Interface, REST API, POSIX Interface
Storage drivers: HDFS Driver, Swift Driver, S3 Driver, NFS Driver
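The idea of translating one client-side interface into different native storage interfaces can be sketched as a thin client over pluggable drivers. Everything here is an assumption made for illustration: `StorageDriver`, `InMemoryObjectDriver`, `InMemoryTreeDriver`, and `FileClient` are hypothetical names, and both drivers are in-memory stand-ins rather than real S3 or HDFS clients. The point is only that the same file-style calls work whether the backing store uses flat keys or a directory tree.

```python
from abc import ABC, abstractmethod


class StorageDriver(ABC):
    """Hypothetical driver interface that each storage backend implements."""

    @abstractmethod
    def read(self, path): ...

    @abstractmethod
    def write(self, path, data): ...

    @abstractmethod
    def list_dir(self, path): ...


class InMemoryObjectDriver(StorageDriver):
    """Stand-in for an object store: flat keys, no real directories."""

    def __init__(self):
        self.objects = {}

    @staticmethod
    def _key(path):
        return path.strip("/")

    def read(self, path):
        return self.objects[self._key(path)]

    def write(self, path, data):
        self.objects[self._key(path)] = data

    def list_dir(self, path):
        # Emulate a directory listing with a key-prefix scan.
        prefix = self._key(path)
        prefix = prefix + "/" if prefix else ""
        names = {k[len(prefix):].split("/", 1)[0]
                 for k in self.objects if k.startswith(prefix)}
        return sorted(names)


class InMemoryTreeDriver(StorageDriver):
    """Stand-in for a hierarchical file system: nested dicts as directories."""

    def __init__(self):
        self.root = {}

    @staticmethod
    def _parts(path):
        return [p for p in path.split("/") if p]

    def _walk(self, parts, create=False):
        node = self.root
        for part in parts:
            node = node.setdefault(part, {}) if create else node[part]
        return node

    def read(self, path):
        *dirs, name = self._parts(path)
        return self._walk(dirs)[name]

    def write(self, path, data):
        *dirs, name = self._parts(path)
        self._walk(dirs, create=True)[name] = data

    def list_dir(self, path):
        return sorted(self._walk(self._parts(path)))


class FileClient:
    """Client-side file API; the driver translates calls for its backend."""

    def __init__(self, driver):
        self.driver = driver

    def write_text(self, path, text):
        self.driver.write(path, text.encode("utf-8"))

    def read_text(self, path):
        return self.driver.read(path).decode("utf-8")

    def ls(self, path):
        return self.driver.list_dir(path)
```

Application code written against `FileClient` does not change when the driver is swapped, which is the property the slide attributes to API translation.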
27. Data Elasticity via Unified Namespace
Enables effective data management across different under stores
- Uses mounting with transparent naming
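Mounting with transparent naming can be pictured as a table that maps paths in one logical namespace to locations in different under stores. The sketch below is a simplified illustration, not Alluxio's mount implementation: `MountTable`, `mount`, and `resolve` are hypothetical names, and the URIs in the usage note are made-up examples. It assumes the longest matching mount point wins, so a mount nested under another takes precedence for its own subtree.

```python
class MountTable:
    """Toy unified namespace: logical paths resolve to under-store URIs."""

    def __init__(self):
        self._mounts = {}  # normalized mount point -> under-store base URI

    @staticmethod
    def _norm(path):
        # Collapse repeated or trailing slashes: "//a/b/" -> "/a/b", "" -> "/".
        return "/" + "/".join(p for p in path.split("/") if p)

    def mount(self, mount_point, ufs_uri):
        self._mounts[self._norm(mount_point)] = ufs_uri.rstrip("/")

    def resolve(self, path):
        path = self._norm(path)
        # Try the longest mount points first so nested mounts take precedence.
        for mp in sorted(self._mounts, key=len, reverse=True):
            prefix = mp if mp.endswith("/") else mp + "/"
            if path == mp or path.startswith(prefix):
                rest = path[len(mp):].lstrip("/")
                base = self._mounts[mp]
                return base + "/" + rest if rest else base
        raise KeyError("no mount covers " + path)
```

For example, after `mount("/", "hdfs://namenode:9000/warehouse")` and `mount("/archive/s3", "s3://example-bucket/data")`, the logical path `/archive/s3/2019/part-0` resolves to the object store while `/tables/t1` resolves to HDFS, and the application sees a single directory tree in both cases.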
30. Incredible Open Source Momentum with growing community
1000+ contributors & growing
4000+ Git Stars
Apache 2.0 Licensed
Hundreds of thousands of downloads
Join the conversation on Slack: alluxio.org/slack