SlideShare une entreprise Scribd logo
1  sur  28
Télécharger pour lire hors ligne
Alluxio (formerly Tachyon): The Journey thus far and the Road Ahead
September 2016 @ Strata & Hadoop World 2016
Haoyuan (HY) Li, Gene Pang
AGENDA
2
•  Alluxio Open Source Status and History

•  Alluxio Overview

•  Alluxio Use Cases and Demos
•  What’s Next?
HISTORY
3
•  Started at UC Berkeley AMPLab In Summer 2012
•  Original named as Tachyon
•  Open Sourced in 2013
•  Apache License 2.0
•  Latest Stable Release: Alluxio 1.2.0
•  Next Release (Alluxio 1.3.0) In Two Weeks
•  Rebranded as Alluxio in 2016
0
50
100
150
200
250
300
350
Year 1 Year 3Year 2
4
OPEN SOURCE ALLUXIO
•  One of the tastest
growing open-
source projects in
the big data
ecosystem
•  Currently over
300 contributors
from over 100
organizations
•  Welcome to join
our community!
Popular Open Source Projects’ Growth
Spark Kafka Cassandra HDFS
Alluxio
About Us
5
•  Team members from Google, Palantir, Uber, Yahoo with
years of distributed systems development experience
•  Graduated from Stanford University, UC Berkeley, CMU,
Peking University, and Tsinghua, with CS masters or PhDs
•  Top 9 committers of the Alluxio open source project
Team
Haoyuan	Li, CEO & Founder

Co-creator of Alluxio project
while working towards Ph.D. at
UC Berkeley AMPLab.
Gene Pang, Software Engineer,
Alluxio Maintainer

Ph.D. from UC Berkeley AMPLab
Previously at Google F1 team
•  Andreessen Horowitz
Investors
BIG DATA ECOSYSTEM TODAY
BIG DATA ECOSYSTEM WITH ALLUXIO
6
BIG DATA ECOSYSTEM YESTERDAY
…
…
FUSE Compatible File System
Hadoop Compatible File System
 Native Key-Value Interface
Native File System
Enabling any application to access data from
any storage system at memory-speed
BIG DATA ECOSYSTEM ISSUES
GlusterFS Interface
Amazon S3 Interface
 Swift Interface
HDFS Interface
•  Memory is getting
Faster, Larger,
and Cheaper
•  Memory price as
halving every 18
months
•  Disk throughput
increasing slowly
7
TECHNOLOGY TRENDS
Top left chart:
https://lazure2.wordpress.com/
2013/07/02/20-years-of-samsung-new-
management-as-manifested-by-the-latest-
june-20th-galaxy-ativ-innovations/

Top right chart:
people.eecs.berkeley.edu/~istoica/classes/
cs294/
15/notes/02-TechnologyTrends.ppt

Bottom chart: jcmit.com/
6.25
12.5
25
18.75
31.25
43.75
37.5
50
2005 2006 2007 2008 2009 2010 2011 2012 2013 2014
DDR performance over time
GBs/second
DDR2
DDR4
DDR3
File System API
Software Only
8
ATTRIBUTES
Memory-Speed
 Virtual
 Distributed
 Storage
Scale out
architecture
Virtualized across
different storage
types under a
unified namespace
Memory-speed
access to data
Server A
Applications
Server B
Applications
Server Z
Applications
Server C
Applications
Alluxio
 Alluxio
 Alluxio
Alluxio
9
ALLUXIO SOLUTION DEPLOYMENT
Storage B
 Storage C
 Storage Z
Storage A
10
BENEFITS
Unification
New workflows
across any data
in any storage
system
Performance
High
performance
data access
Flexibility
Work with the
compute and
storage frameworks
of your choice
Cost
Grow compute
and storage
systems
independently
USE CASE 1 – Accelerate I/O to/from Remote
Storage
11
•  Compute and Storage Separation
•  Advantages
•  Meet different compute and storage hardware
requirements efficiently
•  Scale compute and storage independently
•  Store data in Traditional filers/SANs and object
stores cost effectively
•  Compute on data in existing storage via Big Data
Computational frameworks
•  Disadvantage
•  Accessing data requires remote I/O
Use Case without Alluxio
12
Spark
Storage
Low latency, memory
throughput
High latency, network
throughput
Use Case with Alluxio
13
Spark
Storage
Alluxio
Keeping data in Alluxio
accelerates data access
14
CASE STUDY
Baidu File System
The performance was amazing. With
Spark SQL alone, it took 100-150 seconds
to finish a query; using Alluxio, where data
may hit local or remote Alluxio nodes, it
took 10-15 seconds.

- Shaoshan Liu, Baidu
RESULTS
•  Data queries are now 30x faster with Alluxio
•  Alluxio cluster run stably, providing over 50TB
of RAM space
•  By using Alluxio, batch queries usually lasting
over 15 minutes were transformed into an
interactive query taking less than 30 seconds
Accelerate Access to
Remote Storage
•  200+ nodes deployment
•  2+ petabytes of storage
•  Mix of memory + HDD
USE CASE 2 – Share Data Across Jobs at
Memory Speed
15
•  Architectures Requiring Shared Data
•  Pipelines: output of one job is input of the next job
•  Different applications, jobs, or contexts read the
same data
•  Disadvantage
•  Sharing data requires I/O
Use Case without Alluxio
16
Spark
Storage
MapReduce
 Spark
Network I/O
Disk I/O
I/O slows down
sharing
Use Case with Alluxio
17
Spark
Storage
MapReduce
 Spark
Sharing
data in Alluxio
Alluxio
18
CASE STUDY
Thanks to Alluxio, we now have the raw
data immediately available at every
iteration and we can skip the costs of
loading in terms of time waiting, network
traffic, and RDBMS activity.

- Henry Powell, Barclays
RESULTS
•  Barclays workflow iteration time decreased
from hours to seconds
•  Alluxio enabled workflows that were
impossible before
•  By keeping data only in memory, the I/O cost of
loading and storing in Alluxio is now on the
order of seconds
Relational Database
Share Data Across Jobs
at Memory-Speed
•  6 node deployment
•  1TB of storage
•  Memory only
DEMO 1
19
USE CASE 3 - Transparently Manage Data
Across Storage Systems
20
•  Reasons
•  Most enterprises have multiple storage systems
•  New (better, faster, cheaper) storage systems arise
•  Disadvantage
•  Managing data across systems can be difficult
Use Case Explained
21
Storage
Alluxio
Spark
 MapReduce
 Spark
Storage
 Storage
Flexible,
simple
no application
changes,
new mount
point
22
CASE STUDY
We’ve been running Alluxio in production
for over 9 months, resulting in 15x
speedup on average, and 300x speedup at
peak service times.

- Xueyan Li, Qunar
RESULTS
•  Alluxio’s unified namespace enables different
applications and frameworks to easily interact
with their data from different storage systems
•  Improved the performance of their system with
15x – 300x speedups
•  Tiered storage feature manages various
storage resources including memory, SSD and
disk
Transparently Manage Data
Across Different Storage Systems
•  200+ nodes deployment
•  6 billion logs (4.5 TB) daily
•  Mix of Memory + HDD
USE CASE 4 - Compute on Data in Different
Storage with Compliance Requirements
©	2016	Alluxio	 23
•  Motivation
•  Compliance with local laws restricts data storage
location
•  Global Analytics on this data is not possible
Use Case Explained
©	2016	Alluxio	 24
Storage
Alluxio
Spark
 MapReduce
 Spark
Storage
 Storage
Flexible,
simple
no application
changes,
new mount
point
25
CASE STUDY
RESULTS
•  Alluxio’s unified namespace enables
any compute cluster accessing data
from storage systems at different
data centers 
•  Enables global analytics which was
earlier not possible
•  No local persistent storage of data
Compute on Data in Different
Storage with Compliance
Requirement
•  500+ nodes deployment
•  Memory + SSD
A	Global	Fortune	
500	Enterprise
DEMO 2
26
What’s Next?
27
•  Contact: {haoyuan, gene}@alluxio.com or info@alluxio.com 
•  Twitter: @Alluxio
•  Websites: www.alluxio.com and www.alluxio.org
Thank you!

Contenu connexe

Tendances

Alluxio: The missing piece of on-demand clusters at Alluxio Meetup 2016
Alluxio: The missing piece of on-demand clusters at Alluxio Meetup 2016Alluxio: The missing piece of on-demand clusters at Alluxio Meetup 2016
Alluxio: The missing piece of on-demand clusters at Alluxio Meetup 2016Alluxio, Inc.
 
The Missing Piece of On-Demand Clusters
The Missing Piece of On-Demand ClustersThe Missing Piece of On-Demand Clusters
The Missing Piece of On-Demand ClustersAlluxio, Inc.
 
Alluxio Keynote at Strata+Hadoop World Beijing 2016
Alluxio Keynote at Strata+Hadoop World Beijing 2016Alluxio Keynote at Strata+Hadoop World Beijing 2016
Alluxio Keynote at Strata+Hadoop World Beijing 2016Alluxio, Inc.
 
Unify Data at Memory Speed by Haoyuan Li - VAULT Conference 2017
Unify Data at Memory Speed by Haoyuan Li - VAULT Conference 2017Unify Data at Memory Speed by Haoyuan Li - VAULT Conference 2017
Unify Data at Memory Speed by Haoyuan Li - VAULT Conference 2017Alluxio, Inc.
 
Open Source Memory Speed Virtual Distributed Storage
Open Source Memory Speed Virtual Distributed StorageOpen Source Memory Speed Virtual Distributed Storage
Open Source Memory Speed Virtual Distributed StorageAlluxio, Inc.
 
Getting Started with Alluxio + Spark + S3
Getting Started with Alluxio + Spark + S3Getting Started with Alluxio + Spark + S3
Getting Started with Alluxio + Spark + S3Alluxio, Inc.
 
Rise of Intermediate APIs - Beam and Alluxio at Alluxio Meetup 2016
Rise of Intermediate APIs - Beam and Alluxio at Alluxio Meetup 2016Rise of Intermediate APIs - Beam and Alluxio at Alluxio Meetup 2016
Rise of Intermediate APIs - Beam and Alluxio at Alluxio Meetup 2016Alluxio, Inc.
 
Alluxio Presentation at Strata San Jose 2016
Alluxio Presentation at Strata San Jose 2016Alluxio Presentation at Strata San Jose 2016
Alluxio Presentation at Strata San Jose 2016Jiří Šimša
 
Effective Spark with Alluxio at Strata+Hadoop World San Jose 2017
Effective Spark with Alluxio at Strata+Hadoop World San Jose 2017Effective Spark with Alluxio at Strata+Hadoop World San Jose 2017
Effective Spark with Alluxio at Strata+Hadoop World San Jose 2017Alluxio, Inc.
 
Best Practices for Using Alluxio with Spark
Best Practices for Using Alluxio with SparkBest Practices for Using Alluxio with Spark
Best Practices for Using Alluxio with SparkAlluxio, Inc.
 
Tachyon: An Open Source Memory-Centric Distributed Storage System
Tachyon: An Open Source Memory-Centric Distributed Storage SystemTachyon: An Open Source Memory-Centric Distributed Storage System
Tachyon: An Open Source Memory-Centric Distributed Storage SystemTachyon Nexus, Inc.
 
Enable Fast Big Data Analytics on Ceph with Alluxio at Ceph Days 2017
Enable Fast Big Data Analytics on Ceph with Alluxio at Ceph Days 2017 Enable Fast Big Data Analytics on Ceph with Alluxio at Ceph Days 2017
Enable Fast Big Data Analytics on Ceph with Alluxio at Ceph Days 2017 Alluxio, Inc.
 
Alluxio-FUSE as a data access layer for Dask
Alluxio-FUSE as a data access layer for DaskAlluxio-FUSE as a data access layer for Dask
Alluxio-FUSE as a data access layer for DaskAlluxio, Inc.
 
Accessing Data Anywhere with Unified Namespace
Accessing Data Anywhere with Unified NamespaceAccessing Data Anywhere with Unified Namespace
Accessing Data Anywhere with Unified NamespaceAlluxio, Inc.
 
Securely Enhancing Data Access in Hybrid Cloud with Alluxio
Securely Enhancing Data Access in Hybrid Cloud with AlluxioSecurely Enhancing Data Access in Hybrid Cloud with Alluxio
Securely Enhancing Data Access in Hybrid Cloud with AlluxioAlluxio, Inc.
 
Best Practices for Using Alluxio with Spark
Best Practices for Using Alluxio with SparkBest Practices for Using Alluxio with Spark
Best Practices for Using Alluxio with SparkAlluxio, Inc.
 
Accelerating Spark Workloads in a Mesos Environment with Alluxio
Accelerating Spark Workloads in a Mesos Environment with AlluxioAccelerating Spark Workloads in a Mesos Environment with Alluxio
Accelerating Spark Workloads in a Mesos Environment with AlluxioAlluxio, Inc.
 
Best Practice in Accelerating Data Applications with Spark+Alluxio
Best Practice in Accelerating Data Applications with Spark+AlluxioBest Practice in Accelerating Data Applications with Spark+Alluxio
Best Practice in Accelerating Data Applications with Spark+AlluxioAlluxio, Inc.
 
Introduction to Alluxio (formerly Tachyon) and how it brings up to 300x perfo...
Introduction to Alluxio (formerly Tachyon) and how it brings up to 300x perfo...Introduction to Alluxio (formerly Tachyon) and how it brings up to 300x perfo...
Introduction to Alluxio (formerly Tachyon) and how it brings up to 300x perfo...Alluxio, Inc.
 

Tendances (20)

Alluxio: The missing piece of on-demand clusters at Alluxio Meetup 2016
Alluxio: The missing piece of on-demand clusters at Alluxio Meetup 2016Alluxio: The missing piece of on-demand clusters at Alluxio Meetup 2016
Alluxio: The missing piece of on-demand clusters at Alluxio Meetup 2016
 
The Missing Piece of On-Demand Clusters
The Missing Piece of On-Demand ClustersThe Missing Piece of On-Demand Clusters
The Missing Piece of On-Demand Clusters
 
Alluxio Keynote at Strata+Hadoop World Beijing 2016
Alluxio Keynote at Strata+Hadoop World Beijing 2016Alluxio Keynote at Strata+Hadoop World Beijing 2016
Alluxio Keynote at Strata+Hadoop World Beijing 2016
 
Unify Data at Memory Speed by Haoyuan Li - VAULT Conference 2017
Unify Data at Memory Speed by Haoyuan Li - VAULT Conference 2017Unify Data at Memory Speed by Haoyuan Li - VAULT Conference 2017
Unify Data at Memory Speed by Haoyuan Li - VAULT Conference 2017
 
Open Source Memory Speed Virtual Distributed Storage
Open Source Memory Speed Virtual Distributed StorageOpen Source Memory Speed Virtual Distributed Storage
Open Source Memory Speed Virtual Distributed Storage
 
Getting Started with Alluxio + Spark + S3
Getting Started with Alluxio + Spark + S3Getting Started with Alluxio + Spark + S3
Getting Started with Alluxio + Spark + S3
 
Rise of Intermediate APIs - Beam and Alluxio at Alluxio Meetup 2016
Rise of Intermediate APIs - Beam and Alluxio at Alluxio Meetup 2016Rise of Intermediate APIs - Beam and Alluxio at Alluxio Meetup 2016
Rise of Intermediate APIs - Beam and Alluxio at Alluxio Meetup 2016
 
Alluxio Presentation at Strata San Jose 2016
Alluxio Presentation at Strata San Jose 2016Alluxio Presentation at Strata San Jose 2016
Alluxio Presentation at Strata San Jose 2016
 
Effective Spark with Alluxio at Strata+Hadoop World San Jose 2017
Effective Spark with Alluxio at Strata+Hadoop World San Jose 2017Effective Spark with Alluxio at Strata+Hadoop World San Jose 2017
Effective Spark with Alluxio at Strata+Hadoop World San Jose 2017
 
Tachyon workshop 2015-07-19
Tachyon workshop 2015-07-19Tachyon workshop 2015-07-19
Tachyon workshop 2015-07-19
 
Best Practices for Using Alluxio with Spark
Best Practices for Using Alluxio with SparkBest Practices for Using Alluxio with Spark
Best Practices for Using Alluxio with Spark
 
Tachyon: An Open Source Memory-Centric Distributed Storage System
Tachyon: An Open Source Memory-Centric Distributed Storage SystemTachyon: An Open Source Memory-Centric Distributed Storage System
Tachyon: An Open Source Memory-Centric Distributed Storage System
 
Enable Fast Big Data Analytics on Ceph with Alluxio at Ceph Days 2017
Enable Fast Big Data Analytics on Ceph with Alluxio at Ceph Days 2017 Enable Fast Big Data Analytics on Ceph with Alluxio at Ceph Days 2017
Enable Fast Big Data Analytics on Ceph with Alluxio at Ceph Days 2017
 
Alluxio-FUSE as a data access layer for Dask
Alluxio-FUSE as a data access layer for DaskAlluxio-FUSE as a data access layer for Dask
Alluxio-FUSE as a data access layer for Dask
 
Accessing Data Anywhere with Unified Namespace
Accessing Data Anywhere with Unified NamespaceAccessing Data Anywhere with Unified Namespace
Accessing Data Anywhere with Unified Namespace
 
Securely Enhancing Data Access in Hybrid Cloud with Alluxio
Securely Enhancing Data Access in Hybrid Cloud with AlluxioSecurely Enhancing Data Access in Hybrid Cloud with Alluxio
Securely Enhancing Data Access in Hybrid Cloud with Alluxio
 
Best Practices for Using Alluxio with Spark
Best Practices for Using Alluxio with SparkBest Practices for Using Alluxio with Spark
Best Practices for Using Alluxio with Spark
 
Accelerating Spark Workloads in a Mesos Environment with Alluxio
Accelerating Spark Workloads in a Mesos Environment with AlluxioAccelerating Spark Workloads in a Mesos Environment with Alluxio
Accelerating Spark Workloads in a Mesos Environment with Alluxio
 
Best Practice in Accelerating Data Applications with Spark+Alluxio
Best Practice in Accelerating Data Applications with Spark+AlluxioBest Practice in Accelerating Data Applications with Spark+Alluxio
Best Practice in Accelerating Data Applications with Spark+Alluxio
 
Introduction to Alluxio (formerly Tachyon) and how it brings up to 300x perfo...
Introduction to Alluxio (formerly Tachyon) and how it brings up to 300x perfo...Introduction to Alluxio (formerly Tachyon) and how it brings up to 300x perfo...
Introduction to Alluxio (formerly Tachyon) and how it brings up to 300x perfo...
 

En vedette

Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016
Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016
Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016Alluxio, Inc.
 
Presentation by TachyonNexus & Baidu at Strata Singapore 2015
Presentation by TachyonNexus & Baidu at Strata Singapore 2015Presentation by TachyonNexus & Baidu at Strata Singapore 2015
Presentation by TachyonNexus & Baidu at Strata Singapore 2015Tachyon Nexus, Inc.
 
Strata Hadoop Talk 2016 August
Strata Hadoop Talk 2016 AugustStrata Hadoop Talk 2016 August
Strata Hadoop Talk 2016 AugustClaire Fang
 
Presentation by TachyonNexus & Intel at Strata Singapore 2015
Presentation by TachyonNexus & Intel at Strata Singapore 2015Presentation by TachyonNexus & Intel at Strata Singapore 2015
Presentation by TachyonNexus & Intel at Strata Singapore 2015Tachyon Nexus, Inc.
 
Tachyon Presentation at AMPCamp 6 (November, 2015)
Tachyon Presentation at AMPCamp 6 (November, 2015)Tachyon Presentation at AMPCamp 6 (November, 2015)
Tachyon Presentation at AMPCamp 6 (November, 2015)Tachyon Nexus, Inc.
 
Presto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkPresto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkkbajda
 
Spark Summit EU talk by Jiri Simsa
Spark Summit EU talk by Jiri SimsaSpark Summit EU talk by Jiri Simsa
Spark Summit EU talk by Jiri SimsaAlluxio, Inc.
 
Ux och design som konverterar del 2
Ux och design som konverterar del 2Ux och design som konverterar del 2
Ux och design som konverterar del 2Wipcore
 
Dreaded Embedded sec360 5-17-16
Dreaded Embedded   sec360 5-17-16Dreaded Embedded   sec360 5-17-16
Dreaded Embedded sec360 5-17-16Barry Caplin
 

En vedette (10)

Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016
Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016
Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016
 
Presentation by TachyonNexus & Baidu at Strata Singapore 2015
Presentation by TachyonNexus & Baidu at Strata Singapore 2015Presentation by TachyonNexus & Baidu at Strata Singapore 2015
Presentation by TachyonNexus & Baidu at Strata Singapore 2015
 
Strata Hadoop Talk 2016 August
Strata Hadoop Talk 2016 AugustStrata Hadoop Talk 2016 August
Strata Hadoop Talk 2016 August
 
Presentation by TachyonNexus & Intel at Strata Singapore 2015
Presentation by TachyonNexus & Intel at Strata Singapore 2015Presentation by TachyonNexus & Intel at Strata Singapore 2015
Presentation by TachyonNexus & Intel at Strata Singapore 2015
 
Tachyon Presentation at AMPCamp 6 (November, 2015)
Tachyon Presentation at AMPCamp 6 (November, 2015)Tachyon Presentation at AMPCamp 6 (November, 2015)
Tachyon Presentation at AMPCamp 6 (November, 2015)
 
Presto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talkPresto Strata Hadoop SJ 2016 short talk
Presto Strata Hadoop SJ 2016 short talk
 
Spark Summit EU talk by Jiri Simsa
Spark Summit EU talk by Jiri SimsaSpark Summit EU talk by Jiri Simsa
Spark Summit EU talk by Jiri Simsa
 
Ux och design som konverterar del 2
Ux och design som konverterar del 2Ux och design som konverterar del 2
Ux och design som konverterar del 2
 
Front cover...
Front cover...Front cover...
Front cover...
 
Dreaded Embedded sec360 5-17-16
Dreaded Embedded   sec360 5-17-16Dreaded Embedded   sec360 5-17-16
Dreaded Embedded sec360 5-17-16
 

Similaire à Alluxio (formerly Tachyon): The Journey thus far and the Road Ahead

Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...
Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...
Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...Spark Summit
 
Accelerate Cloud Training with Alluxio
Accelerate Cloud Training with AlluxioAccelerate Cloud Training with Alluxio
Accelerate Cloud Training with AlluxioAlluxio, Inc.
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...Data Con LA
 
Unified Big Data Analytics: Any Stack, Any Cloud
Unified Big Data Analytics: Any Stack, Any CloudUnified Big Data Analytics: Any Stack, Any Cloud
Unified Big Data Analytics: Any Stack, Any CloudAlluxio, Inc.
 
Building a Hybrid Cloud Solution
Building a Hybrid Cloud Solution Building a Hybrid Cloud Solution
Building a Hybrid Cloud Solution Cloudian
 
Architecting a Heterogeneous Data Platform Across Clusters, Regions, and Clouds
Architecting a Heterogeneous Data Platform Across Clusters, Regions, and CloudsArchitecting a Heterogeneous Data Platform Across Clusters, Regions, and Clouds
Architecting a Heterogeneous Data Platform Across Clusters, Regions, and CloudsAlluxio, Inc.
 
Alluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the CloudAlluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the CloudAlluxio, Inc.
 
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio Ceph Community
 
Introduction to Alluxio 2.0 Preview | Simplifying data access for cloud workl...
Introduction to Alluxio 2.0 Preview | Simplifying data access for cloud workl...Introduction to Alluxio 2.0 Preview | Simplifying data access for cloud workl...
Introduction to Alluxio 2.0 Preview | Simplifying data access for cloud workl...Alluxio, Inc.
 
The Architecture of Decoupling Compute and Storage with Alluxio
The Architecture of Decoupling Compute and Storage with AlluxioThe Architecture of Decoupling Compute and Storage with Alluxio
The Architecture of Decoupling Compute and Storage with AlluxioAlluxio, Inc.
 
Alluxio Use Cases and Future Directions
Alluxio Use Cases and Future DirectionsAlluxio Use Cases and Future Directions
Alluxio Use Cases and Future DirectionsAlluxio, Inc.
 
Webinar: Which Storage Architecture is Best for Splunk Analytics?
Webinar: Which Storage Architecture is Best for Splunk Analytics?Webinar: Which Storage Architecture is Best for Splunk Analytics?
Webinar: Which Storage Architecture is Best for Splunk Analytics?Storage Switzerland
 
Accelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAccelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAlluxio, Inc.
 
Alluxio 2.0 Deep Dive – Simplifying data access for cloud workloads
Alluxio 2.0 Deep Dive – Simplifying data access for cloud workloadsAlluxio 2.0 Deep Dive – Simplifying data access for cloud workloads
Alluxio 2.0 Deep Dive – Simplifying data access for cloud workloadsAlluxio, Inc.
 
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & AlluxioAlluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & AlluxioAlluxio, Inc.
 
Accelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAccelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAlluxio, Inc.
 
Cloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCO
Cloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCOCloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCO
Cloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCOStorage Switzerland
 
Achieving Separation of Compute and Storage in a Cloud World
Achieving Separation of Compute and Storage in a Cloud WorldAchieving Separation of Compute and Storage in a Cloud World
Achieving Separation of Compute and Storage in a Cloud WorldAlluxio, Inc.
 
Accelerating Cloud Training With Alluxio
Accelerating Cloud Training With AlluxioAccelerating Cloud Training With Alluxio
Accelerating Cloud Training With AlluxioAlluxio, Inc.
 
Unify Data at Memory Speed
Unify Data at Memory SpeedUnify Data at Memory Speed
Unify Data at Memory SpeedAlluxio, Inc.
 

Similaire à Alluxio (formerly Tachyon): The Journey thus far and the Road Ahead (20)

Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...
Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...
Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan...
 
Accelerate Cloud Training with Alluxio
Accelerate Cloud Training with AlluxioAccelerate Cloud Training with Alluxio
Accelerate Cloud Training with Alluxio
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...
 
Unified Big Data Analytics: Any Stack, Any Cloud
Unified Big Data Analytics: Any Stack, Any CloudUnified Big Data Analytics: Any Stack, Any Cloud
Unified Big Data Analytics: Any Stack, Any Cloud
 
Building a Hybrid Cloud Solution
Building a Hybrid Cloud Solution Building a Hybrid Cloud Solution
Building a Hybrid Cloud Solution
 
Architecting a Heterogeneous Data Platform Across Clusters, Regions, and Clouds
Architecting a Heterogeneous Data Platform Across Clusters, Regions, and CloudsArchitecting a Heterogeneous Data Platform Across Clusters, Regions, and Clouds
Architecting a Heterogeneous Data Platform Across Clusters, Regions, and Clouds
 
Alluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the CloudAlluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the Cloud
 
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio
 
Introduction to Alluxio 2.0 Preview | Simplifying data access for cloud workl...
Introduction to Alluxio 2.0 Preview | Simplifying data access for cloud workl...Introduction to Alluxio 2.0 Preview | Simplifying data access for cloud workl...
Introduction to Alluxio 2.0 Preview | Simplifying data access for cloud workl...
 
The Architecture of Decoupling Compute and Storage with Alluxio
The Architecture of Decoupling Compute and Storage with AlluxioThe Architecture of Decoupling Compute and Storage with Alluxio
The Architecture of Decoupling Compute and Storage with Alluxio
 
Alluxio Use Cases and Future Directions
Alluxio Use Cases and Future DirectionsAlluxio Use Cases and Future Directions
Alluxio Use Cases and Future Directions
 
Webinar: Which Storage Architecture is Best for Splunk Analytics?
Webinar: Which Storage Architecture is Best for Splunk Analytics?Webinar: Which Storage Architecture is Best for Splunk Analytics?
Webinar: Which Storage Architecture is Best for Splunk Analytics?
 
Accelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAccelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud Era
 
Alluxio 2.0 Deep Dive – Simplifying data access for cloud workloads
Alluxio 2.0 Deep Dive – Simplifying data access for cloud workloadsAlluxio 2.0 Deep Dive – Simplifying data access for cloud workloads
Alluxio 2.0 Deep Dive – Simplifying data access for cloud workloads
 
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & AlluxioAlluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
Alluxio 2.0 & Near Real-time Big Data Platform w/ Spark & Alluxio
 
Accelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAccelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud Era
 
Cloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCO
Cloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCOCloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCO
Cloudian Webinar - 7 Key Reasons why Object Storage lowers Storage TCO
 
Achieving Separation of Compute and Storage in a Cloud World
Achieving Separation of Compute and Storage in a Cloud WorldAchieving Separation of Compute and Storage in a Cloud World
Achieving Separation of Compute and Storage in a Cloud World
 
Accelerating Cloud Training With Alluxio
Accelerating Cloud Training With AlluxioAccelerating Cloud Training With Alluxio
Accelerating Cloud Training With Alluxio
 
Unify Data at Memory Speed
Unify Data at Memory SpeedUnify Data at Memory Speed
Unify Data at Memory Speed
 

Plus de Alluxio, Inc.

Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Optimizing Data Access for Analytics And AI with Alluxio
Optimizing Data Access for Analytics And AI with AlluxioOptimizing Data Access for Analytics And AI with Alluxio
Optimizing Data Access for Analytics And AI with AlluxioAlluxio, Inc.
 
Speed Up Presto at Uber with Alluxio Caching
Speed Up Presto at Uber with Alluxio CachingSpeed Up Presto at Uber with Alluxio Caching
Speed Up Presto at Uber with Alluxio CachingAlluxio, Inc.
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLBig Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLAlluxio, Inc.
 
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...Alluxio, Inc.
 
Alluxio Monthly Webinar | Five Disruptive Trends that Every Data & AI Leader...
Alluxio Monthly Webinar | Five Disruptive Trends that Every  Data & AI Leader...Alluxio Monthly Webinar | Five Disruptive Trends that Every  Data & AI Leader...
Alluxio Monthly Webinar | Five Disruptive Trends that Every Data & AI Leader...Alluxio, Inc.
 
Data Infra Meetup | FIFO Queues are All You Need for Cache Eviction
Data Infra Meetup | FIFO Queues are All You Need for Cache EvictionData Infra Meetup | FIFO Queues are All You Need for Cache Eviction
Data Infra Meetup | FIFO Queues are All You Need for Cache EvictionAlluxio, Inc.
 
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio Edge
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio EdgeData Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio Edge
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio EdgeAlluxio, Inc.
 
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the CloudData Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the CloudAlluxio, Inc.
 
Data Infra Meetup | ByteDance's Native Parquet Reader
Data Infra Meetup | ByteDance's Native Parquet ReaderData Infra Meetup | ByteDance's Native Parquet Reader
Data Infra Meetup | ByteDance's Native Parquet ReaderAlluxio, Inc.
 
Data Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage EvolutionData Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage EvolutionAlluxio, Inc.
 
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...Alluxio, Inc.
 
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...Alluxio, Inc.
 
AI Infra Day | The AI Infra in the Generative AI Era
AI Infra Day | The AI Infra in the Generative AI EraAI Infra Day | The AI Infra in the Generative AI Era
AI Infra Day | The AI Infra in the Generative AI EraAlluxio, Inc.
 
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...Alluxio, Inc.
 
AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Up...
AI Infra Day | The Generative AI Market  And Intel AI Strategy and Product Up...AI Infra Day | The Generative AI Market  And Intel AI Strategy and Product Up...
AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Up...Alluxio, Inc.
 
AI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
AI Infra Day | Composable PyTorch Distributed with PT2 @ MetaAI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
AI Infra Day | Composable PyTorch Distributed with PT2 @ MetaAlluxio, Inc.
 
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber ScaleAI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber ScaleAlluxio, Inc.
 
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWSAlluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWSAlluxio, Inc.
 

Plus de Alluxio, Inc. (20)

Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Optimizing Data Access for Analytics And AI with Alluxio
Optimizing Data Access for Analytics And AI with AlluxioOptimizing Data Access for Analytics And AI with Alluxio
Optimizing Data Access for Analytics And AI with Alluxio
 
Speed Up Presto at Uber with Alluxio Caching
Speed Up Presto at Uber with Alluxio CachingSpeed Up Presto at Uber with Alluxio Caching
Speed Up Presto at Uber with Alluxio Caching
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLBig Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
 
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...
Alluxio Monthly Webinar | Why a Multi-Cloud Strategy Matters for Your AI Plat...
 
Alluxio Monthly Webinar | Five Disruptive Trends that Every Data & AI Leader...
Alluxio Monthly Webinar | Five Disruptive Trends that Every  Data & AI Leader...Alluxio Monthly Webinar | Five Disruptive Trends that Every  Data & AI Leader...
Alluxio Monthly Webinar | Five Disruptive Trends that Every Data & AI Leader...
 
Data Infra Meetup | FIFO Queues are All You Need for Cache Eviction
Data Infra Meetup | FIFO Queues are All You Need for Cache EvictionData Infra Meetup | FIFO Queues are All You Need for Cache Eviction
Data Infra Meetup | FIFO Queues are All You Need for Cache Eviction
 
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio Edge
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio EdgeData Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio Edge
Data Infra Meetup | Accelerate Your Trino/Presto Queries - Gain the Alluxio Edge
 
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the CloudData Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud
Data Infra Meetup | Accelerate Distributed PyTorch/Ray Workloads in the Cloud
 
Data Infra Meetup | ByteDance's Native Parquet Reader
Data Infra Meetup | ByteDance's Native Parquet ReaderData Infra Meetup | ByteDance's Native Parquet Reader
Data Infra Meetup | ByteDance's Native Parquet Reader
 
Data Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage EvolutionData Infra Meetup | Uber's Data Storage Evolution
Data Infra Meetup | Uber's Data Storage Evolution
 
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...
Alluxio Monthly Webinar | Why NFS/NAS on Object Storage May Not Solve Your AI...
 
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...
AI Infra Day | Accelerate Your Model Training and Serving with Distributed Ca...
 
AI Infra Day | The AI Infra in the Generative AI Era
AI Infra Day | The AI Infra in the Generative AI EraAI Infra Day | The AI Infra in the Generative AI Era
AI Infra Day | The AI Infra in the Generative AI Era
 
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
AI Infra Day | Hands-on Lab: CV Model Training with PyTorch & Alluxio on Kube...
 
AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Up...
AI Infra Day | The Generative AI Market  And Intel AI Strategy and Product Up...AI Infra Day | The Generative AI Market  And Intel AI Strategy and Product Up...
AI Infra Day | The Generative AI Market And Intel AI Strategy and Product Up...
 
AI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
AI Infra Day | Composable PyTorch Distributed with PT2 @ MetaAI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
AI Infra Day | Composable PyTorch Distributed with PT2 @ Meta
 
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber ScaleAI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
AI Infra Day | Model Lifecycle Management Quality Assurance at Uber Scale
 
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWSAlluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
Alluxio Monthly Webinar | Efficient Data Loading for Model Training on AWS
 

Dernier

How to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfHow to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfLivetecs LLC
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 
Best Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdfBest Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdfIdiosysTechnologies1
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 

Dernier (20)

How to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfHow to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdf
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
Best Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdfBest Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdf
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 

Alluxio (formerly Tachyon): The Journey thus far and the Road Ahead

  • 1. Alluxio (formerly Tachyon): The Journey thus far and the Road Ahead September 2016 @ Strata & Hadoop World 2016 Haoyuan (HY) Li, Gene Pang
  • 2. AGENDA 2 •  Alluxio Open Source Status and History •  Alluxio Overview •  Alluxio Use Cases and Demos •  What’s Next?
  • 3. HISTORY 3 •  Started at UC Berkeley AMPLab In Summer 2012 •  Original named as Tachyon •  Open Sourced in 2013 •  Apache License 2.0 •  Latest Stable Release: Alluxio 1.2.0 •  Next Release (Alluxio 1.3.0) In Two Weeks •  Rebranded as Alluxio in 2016
  • 4. 0 50 100 150 200 250 300 350 Year 1 Year 3Year 2 4 OPEN SOURCE ALLUXIO •  One of the tastest growing open- source projects in the big data ecosystem •  Currently over 300 contributors from over 100 organizations •  Welcome to join our community! Popular Open Source Projects’ Growth Spark Kafka Cassandra HDFS Alluxio
  • 5. About Us 5 •  Team members from Google, Palantir, Uber, Yahoo with years of distributed systems development experience •  Graduated from Stanford University, UC Berkeley, CMU, Peking University, and Tsinghua, with CS masters or PhDs •  Top 9 committers of the Alluxio open source project Team Haoyuan Li, CEO & Founder Co-creator of Alluxio project while working towards Ph.D. at UC Berkeley AMPLab. Gene Pang, Software Engineer, Alluxio Maintainer Ph.D. from UC Berkeley AMPLab Previously at Google F1 team •  Andreessen Horowitz Investors
  • 6. BIG DATA ECOSYSTEM TODAY BIG DATA ECOSYSTEM WITH ALLUXIO 6 BIG DATA ECOSYSTEM YESTERDAY … … FUSE Compatible File System Hadoop Compatible File System Native Key-Value Interface Native File System Enabling any application to access data from any storage system at memory-speed BIG DATA ECOSYSTEM ISSUES GlusterFS Interface Amazon S3 Interface Swift Interface HDFS Interface
  • 7. •  Memory is getting Faster, Larger, and Cheaper •  Memory price as halving every 18 months •  Disk throughput increasing slowly 7 TECHNOLOGY TRENDS Top left chart: https://lazure2.wordpress.com/ 2013/07/02/20-years-of-samsung-new- management-as-manifested-by-the-latest- june-20th-galaxy-ativ-innovations/ Top right chart: people.eecs.berkeley.edu/~istoica/classes/ cs294/ 15/notes/02-TechnologyTrends.ppt Bottom chart: jcmit.com/ 6.25 12.5 25 18.75 31.25 43.75 37.5 50 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 DDR performance over time GBs/second DDR2 DDR4 DDR3
  • 8. File System API Software Only 8 ATTRIBUTES Memory-Speed Virtual Distributed Storage Scale out architecture Virtualized across different storage types under a unified namespace Memory-speed access to data
  • 9. Server A Applications Server B Applications Server Z Applications Server C Applications Alluxio Alluxio Alluxio Alluxio 9 ALLUXIO SOLUTION DEPLOYMENT Storage B Storage C Storage Z Storage A
  • 10. 10 BENEFITS Unification New workflows across any data in any storage system Performance High performance data access Flexibility Work with the compute and storage frameworks of your choice Cost Grow compute and storage systems independently
  • 11. USE CASE 1 – Accelerate I/O to/from Remote Storage 11 •  Compute and Storage Separation •  Advantages •  Meet different compute and storage hardware requirements efficiently •  Scale compute and storage independently •  Store data in Traditional filers/SANs and object stores cost effectively •  Compute on data in existing storage via Big Data Computational frameworks •  Disadvantage •  Accessing data requires remote I/O
  • 12. Use Case without Alluxio 12 Spark Storage Low latency, memory throughput High latency, network throughput
  • 13. Use Case with Alluxio 13 Spark Storage Alluxio Keeping data in Alluxio accelerates data access
  • 14. 14 CASE STUDY Baidu File System The performance was amazing. With Spark SQL alone, it took 100-150 seconds to finish a query; using Alluxio, where data may hit local or remote Alluxio nodes, it took 10-15 seconds. - Shaoshan Liu, Baidu RESULTS •  Data queries are now 30x faster with Alluxio •  Alluxio cluster run stably, providing over 50TB of RAM space •  By using Alluxio, batch queries usually lasting over 15 minutes were transformed into an interactive query taking less than 30 seconds Accelerate Access to Remote Storage •  200+ nodes deployment •  2+ petabytes of storage •  Mix of memory + HDD
  • 15. USE CASE 2 – Share Data Across Jobs at Memory Speed 15 •  Architectures Requiring Shared Data •  Pipelines: output of one job is input of the next job •  Different applications, jobs, or contexts read the same data •  Disadvantage •  Sharing data requires I/O
  • 16. Use Case without Alluxio 16 Spark Storage MapReduce Spark Network I/O Disk I/O I/O slows down sharing
  • 17. Use Case with Alluxio 17 Spark Storage MapReduce Spark Sharing data in Alluxio Alluxio
  • 18. 18 CASE STUDY Thanks to Alluxio, we now have the raw data immediately available at every iteration and we can skip the costs of loading in terms of time waiting, network traffic, and RDBMS activity. - Henry Powell, Barclays RESULTS •  Barclays workflow iteration time decreased from hours to seconds •  Alluxio enabled workflows that were impossible before •  By keeping data only in memory, the I/O cost of loading and storing in Alluxio is now on the order of seconds Relational Database Share Data Across Jobs at Memory-Speed •  6 node deployment •  1TB of storage •  Memory only
  • 20. USE CASE 3 - Transparently Manage Data Across Storage Systems 20 •  Reasons •  Most enterprises have multiple storage systems •  New (better, faster, cheaper) storage systems arise •  Disadvantage •  Managing data across systems can be difficult
  • 21. Use Case Explained 21 Storage Alluxio Spark MapReduce Spark Storage Storage Flexible, simple no application changes, new mount point
  • 22. 22 CASE STUDY We’ve been running Alluxio in production for over 9 months, resulting in 15x speedup on average, and 300x speedup at peak service times. - Xueyan Li, Qunar RESULTS •  Alluxio’s unified namespace enables different applications and frameworks to easily interact with their data from different storage systems •  Improved the performance of their system with 15x – 300x speedups •  Tiered storage feature manages various storage resources including memory, SSD and disk Transparently Manage Data Across Different Storage Systems •  200+ nodes deployment •  6 billion logs (4.5 TB) daily •  Mix of Memory + HDD
  • 23. USE CASE 4 - Compute on Data in Different Storage with Compliance Requirements © 2016 Alluxio 23 •  Motivation •  Compliance with local laws restricts data storage location •  Global Analytics on this data is not possible
  • 24. Use Case Explained © 2016 Alluxio 24 Storage Alluxio Spark MapReduce Spark Storage Storage Flexible, simple no application changes, new mount point
  • 25. 25 CASE STUDY RESULTS •  Alluxio’s unified namespace enables any compute cluster accessing data from storage systems at different data centers •  Enables global analytics which was earlier not possible •  No local persistent storage of data Compute on Data in Different Storage with Compliance Requirement •  500+ nodes deployment •  Memory + SSD A Global Fortune 500 Enterprise
  • 28. •  Contact: {haoyuan, gene}@alluxio.com or info@alluxio.com •  Twitter: @Alluxio •  Websites: www.alluxio.com and www.alluxio.org Thank you!