A Virtual Distributed Storage System // Haoyuan Li, Alluxio [FirstMark's Data Driven]

Haoyuan Li, Founder and CEO at Alluxio, presented at FirstMark's Data Driven NYC on April 11, 2016. Li discussed the challenges of accessing data from any storage at memory speed.

Alluxio's memory-centric distributed storage system bridges applications and underlying storage systems, providing unified data access orders of magnitude faster than existing solutions.

Data Driven NYC is a monthly event covering Big Data and data-driven products and startups, hosted by Matt Turck, partner at FirstMark.

FirstMark is an early-stage venture capital firm based in New York City. Find out more about Data Driven NYC at http://datadrivennyc.com and FirstMark Capital at http://firstmarkcap.com.



  1. Alluxio (formerly Tachyon), A Memory Speed Virtual Distributed Storage
     Haoyuan (HY) Li, CEO @ Alluxio Inc.
     April 11, 2016
  2. Alluxio Inc
     • Founded by Alluxio (formerly Tachyon) open source project creators and top committers
     • $7.5 million Series A by Andreessen Horowitz
     • Committed to the Alluxio Open Source Project
     • Company Website: www.alluxio.com
     • We are hiring!
  3. Who Am I?
     • Haoyuan Li
       – Co-creator of Alluxio (formerly Tachyon)
       – CEO @ Alluxio Inc
       – Ph.D. Candidate @ AMPLab, UC Berkeley
       – Founding Committer of Apache Spark
  4. Outline
     • What is Alluxio
     • Why Alluxio
     • Alluxio Use Cases
  5. Alluxio: Open Source Memory Speed Virtual Distributed Storage
  6. Memory Speed
     • Memory-centric architecture designed for memory I/O
     Virtual
     • Unified Namespace abstracts persistent storage from applications
     Distributed
     • Designed to scale with nothing but commodity hardware
     Open Source
     • One of the fastest growing project communities
  7. Background
     • Started at UC Berkeley AMPLab
       – From summer 2012
       – The same lab produced Apache Mesos and Apache Spark
     • Open sourced
       – April 2013
       – Apache License 2.0
       – Latest Release: Version 1.0.1 (February 2016)
     • Deployed at > 100 companies
  8. Contributor Growth
     • Close to 250 Contributors
       – 3x growth over the last year
  9. Organizations
     • Over 50 Organizations
  10. What is Alluxio
     Memory Speed Virtual Distributed Storage System
  11. Why Alluxio?
  12. Performance Trend: Memory is Fast
     • RAM throughput increasing exponentially
     • Disk throughput increasing slowly
     Memory-locality key to interactive response times
  13. Price Trend: Memory is Cheaper
     source: jcmit.com
  14. Realized by many…
  15. Is the Problem Solved?
  16. Missing a Solution for the Storage Layer
  17. Take a Look at the Ecosystem
  18. Big Data Ecosystem
  19. Big Data Ecosystem
  20. Problems
     • Costly Ecosystem Integration
     • Costly ETL
     • Expensive Data Duplication
     • Data Silos
     • Nightmare Data Management
     • Long Cycle from Data to Value
  21. Ecosystem
  22. Alluxio: Any Application accesses Any Data from Any Storage at Memory Speed
     • Enable new workloads across storage systems
     • Work with the framework of your choice
     • Scale storage and compute independently
  23. Alluxio Powers Up Your Workloads, Both in the Cloud and On Premises
  24. Alluxio Case Study
     • Framework: Spark
     • Under Storage: Baidu’s File System
     • Storage Media: MEM + HDD
     • 200+ node deployment
     • 2PB+ managed space
  25. Alluxio Case Study
     • Framework: Spark
     • Storage Media: MEM
     • Improvement from Hours to Seconds
  26. Use Case: Qunar [NASDAQ:QUNR]
     • Framework: Spark Streaming & Batch
     • Under Storage: HDFS & Ceph
     • Storage Media: MEM + HDD
     • 200 node deployment
  27. Use Case: an Oil Company
     • Framework: Spark
     • Under Storage: GlusterFS
     • Storage Media: MEM only
     • Analyzing data in traditional storage
  28. Use Case: a SaaS Company
     • Framework: Impala
     • Under Storage: S3
     • Storage Media: MEM + SSD
     • 15x Performance Improvement
  29. Use Case: a Biotechnology Company
     • Framework: Spark & MapReduce
     • Under Storage: GlusterFS
     • Storage Media: MEM and SSD
  30. Use Case: a SaaS Company
     • Framework: Spark
     • Under Storage: S3
     • Storage Media: SSD only
     • Elastic Alluxio deployment
  31. Use Case: a Retail Company
     • Framework: Spark & MapReduce
     • Under Storage: HDFS
     • Storage Media: MEM
  32. Links
     • Alluxio Project: www.alluxio.org
     • Alluxio Inc: www.alluxio.com
     • Development: www.github.com/Alluxio/alluxio
     • Meet Friends: www.meetup.com/Alluxio
     • Contact: haoyuan@alluxio.com
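
Slide 6 describes the "Virtual" property as a unified namespace that abstracts persistent storage from applications. As a rough illustration only, that idea can be modeled as a mount table mapping paths in one logical namespace onto different under-storage systems. This is a toy sketch, not Alluxio's actual API: the `MountTable` class, its method names, and the host and bucket names below are all hypothetical.

```python
from typing import Dict, Optional


class MountTable:
    """Toy unified namespace: logical path prefix -> under-storage URI (hypothetical)."""

    def __init__(self) -> None:
        self._mounts: Dict[str, str] = {}

    def mount(self, logical_prefix: str, under_storage_uri: str) -> None:
        # Normalize trailing slashes so "/data/" and "/data" name the same mount point.
        prefix = logical_prefix.rstrip("/") or "/"
        self._mounts[prefix] = under_storage_uri.rstrip("/")

    def resolve(self, logical_path: str) -> str:
        # Longest-prefix match, so a nested mount wins over its parent.
        best: Optional[str] = None
        for prefix in self._mounts:
            covers = (
                logical_path == prefix
                or logical_path.startswith(prefix.rstrip("/") + "/")
            )
            if covers and (best is None or len(prefix) > len(best)):
                best = prefix
        if best is None:
            raise KeyError(f"no mount covers {logical_path}")
        # For the root mount the whole path is the suffix; otherwise strip the prefix.
        suffix = logical_path if best == "/" else logical_path[len(best):]
        return self._mounts[best] + suffix


# Hypothetical deployment: HDFS at the root, an S3 bucket mounted under /warehouse.
table = MountTable()
table.mount("/", "hdfs://namenode:9000/")
table.mount("/warehouse", "s3://example-bucket/warehouse")

print(table.resolve("/logs/app.log"))
print(table.resolve("/warehouse/sales/part-0"))
```

An application in this model only ever sees logical paths such as `/warehouse/sales/part-0`. Which storage system actually holds the bytes is decided by the mount table, which is the sense in which the slide says storage is abstracted from applications.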
