SlideShare une entreprise Scribd logo
1  sur  56
Vendor Speech by  Raghu Ramakrishnan Cloud Computing Conference & Expo November 3, 2009   Key Challenges in Cloud Computing  … and the Yahoo! Approach
Key Challenges ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Inside Yahoo!’s Cloud KEY CHALLENGES
DATA MANAGEMENT  IN THE CLOUD
Help! ,[object Object],DB2 UDB Sherpa
What Are You Trying to Do? Data Workloads OLTP (Random access to a few records) OLAP (Scan access to a large number of records) Read-heavy Write-heavy By rows By columns Unstructured Combined (Some OLTP and OLAP tasks)
Yahoo! Solution Space OLTP (Random access to a few records) OLAP (Scan access to a large number of records) Read-heavy Write-heavy By rows By columns Unstructured Combined (Some OLTP and OLAP tasks) UDS UDB ??? STCache Sherpa Read, read/write Write-heavy Main-memory SQL on Grid Zebra Pig HDFS MapReduce
Storage ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Yahoo! Storage Problem ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Typical Applications ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
VLSD Data Serving Stores ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
The CAP Theorem ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Approaches to CAP ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
One Slide Hadoop Primer HDFS Data file Map tasks HDFS Reduce tasks Good for analyzing (scanning) huge files Not great for serving (reading or writing individual objects)
Out There in the World ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],HadoopDB ??? * Memcached provides no native query processing DB2
Ways of Using Hadoop Data workloads OLAP (Scan access to a large number of records) By rows By columns Unstructured HadoopDB SQL on Grid Zebra
Hadoop-Based Applications @ Yahoo! 2008 2009 Webmap ~70 hours runtime ~300 TB shuffling ~200 TB output ~73 hours runtime ~490 TB shuffling ~280 TB output +55% hardware Terasort 209 seconds 1 Terabyte sorted 900 nodes 62 seconds   1 Terabyte, 1500 nodes 16.25 hours   1 Petabyte, 3700 nodes Largest cluster ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Batch Storage and Processing ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
SHERPA * Yahoo!’s Serving Store * The system also known as PNUTS
What is Sherpa? CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … ) Parallel database Geographic replication Structured, flexible schemas Hashed and ordered tables Hosted, managed infrastructure E  75656  C A  42342  E B  42521  W C  66354  W D  12352  E F  15677  E E  75656  C A  42342  E B  42521  W C  66354  W D  12352  E F  15677  E A  42342  E B  42521  W C  66354  W D  12352  E E  75656  C F  15677  E
Key Components ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Architecture Storage units Routers Tablet Controller REST API Clients Local region Remote  regions Tribble
Updates Write key k Sequence # for key k Sequence # for key k Write key k SUCCESS Write key k Routers Message brokers 1 2 Write key k 7 8 SU SU SU 3 4 5 6
Accessing Data SU SU SU Get key k 1 2 Get key k 3 Record for key k 4 Record for key k
ASYNCHRONOUS REPLICATION AND CONSISTENCY
Asynchronous Replication
Consistency Model ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Example: Social Alice West East ___ Busy Free Free Record Timeline (Network fault,  updt goes to East) (Alice logs on) User Status Alice Busy User Status Alice Free User Status Alice ??? User Status Alice ??? User Status Alice Busy User Status Alice ___
Sherpa Consistency Model Time v. 1 v. 2 v. 3 v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Write Current version Stale version Stale version Achieved via per-record primary copy protocol (To maximize availability, record masterships automaticlly  transferred if site fails) Can be selectively weakened to eventual consistency  (local writes that are reconciled using version vectors)
Sherpa Consistency Model Time v. 1 v. 2 v. 3 v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Current version Stale version Stale version Read In general, reads are served using a local copy
Sherpa Consistency Model Time v. 1 v. 2 v. 3 v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Read up-to-date Current version Stale version Stale version But application can request and get current version
OPERABILITY
Distribution Distribution for parallelism Data shuffling for load balancing Server 1 Server 2 Server 3 Server 4 Bike $86 6/2/07 636353 Chair $10 6/5/07 662113 Couch $570 6/1/07 424252 Car $1123 6/1/07 256623 Lamp $19 6/7/07 121113 Bike $56 6/9/07 887734 Scooter $18 6/11/07 252111 Hammer $8000 6/11/07 116458
Tablet Splitting and Balancing Each storage unit has many tablets (horizontal partitions of the table) Tablets may grow over time Overfull tablets split Storage unit may become a hotspot Shed load by moving tablets to other servers Storage unit Tablet
Mastering A  42342  E B  42521  W C  66354  W D  12352  E E  75656  C F  15677  E A  42342  E B  42521  W C  66354  W D  12352  E E  75656  C F  15677  E A  42342  E B  42521  W C  66354  W D  12352  E E  75656  C F  15677  E A  42342  E B  42521  E C  66354  W D  12352  E E  75656  C F  15677  E C  66354  W B  42521  E A  42342  E D  12352  E E  75656  C F  15677  E
Record vs. Tablet Master A  42342  E B  42521  W C  66354  W D  12352  E E  75656  C F  15677  E Record master serializes updates Tablet master serializes inserts A  42342  E B  42521  W C  66354  W D  12352  E E  75656  C F  15677  E A  42342  E B  42521  W C  66354  W D  12352  E E  75656  C F  15677  E A  42342  E B  42521  E C  66354  W D  12352  E E  75656  C F  15677  E C  66354  W B  42521  E A  42342  E D  12352  E E  75656  C F  15677  E
Coping With Failures A  42342  E B  42521  W C  66354  W D  12352  E E  75656  C F  15677  E A  42342  E B  42521  W C  66354  W D  12352  E E  75656  C F  15677  E A  42342  E B  42521  W C  66354  W D  12352  E E  75656  C F  15677  E X X OVERRIDE W -> E
COMPARING SYSTEMS
Example ,[object Object],[object Object],Data workloads Read-heavy *but: occasional writes UserID Fname Lname Country Sex Birthday MailLoc Avatar Passwd … bradtm Brad McMillen USA M 7/1/74 /m2/u2 guy3.gif jtr33KKa … cooperb Brian Cooper USA M 6/29/75 /m1/u4 guy2.gif Ffa3rrw1 … sangeeta Sangeetha Seshadri India F 8/13/82 /m3/u8 lady9.gif 3uiU$422 … … … … … … … … … … … OLTP (Random access to a few records)
Which System is Better? ,[object Object],[object Object],[object Object],[object Object],[object Object],UDB ,[object Object],[object Object],[object Object],[object Object],Sherpa
Why System X? ,[object Object],[object Object],[object Object],MySQL  node Disk RAM Read Buffer Page B-Tree index Disk Page Disk Page Disk Page Disk Page Disk Page Disk Page Disk Page Disk Page Disk Page
Why Not System Y? ,[object Object],[object Object],[object Object],[object Object],Cassandra node Disk RAM Log SSTable file Memtable Update (later) SSTable file SSTable file
Comparison Overview ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Preliminary Results Sherpa
YCS Benchmark ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],YCSB client DB client Client threads Stats Scenario executor Cloud DB Extensible: plug in new clients Extensible: define new scenarios
Benchmark Tiers ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],If you’re interested in this, please contact me or cooperb@yahoo-inc.com
EDGE, CLOUD SERVING,  AND MORE
Cloud Serving: The Integrated Cloud Big idea:  Declarative language for specifying the  full, end-to-end structure  of a service Key insight:  this “full, end-to-end structure” includes multiple environments Central mechanism:  the  Integrated Cloud ,  which (re)deploys these specifications  ( Surendra Reddy, 2:20 pm, Session 7) Development, multiple testing environments, alpha and bucket-testing environments, production
Foundational Components ,[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],OpenCAP:  Open Cloud Access Protocol
Edge Services ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
YCS Scale YCS handles so much Yahoo traffic, this is noise!
YQL ,[object Object],[object Object],[object Object],[object Object]
YQL ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],(Jonathan Trevor, Session 3, 9:10 am)
New in 2010! ,[object Object],[object Object],[object Object],[object Object],[object Object]
TUESDAY, 11/3 9:10 am – 9:55 am Yahoo! Query Language - YQL: Select * from Internet [Including Live Demo!] Dr. Jonathan Trevor  Senior Architect TUESDAY, 11/3 2:20 pm – 3:05 pm Walking through Cloud  Serving at Yahoo! Surendra Reddy VP, Integrated Cloud and Visualization TUESDAY, 11/3 4:50pm – 5:35 pm Hadoop @ Yahoo! – Internet Scale Data Processing Eric Baldeschwieler VP, Hadoop Software Development WEDNESDAY, 11/4 9:10 am - 9:55 am Yahoo! Scalable Storage and Delivery Services Chuck Neerdaels VP, Storage and Edge Services VISIT BOOTH #103  TO TALK WITH YAHOO! ENGINEERS AND LEARN MORE ABOUT YAHOO!’S VISION FOR CLOUD COMPUTING.

Contenu connexe

Tendances

Alluxio Data Orchestration Platform for the Cloud
Alluxio Data Orchestration Platform for the CloudAlluxio Data Orchestration Platform for the Cloud
Alluxio Data Orchestration Platform for the CloudShubham Tagra
 
Scylla Summit 2022: How ScyllaDB Powers This Next Tech Cycle
Scylla Summit 2022: How ScyllaDB Powers This Next Tech CycleScylla Summit 2022: How ScyllaDB Powers This Next Tech Cycle
Scylla Summit 2022: How ScyllaDB Powers This Next Tech CycleScyllaDB
 
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3GoHigh Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3GoAlluxio, Inc.
 
Presto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 BostonPresto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 Bostonkbajda
 
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDBScylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDBScyllaDB
 
Alluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the CloudAlluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the CloudAlluxio, Inc.
 
What’s new in Alluxio 2: from seamless operations to structured data management
What’s new in Alluxio 2: from seamless operations to structured data managementWhat’s new in Alluxio 2: from seamless operations to structured data management
What’s new in Alluxio 2: from seamless operations to structured data managementAlluxio, Inc.
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)Ryan Blue
 
Enabling Presto Caching at Uber with Alluxio
Enabling Presto Caching at Uber with AlluxioEnabling Presto Caching at Uber with Alluxio
Enabling Presto Caching at Uber with AlluxioAlluxio, Inc.
 
Presto: SQL-on-Anything. Netherlands Hadoop User Group Meetup
Presto: SQL-on-Anything. Netherlands Hadoop User Group MeetupPresto: SQL-on-Anything. Netherlands Hadoop User Group Meetup
Presto: SQL-on-Anything. Netherlands Hadoop User Group MeetupWojciech Biela
 
Challenges in Building a Data Pipeline
Challenges in Building a Data PipelineChallenges in Building a Data Pipeline
Challenges in Building a Data PipelineManish Kumar
 
Presto Summit 2018 - 02 - LinkedIn
Presto Summit 2018  - 02 - LinkedInPresto Summit 2018  - 02 - LinkedIn
Presto Summit 2018 - 02 - LinkedInkbajda
 
Introduction to Data Engineer and Data Pipeline at Credit OK
Introduction to Data Engineer and Data Pipeline at Credit OKIntroduction to Data Engineer and Data Pipeline at Credit OK
Introduction to Data Engineer and Data Pipeline at Credit OKKriangkrai Chaonithi
 
Presto Summit 2018 - 10 - Qubole
Presto Summit 2018  - 10 - QubolePresto Summit 2018  - 10 - Qubole
Presto Summit 2018 - 10 - Qubolekbajda
 
Presto@Netflix Presto Meetup 03-19-15
Presto@Netflix Presto Meetup 03-19-15Presto@Netflix Presto Meetup 03-19-15
Presto@Netflix Presto Meetup 03-19-15Zhenxiao Luo
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesDatabricks
 
The Data Lake Engine Data Microservices in Spark using Apache Arrow Flight
The Data Lake Engine Data Microservices in Spark using Apache Arrow FlightThe Data Lake Engine Data Microservices in Spark using Apache Arrow Flight
The Data Lake Engine Data Microservices in Spark using Apache Arrow FlightDatabricks
 
Presto Summit 2018 - 07 - Lyft
Presto Summit 2018 - 07 - LyftPresto Summit 2018 - 07 - Lyft
Presto Summit 2018 - 07 - Lyftkbajda
 

Tendances (20)

Alluxio Data Orchestration Platform for the Cloud
Alluxio Data Orchestration Platform for the CloudAlluxio Data Orchestration Platform for the Cloud
Alluxio Data Orchestration Platform for the Cloud
 
Scylla Summit 2022: How ScyllaDB Powers This Next Tech Cycle
Scylla Summit 2022: How ScyllaDB Powers This Next Tech CycleScylla Summit 2022: How ScyllaDB Powers This Next Tech Cycle
Scylla Summit 2022: How ScyllaDB Powers This Next Tech Cycle
 
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3GoHigh Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
 
Presto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 BostonPresto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 Boston
 
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDBScylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB
 
Alluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the CloudAlluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the Cloud
 
What’s new in Alluxio 2: from seamless operations to structured data management
What’s new in Alluxio 2: from seamless operations to structured data managementWhat’s new in Alluxio 2: from seamless operations to structured data management
What’s new in Alluxio 2: from seamless operations to structured data management
 
Spark Core
Spark CoreSpark Core
Spark Core
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)
 
Enabling Presto Caching at Uber with Alluxio
Enabling Presto Caching at Uber with AlluxioEnabling Presto Caching at Uber with Alluxio
Enabling Presto Caching at Uber with Alluxio
 
Presto: SQL-on-Anything. Netherlands Hadoop User Group Meetup
Presto: SQL-on-Anything. Netherlands Hadoop User Group MeetupPresto: SQL-on-Anything. Netherlands Hadoop User Group Meetup
Presto: SQL-on-Anything. Netherlands Hadoop User Group Meetup
 
Challenges in Building a Data Pipeline
Challenges in Building a Data PipelineChallenges in Building a Data Pipeline
Challenges in Building a Data Pipeline
 
Presto Summit 2018 - 02 - LinkedIn
Presto Summit 2018  - 02 - LinkedInPresto Summit 2018  - 02 - LinkedIn
Presto Summit 2018 - 02 - LinkedIn
 
Introduction to Data Engineer and Data Pipeline at Credit OK
Introduction to Data Engineer and Data Pipeline at Credit OKIntroduction to Data Engineer and Data Pipeline at Credit OK
Introduction to Data Engineer and Data Pipeline at Credit OK
 
Presto Summit 2018 - 10 - Qubole
Presto Summit 2018  - 10 - QubolePresto Summit 2018  - 10 - Qubole
Presto Summit 2018 - 10 - Qubole
 
Presto@Netflix Presto Meetup 03-19-15
Presto@Netflix Presto Meetup 03-19-15Presto@Netflix Presto Meetup 03-19-15
Presto@Netflix Presto Meetup 03-19-15
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
The Data Lake Engine Data Microservices in Spark using Apache Arrow Flight
The Data Lake Engine Data Microservices in Spark using Apache Arrow FlightThe Data Lake Engine Data Microservices in Spark using Apache Arrow Flight
The Data Lake Engine Data Microservices in Spark using Apache Arrow Flight
 
Presto Summit 2018 - 07 - Lyft
Presto Summit 2018 - 07 - LyftPresto Summit 2018 - 07 - Lyft
Presto Summit 2018 - 07 - Lyft
 

Similaire à Key Challenges in Cloud Computing and How Yahoo! is Approaching Them

Clustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And AvailabilityClustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And AvailabilityConSanFrancisco123
 
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...Alluxio, Inc.
 
Near Real time Indexing Kafka Messages to Apache Blur using Spark Streaming
Near Real time Indexing Kafka Messages to Apache Blur using Spark StreamingNear Real time Indexing Kafka Messages to Apache Blur using Spark Streaming
Near Real time Indexing Kafka Messages to Apache Blur using Spark StreamingDibyendu Bhattacharya
 
Sql server 2016 it just runs faster sql bits 2017 edition
Sql server 2016 it just runs faster   sql bits 2017 editionSql server 2016 it just runs faster   sql bits 2017 edition
Sql server 2016 it just runs faster sql bits 2017 editionBob Ward
 
Azure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep DiveAzure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep DiveAndre Essing
 
Avoiding Chaos: Methodology for Managing Performance in a Shared Storage A...
Avoiding Chaos:  Methodology for Managing Performance in a Shared Storage A...Avoiding Chaos:  Methodology for Managing Performance in a Shared Storage A...
Avoiding Chaos: Methodology for Managing Performance in a Shared Storage A...brettallison
 
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYCScalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYCCal Henderson
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Uwe Printz
 
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...Виталий Стародубцев
 
Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf
 
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaA noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaData Con LA
 
Intuitions for scaling data centric architectures - Benjamin Stopford
Intuitions for scaling data centric architectures - Benjamin StopfordIntuitions for scaling data centric architectures - Benjamin Stopford
Intuitions for scaling data centric architectures - Benjamin StopfordJAXLondon_Conference
 
Bhupeshbansal bigdata
Bhupeshbansal bigdata Bhupeshbansal bigdata
Bhupeshbansal bigdata Bhupesh Bansal
 
Replicate from Oracle to data warehouses and analytics
Replicate from Oracle to data warehouses and analyticsReplicate from Oracle to data warehouses and analytics
Replicate from Oracle to data warehouses and analyticsContinuent
 
Deep Dive on the Amazon Aurora MySQL-compatible Edition - DAT301 - re:Invent ...
Deep Dive on the Amazon Aurora MySQL-compatible Edition - DAT301 - re:Invent ...Deep Dive on the Amazon Aurora MySQL-compatible Edition - DAT301 - re:Invent ...
Deep Dive on the Amazon Aurora MySQL-compatible Edition - DAT301 - re:Invent ...Amazon Web Services
 
Amazon Aurora TechConnect
Amazon Aurora TechConnect Amazon Aurora TechConnect
Amazon Aurora TechConnect LavanyaMurthy9
 

Similaire à Key Challenges in Cloud Computing and How Yahoo! is Approaching Them (20)

The Hadoop Ecosystem
The Hadoop EcosystemThe Hadoop Ecosystem
The Hadoop Ecosystem
 
Clustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And AvailabilityClustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And Availability
 
IO Dubi Lebel
IO Dubi LebelIO Dubi Lebel
IO Dubi Lebel
 
Deep Dive on Amazon Aurora
Deep Dive on Amazon AuroraDeep Dive on Amazon Aurora
Deep Dive on Amazon Aurora
 
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...Building a high-performance data lake analytics engine at Alibaba Cloud with ...
Building a high-performance data lake analytics engine at Alibaba Cloud with ...
 
Near Real time Indexing Kafka Messages to Apache Blur using Spark Streaming
Near Real time Indexing Kafka Messages to Apache Blur using Spark StreamingNear Real time Indexing Kafka Messages to Apache Blur using Spark Streaming
Near Real time Indexing Kafka Messages to Apache Blur using Spark Streaming
 
Sql server 2016 it just runs faster sql bits 2017 edition
Sql server 2016 it just runs faster   sql bits 2017 editionSql server 2016 it just runs faster   sql bits 2017 edition
Sql server 2016 it just runs faster sql bits 2017 edition
 
Azure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep DiveAzure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep Dive
 
Avoiding Chaos: Methodology for Managing Performance in a Shared Storage A...
Avoiding Chaos:  Methodology for Managing Performance in a Shared Storage A...Avoiding Chaos:  Methodology for Managing Performance in a Shared Storage A...
Avoiding Chaos: Methodology for Managing Performance in a Shared Storage A...
 
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYCScalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?
 
Clustering van IT-componenten
Clustering van IT-componentenClustering van IT-componenten
Clustering van IT-componenten
 
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...
 
Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7
 
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaA noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
 
Intuitions for scaling data centric architectures - Benjamin Stopford
Intuitions for scaling data centric architectures - Benjamin StopfordIntuitions for scaling data centric architectures - Benjamin Stopford
Intuitions for scaling data centric architectures - Benjamin Stopford
 
Bhupeshbansal bigdata
Bhupeshbansal bigdata Bhupeshbansal bigdata
Bhupeshbansal bigdata
 
Replicate from Oracle to data warehouses and analytics
Replicate from Oracle to data warehouses and analyticsReplicate from Oracle to data warehouses and analytics
Replicate from Oracle to data warehouses and analytics
 
Deep Dive on the Amazon Aurora MySQL-compatible Edition - DAT301 - re:Invent ...
Deep Dive on the Amazon Aurora MySQL-compatible Edition - DAT301 - re:Invent ...Deep Dive on the Amazon Aurora MySQL-compatible Edition - DAT301 - re:Invent ...
Deep Dive on the Amazon Aurora MySQL-compatible Edition - DAT301 - re:Invent ...
 
Amazon Aurora TechConnect
Amazon Aurora TechConnect Amazon Aurora TechConnect
Amazon Aurora TechConnect
 

Plus de Yahoo Developer Network

Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaDeveloping Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaYahoo Developer Network
 
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Yahoo Developer Network
 
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanAthenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanYahoo Developer Network
 
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Yahoo Developer Network
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathBig Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathYahoo Developer Network
 
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuHow @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuYahoo Developer Network
 
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolThe Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolYahoo Developer Network
 
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Yahoo Developer Network
 
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Yahoo Developer Network
 
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathHDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathYahoo Developer Network
 
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Yahoo Developer Network
 
Moving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, OathMoving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, OathYahoo Developer Network
 
Architecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsArchitecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsYahoo Developer Network
 
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Yahoo Developer Network
 
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondJun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondYahoo Developer Network
 
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Yahoo Developer Network
 
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...Yahoo Developer Network
 
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexFebruary 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexYahoo Developer Network
 
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsYahoo Developer Network
 

Plus de Yahoo Developer Network (20)

Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaDeveloping Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
 
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
 
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanAthenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
 
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
 
CICD at Oath using Screwdriver
CICD at Oath using ScrewdriverCICD at Oath using Screwdriver
CICD at Oath using Screwdriver
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathBig Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
 
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuHow @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
 
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolThe Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
 
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
 
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
 
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathHDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
 
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
 
Moving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, OathMoving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, Oath
 
Architecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsArchitecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI Applications
 
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
 
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondJun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step Beyond
 
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
 
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
 
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexFebruary 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
 
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
 

Dernier

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
The Evolution of Money: Digital Transformation and CBDCs in Central Banking
The Evolution of Money: Digital Transformation and CBDCs in Central BankingThe Evolution of Money: Digital Transformation and CBDCs in Central Banking
The Evolution of Money: Digital Transformation and CBDCs in Central BankingSelcen Ozturkcan
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 

Dernier (20)

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
The Evolution of Money: Digital Transformation and CBDCs in Central Banking
The Evolution of Money: Digital Transformation and CBDCs in Central BankingThe Evolution of Money: Digital Transformation and CBDCs in Central Banking
The Evolution of Money: Digital Transformation and CBDCs in Central Banking
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

Key Challenges in Cloud Computing and How Yahoo! is Approaching Them

  • 1. Vendor Speech by Raghu Ramakrishnan Cloud Computing Conference & Expo November 3, 2009   Key Challenges in Cloud Computing … and the Yahoo! Approach
  • 2.
  • 3. Inside Yahoo!’s Cloud KEY CHALLENGES
  • 4. DATA MANAGEMENT IN THE CLOUD
  • 5.
  • 6. What Are You Trying to Do? Data Workloads OLTP (Random access to a few records) OLAP (Scan access to a large number of records) Read-heavy Write-heavy By rows By columns Unstructured Combined (Some OLTP and OLAP tasks)
  • 7. Yahoo! Solution Space OLTP (Random access to a few records) OLAP (Scan access to a large number of records) Read-heavy Write-heavy By rows By columns Unstructured Combined (Some OLTP and OLAP tasks) UDS UDB ??? STCache Sherpa Read, read/write Write-heavy Main-memory SQL on Grid Zebra Pig HDFS MapReduce
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14. One Slide Hadoop Primer HDFS Data file Map tasks HDFS Reduce tasks Good for analyzing (scanning) huge files Not great for serving (reading or writing individual objects)
  • 15.
  • 16. Ways of Using Hadoop Data workloads OLAP (Scan access to a large number of records) By rows By columns Unstructured HadoopDB SQL on Grid Zebra
  • 17.
  • 18.
  • 19. SHERPA * Yahoo!’s Serving Store * The system also known as PNUTS
  • 20. What is Sherpa? CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … ) Parallel database Geographic replication Structured, flexible schemas Hashed and ordered tables Hosted, managed infrastructure E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E
  • 21.
  • 22. Architecture Storage units Routers Tablet Controller REST API Clients Local region Remote regions Tribble
  • 23. Updates Write key k Sequence # for key k Sequence # for key k Write key k SUCCESS Write key k Routers Message brokers 1 2 Write key k 7 8 SU SU SU 3 4 5 6
  • 24. Accessing Data SU SU SU Get key k 1 2 Get key k 3 Record for key k 4 Record for key k
  • 27.
  • 28. Example: Social Alice West East ___ Busy Free Free Record Timeline (Network fault, updt goes to East) (Alice logs on) User Status Alice Busy User Status Alice Free User Status Alice ??? User Status Alice ??? User Status Alice Busy User Status Alice ___
  • 29. Sherpa Consistency Model Time v. 1 v. 2 v. 3 v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Write Current version Stale version Stale version Achieved via per-record primary copy protocol (To maximize availability, record masterships automaticlly transferred if site fails) Can be selectively weakened to eventual consistency (local writes that are reconciled using version vectors)
  • 30. Sherpa Consistency Model Time v. 1 v. 2 v. 3 v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Current version Stale version Stale version Read In general, reads are served using a local copy
  • 31. Sherpa Consistency Model Time v. 1 v. 2 v. 3 v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Read up-to-date Current version Stale version Stale version But application can request and get current version
  • 33. Distribution Distribution for parallelism Data shuffling for load balancing Server 1 Server 2 Server 3 Server 4 Bike $86 6/2/07 636353 Chair $10 6/5/07 662113 Couch $570 6/1/07 424252 Car $1123 6/1/07 256623 Lamp $19 6/7/07 121113 Bike $56 6/9/07 887734 Scooter $18 6/11/07 252111 Hammer $8000 6/11/07 116458
  • 34. Tablet Splitting and Balancing Each storage unit has many tablets (horizontal partitions of the table) Tablets may grow over time Overfull tablets split Storage unit may become a hotspot Shed load by moving tablets to other servers Storage unit Tablet
  • 35. Mastering A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 E C 66354 W D 12352 E E 75656 C F 15677 E C 66354 W B 42521 E A 42342 E D 12352 E E 75656 C F 15677 E
  • 36. Record vs. Tablet Master A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E Record master serializes updates Tablet master serializes inserts A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 E C 66354 W D 12352 E E 75656 C F 15677 E C 66354 W B 42521 E A 42342 E D 12352 E E 75656 C F 15677 E
  • 37. Coping With Failures A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E X X OVERRIDE W -> E
  • 39.
  • 40.
  • 41.
  • 42.
  • 43.
  • 45.
  • 46.
  • 48. Cloud Serving: The Integrated Cloud Big idea: Declarative language for specifying the full, end-to-end structure of a service Key insight: this “full, end-to-end structure” includes multiple environments Central mechanism: the Integrated Cloud , which (re)deploys these specifications ( Surendra Reddy, 2:20 pm, Session 7) Development, multiple testing environments, alpha and bucket-testing environments, production
  • 49.
  • 50.
  • 51.
  • 52. YCS Scale YCS handles so much Yahoo traffic, this is noise!
  • 53.
  • 54.
  • 55.
  • 56. TUESDAY, 11/3 9:10 am – 9:55 am Yahoo! Query Language - YQL: Select * from Internet [Including Live Demo!] Dr. Jonathan Trevor Senior Architect TUESDAY, 11/3 2:20 pm – 3:05 pm Walking through Cloud Serving at Yahoo! Surendra Reddy VP, Integrated Cloud and Visualization TUESDAY, 11/3 4:50pm – 5:35 pm Hadoop @ Yahoo! – Internet Scale Data Processing Eric Baldeschwieler VP, Hadoop Software Development WEDNESDAY, 11/4 9:10 am - 9:55 am Yahoo! Scalable Storage and Delivery Services Chuck Neerdaels VP, Storage and Edge Services VISIT BOOTH #103 TO TALK WITH YAHOO! ENGINEERS AND LEARN MORE ABOUT YAHOO!’S VISION FOR CLOUD COMPUTING.

Notes de l'éditeur

  1. Will begin by referring to Shelton’s talk, and saying that this talk is about the technical challenges in building and using the clouds he described
  2. Point out that the key challenges cut across the different cloud components
  3. Say that we begin with a closer look at data management issues
  4. Note that this is the non-Mobstor case
  5. Talk about what we’ve done, and explain how design choices address the key challenges
  6. “ Free” update goes to east coast first because of network disruption, and as it happens, this happens before original busy notification is received. Now the west coast receives the corresponding free notification while the east coast receives the original busy notification. No priority between these two outcomes …
  7. Yahoo! thought leadership in this space
  8. Updates write the whole record
  9. Yahoo! thought leadership in this space
  10. Abstracts concerns of the underlying infrastructure and the network communication •Virtualized hardware •Declarative application structure •End-to-End Security •Standardized software stacks and packaging •Integrated service management •Continuous Integration and Deployment •Containers that service requests as opposed to Machines that run executables •Elastic serving of changing workloads •Controlled/Intelligent traffic direction •Controlled execution environment •Managed Communication •Service Associations, Bindings, and Access Controls
  11. Common mechanism to orchestrate resources in the Cloud including nodes storage devices, load balancers, security identities, and services themselves (IETF proposal) - Also, moving services within and across clouds Clouds may be partitioned into independently administered "regions" (environment in which services and resources can interact) to facilitate load balancing and ownership
  12. over 3,500 transactions/second per YTS box Ysquid – used in cases where the is a clear need based on features….does not scale per YTS in most cases (runs on single core, single threaded) YCPI, accelerates dynamic content internationally?
  13. US only Much or Y! runs through YCS in US YCPI
  14. Quick plug for Jonathan’s talk—and make the point that while Yahoo!’s cloud is currently private, it enables externally available apps, including some that selectively expose cloud capabilities to outside developers