How Microsoft built
and scaled Cosmos
Cosmos
— Cosmos is a large-scale data processing system
— In use by thousands of internal users at Microsoft
— Distributed filesystem contains exabytes of data
— High-level SQL-like language to run jobs
processing up to petabytes at a time
Outline
— What made Cosmos successful
— Language
— Data sharing
— Technical Challenges
— Scalability challenges and architecture
— Supporting lower latency workloads
— Conclusion
Language: Scope
— SQL-like language
— Supports structured and unstructured data
— Easy to use and learn
Q = SSTREAM "queries.ss";
U = SSTREAM "users.ss";
J = SELECT *, Math.Round(Q.latency) AS l
FROM Q, U WHERE Q.uid == U.uid;
OUTPUT J TO "output.txt";
“SCOPE: Parallel Databases Meet MapReduce” Jingren Zhou, Nicolas Bruno, Ming-chuan Wu, Paul Larson, Ronnie Chaiken, Darren Shakib,
The VLDB Journal, 2012
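For readers who do not know Scope, the query on this slide is an equi-join of two streams on uid with one extra column holding the rounded latency. The following is a minimal Python sketch of the same computation on in-memory rows. The sample rows and the `query` and `market` fields are invented for illustration, and the hash-join strategy is an assumption that says nothing about how Scope actually executes the join.

```python
def join_queries_with_users(queries, users):
    """Equi-join on uid, adding a rounded latency column named `l`."""
    # Build a hash table over the (assumed smaller) users input.
    users_by_uid = {}
    for user in users:
        users_by_uid.setdefault(user["uid"], []).append(user)

    joined = []
    for query in queries:
        for user in users_by_uid.get(query["uid"], []):
            row = {**user, **query}            # SELECT *
            row["l"] = round(query["latency"])  # Math.Round(Q.latency) AS l
            joined.append(row)
    return joined


# Hypothetical sample data standing in for queries.ss and users.ss.
queries = [
    {"uid": 1, "query": "weather", "latency": 12.4},
    {"uid": 2, "query": "news", "latency": 7.6},
    {"uid": 3, "query": "maps", "latency": 3.2},
]
users = [
    {"uid": 1, "market": "en-US"},
    {"uid": 2, "market": "fr-FR"},
]
result = join_queries_with_users(queries, users)
```

The row for uid 3 is dropped because it has no matching user, which is the inner-join behavior implied by the WHERE clause.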
Scope
— C# extensibility
— Supports user defined objects
input = EXTRACT user, session, blob
FROM "log_%n.txt?n=1...10"
USING DefaultTextExtractor;
SELECT user, session,
new RequestInfo(blob) AS request
FROM input
WHERE
request.Browser.IsChrome()
Scope Distributed Execution
— Queries are parsed into a logical operator tree
— The optimizer transforms the query into a physical
operator graph, which is then compiled into
binaries
— The physical operator graph and binaries are
handed to a scheduler for execution
“Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing” Eric Boutin, Jaliya Ekanayake, Wei Lin, Bing Shi, Jingren Zhou, Zhengping
Qian, Ming Wu, and Lidong Zhou, in Proc. of the 2014 OSDI Conference (OSDI'14)
Scope Optimizer
Data Sharing
— Users share data by reference
— Teams put their data in Cosmos because that is
where the data they want to join against is
— Skype, Windows, Xbox, Bing, Ads, Office, and
more
http://research.microsoft.com/en-us/events/fs2011/helland_cosmos_big_data_and_big_challenges.pdf
https://azure.microsoft.com/en-us/blog/behind-the-scenes-of-azure-data-lake-bringing-microsoft-s-big-data-experience-to-hadoop/
Network Effect
• Teams put their data in Cosmos because that is
where the data they want to join against is
http://research.microsoft.com/en-us/events/fs2011/helland_cosmos_big_data_and_big_challenges.pdf
JETS operates a high-scale, modern data pipeline for Office
Telemetry data from clients and services are combined into both
custom (app domain specific) and common System Health data
sets in Cosmos.
organization reports to surface [..] release risks and telemetry information
using map/reduce COSMOS
WSD organization is responsible for delivering security and non-
security fixes to Windows OSes to billions of customers, every
month on patch Tuesday
Are you interested in building the BI platform for Bing Ads?
Experience with working on C#, C++, or Java, Cosmos, is highly
desirable.
Are you excited about delivering the next generation personal
assistant, Cortana, to millions of people using Windows worldwide?
Experience with “Big Data” technologies like Cosmos
Data Sharing
— Users share data by reference
— Teams put their data in Cosmos because that is
where the data they want to join against is
— Skype, Windows, Xbox, Bing, Ads, Office, and
more
— This drives huge scalability requirements
— Cluster sizes exceed 50,000 servers
Plan Optimizations
— At large scale, query plan manipulations are
required to improve efficiency of sort, aggregation
and broadcast
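As one illustration of such a plan manipulation, aggregation can be split into a local step per partition followed by a global merge, so that only partial results cross the network. The sketch below shows that generic two-level technique on a count-by-key query. The figures from the slides did not survive in this transcript, so this is not necessarily the plan Cosmos produces, and the function names and sample partitions are hypothetical.

```python
from collections import Counter


def local_aggregate(partition):
    """Partial aggregation: count keys inside a single partition."""
    return Counter(partition)


def global_aggregate(partials):
    """Merge the per-partition partial counts into the final result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Three hypothetical partitions of a key column.
partitions = [
    ["a", "b", "a"],
    ["b", "c"],
    ["a", "c", "c"],
]
partials = [local_aggregate(p) for p in partitions]
totals = global_aggregate(partials)

# Records that would be shipped with and without the local step.
raw_records = sum(len(p) for p in partitions)
shipped_records = sum(len(p) for p in partials)
```

In this toy input the local step ships 6 partial records instead of 8 raw ones. The saving grows with the number of repeated keys per partition.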
Aggregation
Broadcast Joins
Parallel Sort
Scaling the Execution:
Apollo (OSDI’14)
— A large number of users share execution resources
for data locality
— How to minimize latency while maximizing cluster
utilization?
— Challenges:
— Scale
— Heterogeneous workload
— Maximizing utilization
“Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing” Eric Boutin, Jaliya Ekanayake, Wei Lin, Bing Shi, Jingren Zhou, Zhengping
Qian, Ming Wu, and Lidong Zhou, in Proc. of the 2014 OSDI Conference (OSDI'14)
Heterogeneous Workload
Dynamic Workload
How to effectively use resources while maintaining
performance guarantees with a dynamic workload?
Architecture
— For scalability, the architecture adopts a fully
decentralized control plane
— Each job has its own scheduler instance
— Each scheduler is making independent decisions
informed by global information
Architecture
Scheduler:
There is one scheduler per job, for
scalability
The scheduler makes local
decisions and dispatches tasks
directly to process nodes
Architecture
Process Nodes:
Execute tasks on behalf of job
managers
Provide local resource isolation
Send status updates, which are aggregated
by a resource monitor
Architecture
Resource Monitor:
Aggregates status information
from process nodes
Provides the cluster load
information to schedulers to
inform future scheduling
Architecture
The queue at each process node (PN) allows the scheduler
to reason about future resource availability
Representing Load
— How to concisely represent load?
— Load is represented as the expected wait time to
acquire resources
— This representation is integrated into the scheduler cost model
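To make "expected wait time to acquire resources" concrete, here is a small sketch that derives it from a process node's queue. It assumes a single resource dimension (slots), a FIFO queue, and known task runtimes. These are simplifications chosen for illustration, and the function and parameter names are hypothetical rather than taken from the slides.

```python
import heapq


def expected_wait(capacity, running, queued, needed):
    """Seconds until `needed` slots are free for a task appended to the queue.

    running: list of (slots, remaining_seconds) for tasks already executing
    queued:  list of (slots, runtime_seconds) waiting in FIFO order
    """
    if needed > capacity:
        raise ValueError("request can never fit on this node")
    free = capacity - sum(slots for slots, _ in running)
    # Min-heap of (finish_time, slots) for tasks currently holding slots.
    active = [(remaining, slots) for slots, remaining in running]
    heapq.heapify(active)
    now = 0.0
    # Walk the queue in order, with the new request placed last.
    for slots, runtime in list(queued) + [(needed, None)]:
        while free < slots:
            finish, released = heapq.heappop(active)
            now = max(now, finish)
            free += released
        if runtime is None:
            return now  # the new task could start at this time
        free -= slots
        heapq.heappush(active, (now + runtime, slots))


# A hypothetical node: 4 slots, two running tasks, one queued task.
running = [(2, 10.0), (2, 5.0)]
queued = [(2, 20.0)]
# One row of wait times, indexed by the number of slots requested.
wait_row = {n: expected_wait(4, running, queued, n) for n in range(1, 5)}
```

A node could publish a row like `wait_row` instead of its full queue, which keeps the load summary compact for schedulers to consume.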
Optimizing for various
factors
— To make optimal scheduling decisions, multiple factors have to be
considered at the same time
— Input location
— Network topology
— Wait time
— Initialization time
— Machine health, probability of failure
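One plausible way to weigh these factors together is to fold them into a single estimated completion time per candidate server and pick the minimum. The sketch below does that. The additive formula, the `retry_penalty` constant, and all field names and sample numbers are assumptions for illustration, not the cost model from the paper.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    server: str
    init_seconds: float         # initialization time
    wait_seconds: float         # expected wait in the server's queue
    read_seconds: float         # input read time, given location and topology
    run_seconds: float          # estimated compute time
    failure_probability: float  # chance the attempt fails on this server


def estimated_completion(c, retry_penalty=2.0):
    """Expected completion time; failed attempts cost `retry_penalty` times more."""
    success_time = c.init_seconds + c.wait_seconds + c.read_seconds + c.run_seconds
    p_ok = 1.0 - c.failure_probability
    return p_ok * success_time + (1.0 - p_ok) * retry_penalty * success_time


def pick_server(candidates):
    """Choose the candidate with the lowest estimated completion time."""
    return min(candidates, key=estimated_completion)


# Hypothetical candidates for one task.
candidates = [
    Candidate("local_busy", 1.0, 30.0, 2.0, 10.0, 0.01),   # data is local, long queue
    Candidate("remote_idle", 5.0, 0.0, 6.0, 10.0, 0.01),   # no queue, remote read
    Candidate("flaky_idle", 1.0, 0.0, 2.0, 10.0, 0.8),     # fast but unhealthy
]
best = pick_server(candidates)
```

With these numbers the idle remote server wins: the local server loses on wait time and the unhealthy one loses on expected retries, which is the kind of trade-off a single combined estimate is meant to capture.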
Scheduler Performance
[Figure: scheduler performance for Ideal Scheduler (Capacity Constraint), Ideal Scheduler (Infinite Capacity), Baseline, and Apollo]
The Cosmos scheduler performs within 5% of the
ideal trace-driven scheduler
Utilization
Cosmos maintains a median utilization above 80%
on weekdays while supporting latency-sensitive
workloads
More in the paper
— Scheduler cost model
— Opportunistic scheduling
— Stable matching
“Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing”
Eric Boutin, Jaliya Ekanayake, Wei Lin, Bing Shi, Jingren Zhou, Zhengping Qian,
Ming Wu, and Lidong Zhou, in Proc. of the 2014 OSDI Conference (OSDI'14)
Supporting lower latency
workloads
— As the customer base increased, the workload
diversified
— Users request the ability to get interactive
latencies, on the same data
— While Apollo can scale to jobs processing
petabytes of data, it has undesirable overhead for
smaller jobs
Supporting lower latency
workloads
— How to provide interactive latencies at cloud scale?
— How to provide fault tolerance in an interactive
context?
JetScope (VLDB ’15)
— Provide interactive capabilities on Cosmos &
Scope
— Paradigm shift in the execution model:
— Stream intermediate results
— Gang scheduling
Intermediate Results
Streaming
— JetScope avoids materializing intermediates to disk
— Tasks write to a local service, StreamNet, which
manages communications
— Challenges:
— Deadlock on ordered merge when using finite
communication buffers
— Too many connections
Gang Scheduling
— To achieve minimal latency, JetScope starts all
tasks at the same time (gang scheduling)
— Execution overlap in tasks allows an increase in
parallelism
— Challenge: Scheduler deadlock
— Two schedulers incrementally acquire resources
— Resources run out, and neither job can execute
— Solution: Admission control
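The deadlock described above comes from jobs holding part of a gang while waiting for the rest. A minimal sketch of all-or-nothing admission control follows. The slide only names admission control as the solution, so the class, its methods, and the single slot counter are assumptions for illustration and not JetScope's actual mechanism.

```python
import threading


class AdmissionController:
    """All-or-nothing admission: a job starts only if its whole gang fits."""

    def __init__(self, total_slots):
        self._free = total_slots
        self._lock = threading.Lock()

    def try_admit(self, gang_size):
        """Reserve every slot the job needs atomically, or reserve nothing."""
        with self._lock:
            if gang_size <= self._free:
                self._free -= gang_size
                return True
            return False  # the job holds no slots while it waits

    def release(self, gang_size):
        """Return a finished job's slots to the pool."""
        with self._lock:
            self._free += gang_size


# Two hypothetical jobs that each need 6 of 10 slots.
controller = AdmissionController(total_slots=10)
first_admitted = controller.try_admit(6)
second_admitted = controller.try_admit(6)   # rejected, holds nothing
controller.release(6)                       # first job finishes
second_retry = controller.try_admit(6)
```

Because a rejected job never keeps a partial reservation, two jobs cannot end up each holding half of what the other needs.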
Fault Tolerance
— Chance of failure increases with number of
servers touched
— A job could fail repeatedly and never complete
— We need a fault tolerance mechanism that
doesn’t impact performance
— Details are in the paper
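The link between the number of servers touched and the chance of failure can be illustrated with a short calculation. The sketch assumes servers fail independently with the same probability during a job. Both that assumption and the 0.1% per-server figure are hypothetical and do not come from the slides.

```python
def job_failure_probability(per_server_failure, servers):
    """Probability that at least one of `servers` independent servers fails."""
    return 1.0 - (1.0 - per_server_failure) ** servers


# Hypothetical per-server failure chance of 0.1% over a job's lifetime.
p_20_servers = job_failure_probability(0.001, 20)
p_200_servers = job_failure_probability(0.001, 200)
```

Under these assumptions, growing a job from 20 to 200 servers raises its failure probability from about 2% to about 18%, which is why restarting from scratch on every failure stops being viable at scale.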
How does JetScope scale?
[Figure: latency in seconds (axis from 0 to 50) for queries Q1, Q4, Q6, Q12, and Q15, comparing 10TB with 200 servers against 1TB with 20 servers]
Similar latency after 10x scale increase
Conclusion
— Cosmos is a large-scale distributed data processing
system
— Stores exabytes of data across many clusters, each of which can
contain over 50,000 servers
— Provides both batch processing and interactive
processing
— Has a fully decentralized control plane for scalability
— Operates at high utilization to keep query
cost low
41

Contenu connexe

Tendances

Achieving scale and performance using cloud native environment
Achieving scale and performance using cloud native environmentAchieving scale and performance using cloud native environment
Achieving scale and performance using cloud native environmentRakuten Group, Inc.
 
Introducing MemSQL 4
Introducing MemSQL 4Introducing MemSQL 4
Introducing MemSQL 4SingleStore
 
Modeling the Smart and Connected City of the Future with Kafka and Spark
Modeling the Smart and Connected City of the Future with Kafka and SparkModeling the Smart and Connected City of the Future with Kafka and Spark
Modeling the Smart and Connected City of the Future with Kafka and SparkSingleStore
 
Data integration with Apache Kafka
Data integration with Apache KafkaData integration with Apache Kafka
Data integration with Apache Kafkaconfluent
 
Journey to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme GrowthJourney to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme GrowthSingleStore
 
Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf
 
Real-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLReal-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLSingleStore
 
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams APIuser Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams APIconfluent
 
The Future of ETL - Strata Data New York 2018
The Future of ETL - Strata Data New York 2018The Future of ETL - Strata Data New York 2018
The Future of ETL - Strata Data New York 2018confluent
 
How to use Standard SQL over Kafka: From the basics to advanced use cases | F...
How to use Standard SQL over Kafka: From the basics to advanced use cases | F...How to use Standard SQL over Kafka: From the basics to advanced use cases | F...
How to use Standard SQL over Kafka: From the basics to advanced use cases | F...HostedbyConfluent
 
The Future of ETL Isn't What It Used to Be
The Future of ETL Isn't What It Used to BeThe Future of ETL Isn't What It Used to Be
The Future of ETL Isn't What It Used to Beconfluent
 
Leveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern AnalyticsLeveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern Analyticsconfluent
 
Introduction to MemSQL
Introduction to MemSQLIntroduction to MemSQL
Introduction to MemSQLSingleStore
 
What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...
What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...
What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...confluent
 
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...HostedbyConfluent
 
Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...
Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...
Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...HostedbyConfluent
 
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...HostedbyConfluent
 
Stateful, Stateless and Serverless - Running Apache Kafka® on Kubernetes
Stateful, Stateless and Serverless - Running Apache Kafka® on KubernetesStateful, Stateless and Serverless - Running Apache Kafka® on Kubernetes
Stateful, Stateless and Serverless - Running Apache Kafka® on Kubernetesconfluent
 
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...confluent
 
CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®confluent
 

Tendances (20)

Achieving scale and performance using cloud native environment
Achieving scale and performance using cloud native environmentAchieving scale and performance using cloud native environment
Achieving scale and performance using cloud native environment
 
Introducing MemSQL 4
Introducing MemSQL 4Introducing MemSQL 4
Introducing MemSQL 4
 
Modeling the Smart and Connected City of the Future with Kafka and Spark
Modeling the Smart and Connected City of the Future with Kafka and SparkModeling the Smart and Connected City of the Future with Kafka and Spark
Modeling the Smart and Connected City of the Future with Kafka and Spark
 
Data integration with Apache Kafka
Data integration with Apache KafkaData integration with Apache Kafka
Data integration with Apache Kafka
 
Journey to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme GrowthJourney to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme Growth
 
Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7
 
Real-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLReal-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQL
 
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams APIuser Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
 
The Future of ETL - Strata Data New York 2018
The Future of ETL - Strata Data New York 2018The Future of ETL - Strata Data New York 2018
The Future of ETL - Strata Data New York 2018
 
How to use Standard SQL over Kafka: From the basics to advanced use cases | F...
How to use Standard SQL over Kafka: From the basics to advanced use cases | F...How to use Standard SQL over Kafka: From the basics to advanced use cases | F...
How to use Standard SQL over Kafka: From the basics to advanced use cases | F...
 
The Future of ETL Isn't What It Used to Be
The Future of ETL Isn't What It Used to BeThe Future of ETL Isn't What It Used to Be
The Future of ETL Isn't What It Used to Be
 
Leveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern AnalyticsLeveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern Analytics
 
Introduction to MemSQL
Introduction to MemSQLIntroduction to MemSQL
Introduction to MemSQL
 
What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...
What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...
What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...
 
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
 
Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...
Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...
Safer Commutes & Streaming Data | George Padavick, Ohio Department of Transpo...
 
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
Low-latency data applications with Kafka and Agg indexes | Tino Tereshko, Fir...
 
Stateful, Stateless and Serverless - Running Apache Kafka® on Kubernetes
Stateful, Stateless and Serverless - Running Apache Kafka® on KubernetesStateful, Stateless and Serverless - Running Apache Kafka® on Kubernetes
Stateful, Stateless and Serverless - Running Apache Kafka® on Kubernetes
 
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
 
CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®CDC patterns in Apache Kafka®
CDC patterns in Apache Kafka®
 

Similaire à How Microsoft built and scaled its Cosmos distributed data system

IT TRENDS AND PERSPECTIVES 2016
IT TRENDS AND PERSPECTIVES 2016IT TRENDS AND PERSPECTIVES 2016
IT TRENDS AND PERSPECTIVES 2016Vaidheswaran CS
 
Manage Microservices & Fast Data Systems on One Platform w/ DC/OS
Manage Microservices & Fast Data Systems on One Platform w/ DC/OSManage Microservices & Fast Data Systems on One Platform w/ DC/OS
Manage Microservices & Fast Data Systems on One Platform w/ DC/OSMesosphere Inc.
 
云计算及其应用
云计算及其应用云计算及其应用
云计算及其应用lantianlcdx
 
FredMcLainResumeB
FredMcLainResumeBFredMcLainResumeB
FredMcLainResumeBFred McLain
 
Above the cloud joarder kamal
Above the cloud   joarder kamalAbove the cloud   joarder kamal
Above the cloud joarder kamalJoarder Kamal
 
Containerizing couchbase with microservice architecture on mesosphere.pptx
Containerizing couchbase with microservice architecture on mesosphere.pptxContainerizing couchbase with microservice architecture on mesosphere.pptx
Containerizing couchbase with microservice architecture on mesosphere.pptxRavi Yadav
 
ZCloud Consensus on Hardware for Distributed Systems
ZCloud Consensus on Hardware for Distributed SystemsZCloud Consensus on Hardware for Distributed Systems
ZCloud Consensus on Hardware for Distributed SystemsGokhan Boranalp
 
Scientific Cloud Computing: Present & Future
Scientific Cloud Computing: Present & FutureScientific Cloud Computing: Present & Future
Scientific Cloud Computing: Present & Futurestratuslab
 
QoE-Aware Traffic Steering using OpenFlow
QoE-Aware Traffic Steering using OpenFlowQoE-Aware Traffic Steering using OpenFlow
QoE-Aware Traffic Steering using OpenFlowUS-Ignite
 
Karthik Balasubramanian (Resume)
Karthik Balasubramanian (Resume)Karthik Balasubramanian (Resume)
Karthik Balasubramanian (Resume)karthik_bala
 
Designing High performance & Scalable Middleware for HPC
Designing High performance & Scalable Middleware for HPCDesigning High performance & Scalable Middleware for HPC
Designing High performance & Scalable Middleware for HPCObject Automation
 
CloudComputingJun28.ppt
CloudComputingJun28.pptCloudComputingJun28.ppt
CloudComputingJun28.pptVipin Singhal
 
CloudComputingJun28.ppt
CloudComputingJun28.pptCloudComputingJun28.ppt
CloudComputingJun28.pptgeminass1
 
Introduction to Cloud Computing
Introduction to Cloud ComputingIntroduction to Cloud Computing
Introduction to Cloud ComputingAnimesh Chaturvedi
 

Similaire à How Microsoft built and scaled its Cosmos distributed data system (20)

IT TRENDS AND PERSPECTIVES 2016
IT TRENDS AND PERSPECTIVES 2016IT TRENDS AND PERSPECTIVES 2016
IT TRENDS AND PERSPECTIVES 2016
 
Manage Microservices & Fast Data Systems on One Platform w/ DC/OS
Manage Microservices & Fast Data Systems on One Platform w/ DC/OSManage Microservices & Fast Data Systems on One Platform w/ DC/OS
Manage Microservices & Fast Data Systems on One Platform w/ DC/OS
 
云计算及其应用
云计算及其应用云计算及其应用
云计算及其应用
 
Mihai_Nuta
Mihai_NutaMihai_Nuta
Mihai_Nuta
 
FredMcLainResumeB
FredMcLainResumeBFredMcLainResumeB
FredMcLainResumeB
 
Above the cloud joarder kamal
Above the cloud   joarder kamalAbove the cloud   joarder kamal
Above the cloud joarder kamal
 
Containerizing couchbase with microservice architecture on mesosphere.pptx
Containerizing couchbase with microservice architecture on mesosphere.pptxContainerizing couchbase with microservice architecture on mesosphere.pptx
Containerizing couchbase with microservice architecture on mesosphere.pptx
 
ZCloud Consensus on Hardware for Distributed Systems
ZCloud Consensus on Hardware for Distributed SystemsZCloud Consensus on Hardware for Distributed Systems
ZCloud Consensus on Hardware for Distributed Systems
 
newSkills_09
newSkills_09newSkills_09
newSkills_09
 
Scientific Cloud Computing: Present & Future
Scientific Cloud Computing: Present & FutureScientific Cloud Computing: Present & Future
Scientific Cloud Computing: Present & Future
 
Santosh kumarpandi
Santosh kumarpandiSantosh kumarpandi
Santosh kumarpandi
 
QoE-Aware Traffic Steering using OpenFlow
QoE-Aware Traffic Steering using OpenFlowQoE-Aware Traffic Steering using OpenFlow
QoE-Aware Traffic Steering using OpenFlow
 
Karthik Balasubramanian (Resume)
Karthik Balasubramanian (Resume)Karthik Balasubramanian (Resume)
Karthik Balasubramanian (Resume)
 
Designing High performance & Scalable Middleware for HPC
Designing High performance & Scalable Middleware for HPCDesigning High performance & Scalable Middleware for HPC
Designing High performance & Scalable Middleware for HPC
 
CloudComputingJun28.ppt
CloudComputingJun28.pptCloudComputingJun28.ppt
CloudComputingJun28.ppt
 
CloudComputingJun28.ppt
CloudComputingJun28.pptCloudComputingJun28.ppt
CloudComputingJun28.ppt
 
CloudComputingJun28.ppt
CloudComputingJun28.pptCloudComputingJun28.ppt
CloudComputingJun28.ppt
 
Naresh Babu
Naresh BabuNaresh Babu
Naresh Babu
 
Introduction to Cloud Computing
Introduction to Cloud ComputingIntroduction to Cloud Computing
Introduction to Cloud Computing
 
Ibm innovate ci for system z
Ibm innovate ci for system zIbm innovate ci for system z
Ibm innovate ci for system z
 

Plus de SingleStore

Five ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeFive ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeSingleStore
 
How Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and AnalyticsHow Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and AnalyticsSingleStore
 
Architecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemArchitecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemSingleStore
 
Building the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeBuilding the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeSingleStore
 
Converging Database Transactions and Analytics
Converging Database Transactions and Analytics Converging Database Transactions and Analytics
Converging Database Transactions and Analytics SingleStore
 
Building a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQLBuilding a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQLSingleStore
 
MemSQL 201: Advanced Tips and Tricks Webcast
MemSQL 201: Advanced Tips and Tricks WebcastMemSQL 201: Advanced Tips and Tricks Webcast
MemSQL 201: Advanced Tips and Tricks WebcastSingleStore
 
An Engineering Approach to Database Evaluations
An Engineering Approach to Database EvaluationsAn Engineering Approach to Database Evaluations
An Engineering Approach to Database EvaluationsSingleStore
 
Building a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed ArchitectureBuilding a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed ArchitectureSingleStore
 
Stream Processing with Pipelines and Stored Procedures
Stream Processing with Pipelines  and Stored ProceduresStream Processing with Pipelines  and Stored Procedures
Stream Processing with Pipelines and Stored ProceduresSingleStore
 
Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017SingleStore
 
Image Recognition on Streaming Data
Image Recognition  on Streaming DataImage Recognition  on Streaming Data
Image Recognition on Streaming DataSingleStore
 
Spark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
Spark Summit Dublin 2017 - MemSQL - Real-Time Image RecognitionSpark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
Spark Summit Dublin 2017 - MemSQL - Real-Time Image RecognitionSingleStore
 
The State of the Data Warehouse in 2017 and Beyond
The State of the Data Warehouse in 2017 and BeyondThe State of the Data Warehouse in 2017 and Beyond
The State of the Data Warehouse in 2017 and BeyondSingleStore
 
How Database Convergence Impacts the Coming Decades of Data Management
How Database Convergence Impacts the Coming Decades of Data ManagementHow Database Convergence Impacts the Coming Decades of Data Management
How Database Convergence Impacts the Coming Decades of Data ManagementSingleStore
 
Teaching Databases to Learn in the World of AI
Teaching Databases to Learn in the World of AITeaching Databases to Learn in the World of AI
Teaching Databases to Learn in the World of AISingleStore
 
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid CloudGartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid Cloud
Gartner Catalyst 2017: The Data Warehouse Blueprint for ML, AI, and Hybrid CloudSingleStore
 
Gartner Catalyst 2017: Image Recognition on Streaming Data
Gartner Catalyst 2017: Image Recognition on Streaming DataGartner Catalyst 2017: Image Recognition on Streaming Data
Gartner Catalyst 2017: Image Recognition on Streaming DataSingleStore
 
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and SparkSpark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and SparkSingleStore
 
Real-Time Analytics at Uber Scale
Real-Time Analytics at Uber ScaleReal-Time Analytics at Uber Scale
Real-Time Analytics at Uber ScaleSingleStore
 

Plus de SingleStore (20)

Five ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeFive ways database modernization simplifies your data life
Five ways database modernization simplifies your data life
 
How Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and AnalyticsHow Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and Analytics
 
Architecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemArchitecting Data in the AWS Ecosystem
Architecting Data in the AWS Ecosystem
 
Building the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeBuilding the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free Life
 
Converging Database Transactions and Analytics
Converging Database Transactions and Analytics Converging Database Transactions and Analytics
Converging Database Transactions and Analytics
 
Building a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQLBuilding a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQL
 
MemSQL 201: Advanced Tips and Tricks Webcast
MemSQL 201: Advanced Tips and Tricks WebcastMemSQL 201: Advanced Tips and Tricks Webcast
MemSQL 201: Advanced Tips and Tricks Webcast
 
An Engineering Approach to Database Evaluations
An Engineering Approach to Database EvaluationsAn Engineering Approach to Database Evaluations
An Engineering Approach to Database Evaluations
 
Building a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed ArchitectureBuilding a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed Architecture
 
How Microsoft built and scaled its Cosmos distributed data system

  • 1. How Microsoft built and scaled Cosmos
  • 2. Cosmos
      - Cosmos is a large-scale data processing system
      - In use by thousands of internal users at Microsoft
      - Distributed filesystem contains exabytes of data
      - High-level SQL-like language to run jobs processing up to petabytes at a time
  • 3. Outline
      - What made Cosmos successful
      - Language
      - Data sharing
      - Technical challenges
      - Scalability challenges and architecture
      - Supporting lower latency workloads
      - Conclusion
  • 4. Language: Scope
      - SQL-like language
      - Supports structured and unstructured data
      - Easy to use and learn

      Q = SSTREAM "queries.ss";
      U = SSTREAM "users.ss";
      J = SELECT *, Math.Round(Q.latency) AS l
          FROM Q, U WHERE Q.uid == U.uid;
      OUTPUT J TO "output.txt"

      “SCOPE: Parallel Databases Meet MapReduce”, Jingren Zhou, Nicolas Bruno, Ming-chuan Wu, Paul Larson, Ronnie Chaiken, Darren Shakib, The VLDB Journal, 2012
  • 5. Scope
      - C# extensibility
      - Supports user-defined objects

      input = EXTRACT user, session, blob
              FROM "log_%n.txt?n=1...10"
              USING DefaultTextExtractor;
      SELECT user, session,
             new RequestInfo(blob) AS request
      FROM input
      WHERE request.Browser.IsChrome()

      “SCOPE: Parallel Databases Meet MapReduce”, Jingren Zhou, Nicolas Bruno, Ming-chuan Wu, Paul Larson, Ronnie Chaiken, Darren Shakib, The VLDB Journal, 2012
  • 6. Scope Distributed Execution
      - Queries are parsed into a logical operator tree
      - The optimizer transforms the query into a physical operator graph, which is then compiled into binaries
      - The physical operator graph and binaries are handed to a scheduler for execution

      “Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing”, Eric Boutin, Jaliya Ekanayake, Wei Lin, Bing Shi, Jingren Zhou, Zhengping Qian, Ming Wu, and Lidong Zhou, in Proc. of the 2014 OSDI Conference (OSDI'14)
  • 8. Data Sharing
      - Users share data by reference
      - Teams put their data in Cosmos because that is where the data they want to join against is
      - Skype, Windows, Xbox, Bing, Ads, Office, and more

      http://research.microsoft.com/en-us/events/fs2011/helland_cosmos_big_data_and_big_challenges.pdf
      https://azure.microsoft.com/en-us/blog/behind-the-scenes-of-azure-data-lake-bringing-microsoft-s-big-data-experience-to-hadoop/
      “Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing”, Eric Boutin, Jaliya Ekanayake, Wei Lin, Bing Shi, Jingren Zhou, Zhengping Qian, Ming Wu, and Lidong Zhou, in Proc. of the 2014 OSDI Conference (OSDI'14)
  • 9. Network Effect
      - Teams put their data in Cosmos because that is where the data they want to join against is
      - Example: "JETS operates a high-scale, modern data pipeline for Office. Telemetry data from clients and services are combined into both custom (app domain specific) and common System Health data sets in Cosmos."

      http://research.microsoft.com/en-us/events/fs2011/helland_cosmos_big_data_and_big_challenges.pdf
  • 10. Excerpts mentioning Cosmos
      - "[..] organization reports to surface [..] release risks and telemetry information using map/reduce COSMOS"
      - "WSD organization is responsible for delivering security and non-security fixes to Windows OSes to billions of customers, every month on Patch Tuesday"
  • 11. Are you interested in building the BI platform for Bing Ads? Experience with working on C#, C++, or Java, Cosmos, is highly desirable.
  • 12. Are you excited about delivering the next generation personal assistant, Cortana, to millions of people using Windows worldwide? Experience with “Big Data” technologies like Cosmos
  • 13. Data Sharing
      - Users share data by reference
      - Teams put their data in Cosmos because that is where the data they want to join against is
      - Skype, Windows, Xbox, Bing, Ads, Office, and more
      - This drives huge scalability requirements
      - Cluster sizes exceed 50,000 servers

      http://research.microsoft.com/en-us/events/fs2011/helland_cosmos_big_data_and_big_challenges.pdf
      https://azure.microsoft.com/en-us/blog/behind-the-scenes-of-azure-data-lake-bringing-microsoft-s-big-data-experience-to-hadoop/
      “Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing”, Eric Boutin, Jaliya Ekanayake, Wei Lin, Bing Shi, Jingren Zhou, Zhengping Qian, Ming Wu, and Lidong Zhou, in Proc. of the 2014 OSDI Conference (OSDI'14)
  • 14. Outline
      - What made Cosmos successful
      - Language
      - Data sharing
      - Technical challenges
      - Scalability challenges and architecture
      - Supporting lower latency workloads
      - Conclusion
  • 15. Plan Optimizations
      - At large scale, query plan manipulations are required to improve the efficiency of sort, aggregation, and broadcast
  • 19. Scaling the Execution: Apollo (OSDI'14)
      - A large number of users share execution resources for data locality
      - How to minimize latency while maximizing cluster utilization?
      - Challenges:
          - Scale
          - Heterogeneous workload
          - Maximizing utilization

      “Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing”, Eric Boutin, Jaliya Ekanayake, Wei Lin, Bing Shi, Jingren Zhou, Zhengping Qian, Ming Wu, and Lidong Zhou, in Proc. of the 2014 OSDI Conference (OSDI'14)
  • 21. Dynamic Workload
      - How to effectively use resources while maintaining performance guarantees with a dynamic workload?
  • 22. Architecture
      - For scalability, the architecture adopts a fully decentralized control plane
      - Each job has its own scheduler instance
      - Each scheduler makes independent decisions informed by global information
  • 23. Architecture
      - Scheduler:
          - There is one scheduler per job, for scalability
          - The scheduler makes local decisions and directly dispatches tasks to process nodes
  • 24. Architecture
      - Process nodes:
          - Execute tasks on behalf of job managers
          - Provide local resource isolation
          - Send status updates, which are aggregated by a resource monitor
  • 25. Architecture
      - Resource monitor:
          - Aggregates status information from process nodes
          - Provides cluster load information to schedulers to inform future scheduling decisions
  • 26. Architecture
      - The queue at each process node (PN) allows the scheduler to reason about future resource availability
  • 27. Representing Load
      - How to concisely represent load?
      - Load is represented as the expected wait time to acquire resources
      - This wait time is integrated into a scheduler cost model
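To make the "expected wait time" idea concrete, here is a minimal sketch, not Apollo's actual implementation. It assumes a process node with a fixed number of identical execution slots and a FIFO queue of tasks whose runtimes are known estimates; the function name `expected_wait_s` and both parameters are hypothetical. The wait for a newly arriving task is the time at which the first slot frees up once every queued task has been placed.

```python
import heapq


def expected_wait_s(slot_free_times, queued_runtimes):
    """Estimate how long a new task would wait on one process node.

    slot_free_times: seconds until each execution slot finishes its current task
                     (0.0 for an idle slot). Hypothetical input for this sketch.
    queued_runtimes: estimated runtimes of tasks already queued, in FIFO order.
    """
    slots = list(slot_free_times)
    heapq.heapify(slots)
    for runtime in queued_runtimes:
        # Each queued task starts on whichever slot becomes free first.
        start = heapq.heappop(slots)
        heapq.heappush(slots, start + runtime)
    # The new task starts when the earliest slot is free again.
    return slots[0]


# Two slots: one busy for 5 more seconds, one idle. Two tasks are already queued.
wait = expected_wait_s([5.0, 0.0], [4.0, 3.0])
print(wait)
```

A single number per node keeps the load summary small enough to share with every scheduler, which is the property the slide asks for when it says "concisely".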
  • 29. Optimizing for various factors
      - To make optimal scheduling decisions, multiple factors have to be considered at the same time:
          - Input location
          - Network topology
          - Wait time
          - Initialization time
          - Machine health, probability of failure
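One way to consider these factors together is to fold them into a single estimated completion time per candidate server and pick the minimum. The sketch below is an illustration under stated assumptions, not the cost model from the Apollo paper: the `Server` fields, the flat `remote_read_s` penalty (standing in for input location and network topology), and the `retry_penalty_s` term are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    expected_wait_s: float  # expected wait time to acquire resources
    has_local_input: bool   # input location
    init_s: float           # initialization time
    failure_prob: float     # machine health, as an estimated failure probability


def estimated_completion_s(server, run_s, remote_read_s, retry_penalty_s):
    """Combine all factors into one comparable number (assumed additive model)."""
    io_s = 0.0 if server.has_local_input else remote_read_s
    base = server.expected_wait_s + server.init_s + io_s + run_s
    # Expected cost of a failure: probability times the assumed cost of retrying.
    return base + server.failure_prob * retry_penalty_s


def pick_server(servers, run_s, remote_read_s, retry_penalty_s):
    """Choose the server with the lowest estimated completion time."""
    return min(
        servers,
        key=lambda s: estimated_completion_s(s, run_s, remote_read_s, retry_penalty_s),
    )


servers = [
    Server("local-busy", expected_wait_s=30.0, has_local_input=True, init_s=2.0, failure_prob=0.01),
    Server("remote-idle", expected_wait_s=0.0, has_local_input=False, init_s=2.0, failure_prob=0.01),
]
best = pick_server(servers, run_s=10.0, remote_read_s=8.0, retry_penalty_s=20.0)
print(best.name)
```

In this example the idle server wins even though it must read its input remotely, because the 30-second wait on the data-local server outweighs the 8-second remote read. A locality-only policy would have picked the busy server.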
  • 30. Scheduler Performance
      - [Chart comparing: Ideal scheduler (capacity constraint), Ideal scheduler (infinite capacity), Baseline, Apollo]
      - The Cosmos scheduler performs within 5% of the ideal trace-driven scheduler
  • 31. Utilization
      - Cosmos maintains a median utilization above 80% on weekdays while supporting latency-sensitive workloads
  • 32. More in the paper
      - Scheduler cost model
      - Opportunistic scheduling
      - Stable matching

      “Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing”, Eric Boutin, Jaliya Ekanayake, Wei Lin, Bing Shi, Jingren Zhou, Zhengping Qian, Ming Wu, and Lidong Zhou, in Proc. of the 2014 OSDI Conference (OSDI'14)
  • 33. Outline
      - What made Cosmos successful
      - Language
      - Data sharing
      - Technical challenges
      - Scalability challenges and architecture
      - Supporting lower latency workloads
      - Conclusion
  • 34. Supporting lower latency workloads
      - As the customer base increased, the workload diversified
      - Users requested interactive latencies on the same data
      - While Apollo can scale to jobs processing petabytes of data, it has undesirable overhead for smaller jobs
  • 35. Supporting lower latency workloads
      - How to provide interactive latencies at cloud scale?
      - How to provide fault tolerance in an interactive context?
  • 36. JetScope (VLDB '15)
      - Provides interactive capabilities on Cosmos and Scope
      - Paradigm shift in the execution model:
          - Stream intermediate results
          - Gang scheduling
  • 37. Intermediate Results Streaming
      - JetScope avoids materializing intermediate results to disk
      - Tasks write to a local service, StreamNet, which manages communications
      - Challenges:
          - Deadlock on ordered merge when using finite communication buffers
          - Too many connections
  • 38. Gang Scheduling
      - To achieve minimal latency, JetScope starts all tasks at the same time (gang scheduling)
      - Execution overlap in tasks allows an increase in parallelism
      - Challenge: scheduler deadlock
          - Two schedulers incrementally acquire resources
          - Resources run out, and neither job can execute
      - Solution: admission control
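The deadlock arises because each scheduler holds a partial allocation while waiting for the rest. A minimal sketch of admission control, assuming a single shared pool of interchangeable slots, is to grant a job either all the slots it needs or none. The class and method names below are hypothetical and are not taken from the JetScope paper.

```python
class AdmissionController:
    """All-or-nothing slot allocation for gang-scheduled jobs (illustrative only)."""

    def __init__(self, total_slots):
        self.free_slots = total_slots
        self.running = {}  # job_id -> number of slots held

    def try_admit(self, job_id, slots_needed):
        """Admit the job only if every task can start at once."""
        if job_id in self.running:
            raise ValueError(f"{job_id} is already running")
        if slots_needed > self.free_slots:
            # Reject without reserving anything, so no partial allocation is held.
            return False
        self.free_slots -= slots_needed
        self.running[job_id] = slots_needed
        return True

    def release(self, job_id):
        """Return all of a finished job's slots to the pool."""
        self.free_slots += self.running.pop(job_id)


# Two jobs each need 6 of the 10 slots. With incremental acquisition they could
# end up holding 5 slots each and wait forever; here the second job is simply
# held back until the first one finishes.
controller = AdmissionController(total_slots=10)
admitted_a = controller.try_admit("job-a", 6)
admitted_b = controller.try_admit("job-b", 6)
controller.release("job-a")
admitted_b_retry = controller.try_admit("job-b", 6)
print(admitted_a, admitted_b, admitted_b_retry)
```

The trade-off is that a rejected job waits without using the free slots that remain, which is the price of guaranteeing that every admitted job can make progress.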
  • 39. Fault Tolerance
      - The chance of failure increases with the number of servers touched
      - A job could fail repeatedly and never complete
      - We need a fault tolerance mechanism that doesn't impact performance
      - Details are in the paper
  • 40. How does JetScope scale?
      - [Chart: latency in seconds (axis 0 to 50) for queries Q1, Q4, Q6, Q12, Q15; series: 10 TB with 200 servers, 1 TB with 20 servers]
      - Similar latency after 10x scale increase
  • 41. Conclusion
      - Cosmos is a large-scale distributed data processing system
      - Stores exabytes of data on many clusters, each of which can contain over 50,000 servers
      - Provides both batch processing and interactive processing
      - Has a fully decentralized control plane for scalability
      - Operates at high utilization to maintain low query cost