SlideShare une entreprise Scribd logo
1  sur  31
Télécharger pour lire hors ligne
Datacenter Computing
with Apache Mesos
CHUG @ Sears Holdings Corp,
2013-11-06
Paco Nathan, chief scientist
http://Mesosphere.io
@pacoid
The Modern Kernel: Top Linux Contributors…
Beyond Hadoop
Google has been doing datacenter computing for years,
to address the complexities of large-scale data workflows:

•

leveraging the modern kernel: isolation in lieu of VMs

•

“most (>80%) jobs are batch jobs, but the majority
of resources (55–80%) are allocated to service jobs”

•

mixed workloads, multi-tenancy

•

relatively high utilization rates

•

JVM? not so much…

•

reality: scheduling batch is simple;
scheduling services is hard/expensive
Beyond Hadoop
Hadoop – an open source solution for fault-tolerant
parallel processing of batch jobs at scale, based on
commodity hardware… however, other priorities have
emerged for the analytics lifecycle:

•
•
•

apps require integration beyond Hadoop

•
•
•
•

higher utilization

multiple topologies, mixed workloads, multi-tenancy
significant disruptions in h/w cost/performance
curves
lower latency
highly-available, long running services
more than “Just JVM” – e.g., Python growth

keep in mind the priority for multi-disciplinary efforts,
to break down silos – extending beyond a de facto
“priesthood” of data engineering
“Return of the Borg”
Return of the Borg: How Twitter Rebuilt Google’s Secret Weapon
Cade Metz
wired.com/wiredenterprise/2013/03/googleborg-twitter-mesos
The Datacenter as a Computer: An Introduction
to the Design of Warehouse-Scale Machines
Luiz André Barroso, Urs Hölzle
research.google.com/pubs/pub35290.html
2011 GAFS Omega
John Wilkes, et al.
youtu.be/0ZFMlO98Jkc
“Return of the Borg”
Omega: flexible, scalable schedulers for large compute clusters
Malte Schwarzkopf, Andy Konwinski, Michael Abd-El-Malek, John Wilkes
eurosys2013.tudos.org/wp-content/uploads/2013/paper/
Schwarzkopf.pdf
Google describes the business case…
Taming Latency Variability
Jeff Dean
plus.google.com/u/0/+ResearchatGoogle/posts/C1dPhQhcDRv
Mesos – definitions
a common substrate for cluster computing
mesos.apache.org
heterogenous assets in your datacenter or cloud
made available as a homogenous set of resources

•
•
•
•
•
•
•
•

top-level Apache project
scalability to 10,000s of nodes
obviates the need for virtual machines
isolation (pluggable) for CPU, RAM, I/O, FS, etc.
fault-tolerant leader election based on ZooKeeper
APIs in C++, Java, Python
web UI for inspecting cluster state
available for Linux, OpenSolaris, Mac OSX
Mesos – architecture
given the use of Mesos as a Datacenter OS kernel…

•

Chronos provides complex scheduling capabilities,
much like a distributed Unix “cron”

•

Marathon provides highly-available long-running
services, much like a distributed Unix “init.d”

•

next time you need to build a distributed app,
consider using these as building blocks

a major lesson learned from Spark:

•

leveraging these kinds of building blocks,
one can rebuild Hadoop 100x faster,
in much less code
Mesos – architecture
services

batch

Workloads

Apps
Scalding

MPI

Impala

Hadoop

Shark

Spark

MySQL

Kafka

JBoss

Django

Chronos

Storm

Rails

Frameworks

Py

th
on

Marathon

C

++

JV

M

Kernel

distributed file system

distributed resources: CPU, RAM, I/O, FS, rack locality, etc.

DFS

Cluster
Mesos – datacenter OS stack

HADOOP
TELEMETRY

STORM

CHRONOS

RAILS

CAPACITY PLANNING SMARTER SCHEDULING

MESOS

SECURITY

JBOSS
GUI

Apps
OS
Kernel
Prior Practice: Dedicated Servers

DATACENTER

•
•

low utilization rates
longer time to ramp up new services
Prior Practice: Virtualization

DATACENTER

PROVISIONED VMS

•
•

even more machines to manage

•

VM licensing costs

substantial performance decrease
due to virtualization
Prior Practice: Static Partitioning

DATACENTER

STATIC PARTITIONING

•
•

even more machines to manage

•
•

VM licensing costs

substantial performance decrease
due to virtualization
static partitioning limits elasticity
Mesos: One Large Pool Of Resources

DATACENTER

MESOS

“We wanted people to be able to program
for the datacenter just like they program
for their laptop."
Ben Hindman
Arguments for Datacenter Computing
rather than running several specialized clusters, each
at relatively low utilization rates, instead run many
mixed workloads
obvious benefits are realized in terms of:

•
•
•

scalability, elasticity, fault tolerance, performance, utilization
reduced equipment cap­ex, Ops overhead, etc.
reduced licensing, eliminating need for VMs or potential
vendor lock­in

subtle benefits – arguably, more important for Enterprise IT:

•
•

reduced time for engineers to ramp­up new services at scale

•

enables Dev/Test apps to run safely on a Production cluster

reduced latency between batch and services, enabling new
high­ROI use cases
What are the costs of Virtualization?
benchmark
type

OpenVZ
improvement

mixed workloads

210%-300%

LAMP (related)

38%-200%

I/O throughput

200%-500%

response time

order magnitude

more pronounced
at higher loads
What are the costs of Single Tenancy?
MEMCACHED
CPU LOAD

RAILS CPU
LOAD

HADOOP CPU
LOAD

100%

100%

100%

75%

75%

75%

50%

50%

50%

25%

25%

25%

0%

0%

0%

t

t

COMBINED CPU LOAD (RAILS,
MEMCACHED, HADOOP)
100%

75%

50%

25%

0%

Hadoop
Memcached
Rails
Example: Resource Offer in a Two-Level Scheduler

mesos.apache.org/documentation/latest/mesos-architecture/
Example: Docker on Mesos

Mesos
master servers
Master
Docker
Registry

Mesos
slave servers

S
M

S
docker

S

…

index.docker.io

S
Marathon can launch and monitor
service containers from one or
more Docker registries, using
the Docker executor for Mesos

M

S
docker

S

…

marathon

S
M

…

…

S

S

…

docker

…

Local
Docker
Registry

S

S

S

…

( optional )

mesosphere.io/2013/09/26/docker-on-mesos/
Example: Docker on Mesos

Mesos Master Server
init
|
+ mesos-master
|
+ marathon
|

The executor, monitored by the
Mesos slave, delegates to the
local Docker daemon for image
discovery and management. The
executor communicates with
Marathon via the Mesos master
and ensures that Docker enforces
the specified resource limitations.

Mesos Slave Server
init
|
+ docker
| |
| + lxc
|
|
|
+ (user task, under container init system)
|
|
|
+ mesos-slave
| |
| + /var/lib/mesos/executors/docker
| | |
| | + docker run …
| | |

mesosphere.io/2013/09/26/docker-on-mesos/
Example: Docker on Mesos

Mesos Master Server
init
|
+ mesos-master
|
+ marathon
|

Mesos, LXC, and
Docker are tied
together for launch

Mesos Slave Server
3
2

1
6
When a user requests
a container…

Docker
Registry

7

5
init
|
+ docker
8
| |
| + lxc
|
|
|
+ (user task, under container init system)
|
|
|
4
+ mesos-slave
| |
| + /var/lib/mesos/executors/docker
| | |
| | + docker run …
| | |

mesosphere.io/2013/09/26/docker-on-mesos/
Production Deployments (public)
Opposite Ends of the Spectrum, One Common Substrate

Solaris Zones
Built-in /
bare metal
Linux CGroups

Hypervisors
Opposite Ends of the Spectrum, One Common Substrate

Request /
Response

Batch
Case Study: Twitter (bare metal / on premise)
“Mesos is the cornerstone of our elastic compute infrastructure –
it’s how we build all our new services and is critical for Twitter’s
continued success at scale. It's one of the primary keys to our
data center efficiency."
Chris Fry, SVP Engineering
blog.twitter.com/2013/mesos-graduates-from-apache-incubation

•

key services run in production: analytics, typeahead, ads

•

Twitter engineers rely on Mesos to build all new services

•

instead of thinking about static machines, engineers think
about resources like CPU, memory and disk

•

allows services to scale and leverage a shared pool of
servers across datacenters efficiently

•

reduces the time between prototyping and launching
Case Study: Airbnb (fungible cloud infrastructure)
“We think we might be pushing data science in the field of travel
more so than anyone has ever done before… a smaller number
of engineers can have higher impact through automation on
Mesos."
Mike Curtis,VP Engineering
gigaom.com/2013/07/29/airbnb-is-engineering-itself-into-a-data...

•

improves resource management and efficiency

•

helps advance engineering strategy of building small teams
that can move fast

•

key to letting engineers make the most of AWS-based
infrastructure beyond just Hadoop

•

allowed company to migrate off Elastic MapReduce

•

enables use of Hadoop along with Chronos, Spark, Storm, etc.
Media Coverage
Mesosphere Adds Docker Support To Its Mesos-Based Operating System For The Data Center
Frederic Lardinois
TechCrunch (2013-09-26)
techcrunch.com/2013/09/26/mesosphere...
Play Framework Grid Deployment with Mesos
James Ward, Flo Leibert, et al.
Typesafe blog (2013-09-19)
typesafe.com/blog/play-framework-grid...
Mesosphere Launches Marathon Framework
Adrian Bridgwater
Dr. Dobbs (2013-09-18)
drdobbs.com/open-source/mesosphere...
New open source tech Marathon wants to make your data center run like Google’s
Derrick Harris
GigaOM (2013-09-04)
gigaom.com/2013/09/04/new-open-source...
Running batch and long-running, highly available service jobs on the same cluster
Ben Lorica
O’Reilly (2013-09-01)
strata.oreilly.com/2013/09/running-batch...
Resources
Apache Mesos Project
mesos.apache.org
Mesosphere
mesosphere.io
Tutorials
mesosphere.io/learn
Documentation
mesos.apache.org/documentation
2011 USENIX Research Paper
usenix.org/legacy/event/nsdi11/tech/full_papers/
Hindman_new.pdf
Collected Notes/Archives
goo.gl/jPtTP
Join us!
Mesos Unconference @Twitter HQ, SF
Nov 19
eventbrite.com/event/9104464699
Data Day Texas, Austin
Jan 11
2014.datadaytexas.com
(incl. Mesos talk+BOF)
O’Reilly Strata, Santa Clara
Feb 11-13
strataconf.com/strata2014
(incl. Mesos talk+BOF+workshop)
Enterprise Data Workflows with Cascading
O’Reilly, 2013
shop.oreilly.com/product/
0636920028536.do
monthly newsletter for updates,
events, conference summaries, etc.:
liber118.com/pxn/

Contenu connexe

Plus de Paco Nathan

Humans in the loop: AI in open source and industry
Humans in the loop: AI in open source and industryHumans in the loop: AI in open source and industry
Humans in the loop: AI in open source and industryPaco Nathan
 
Computable Content
Computable ContentComputable Content
Computable ContentPaco Nathan
 
Computable Content: Lessons Learned
Computable Content: Lessons LearnedComputable Content: Lessons Learned
Computable Content: Lessons LearnedPaco Nathan
 
SF Python Meetup: TextRank in Python
SF Python Meetup: TextRank in PythonSF Python Meetup: TextRank in Python
SF Python Meetup: TextRank in PythonPaco Nathan
 
Use of standards and related issues in predictive analytics
Use of standards and related issues in predictive analyticsUse of standards and related issues in predictive analytics
Use of standards and related issues in predictive analyticsPaco Nathan
 
Data Science in 2016: Moving Up
Data Science in 2016: Moving UpData Science in 2016: Moving Up
Data Science in 2016: Moving UpPaco Nathan
 
Data Science Reinvents Learning?
Data Science Reinvents Learning?Data Science Reinvents Learning?
Data Science Reinvents Learning?Paco Nathan
 
Jupyter for Education: Beyond Gutenberg and Erasmus
Jupyter for Education: Beyond Gutenberg and ErasmusJupyter for Education: Beyond Gutenberg and Erasmus
Jupyter for Education: Beyond Gutenberg and ErasmusPaco Nathan
 
GalvanizeU Seattle: Eleven Almost-Truisms About Data
GalvanizeU Seattle: Eleven Almost-Truisms About DataGalvanizeU Seattle: Eleven Almost-Truisms About Data
GalvanizeU Seattle: Eleven Almost-Truisms About DataPaco Nathan
 
Microservices, containers, and machine learning
Microservices, containers, and machine learningMicroservices, containers, and machine learning
Microservices, containers, and machine learningPaco Nathan
 
GraphX: Graph analytics for insights about developer communities
GraphX: Graph analytics for insights about developer communitiesGraphX: Graph analytics for insights about developer communities
GraphX: Graph analytics for insights about developer communitiesPaco Nathan
 
Graph Analytics in Spark
Graph Analytics in SparkGraph Analytics in Spark
Graph Analytics in SparkPaco Nathan
 
Apache Spark and the Emerging Technology Landscape for Big Data
Apache Spark and the Emerging Technology Landscape for Big DataApache Spark and the Emerging Technology Landscape for Big Data
Apache Spark and the Emerging Technology Landscape for Big DataPaco Nathan
 
QCon São Paulo: Real-Time Analytics with Spark Streaming
QCon São Paulo: Real-Time Analytics with Spark StreamingQCon São Paulo: Real-Time Analytics with Spark Streaming
QCon São Paulo: Real-Time Analytics with Spark StreamingPaco Nathan
 
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and More
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and MoreStrata 2015 Data Preview: Spark, Data Visualization, YARN, and More
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and MorePaco Nathan
 
A New Year in Data Science: ML Unpaused
A New Year in Data Science: ML UnpausedA New Year in Data Science: ML Unpaused
A New Year in Data Science: ML UnpausedPaco Nathan
 
Microservices, Containers, and Machine Learning
Microservices, Containers, and Machine LearningMicroservices, Containers, and Machine Learning
Microservices, Containers, and Machine LearningPaco Nathan
 
Databricks Meetup @ Los Angeles Apache Spark User Group
Databricks Meetup @ Los Angeles Apache Spark User GroupDatabricks Meetup @ Los Angeles Apache Spark User Group
Databricks Meetup @ Los Angeles Apache Spark User GroupPaco Nathan
 
How Apache Spark fits into the Big Data landscape
How Apache Spark fits into the Big Data landscapeHow Apache Spark fits into the Big Data landscape
How Apache Spark fits into the Big Data landscapePaco Nathan
 
What's new with Apache Spark?
What's new with Apache Spark?What's new with Apache Spark?
What's new with Apache Spark?Paco Nathan
 

Plus de Paco Nathan (20)

Humans in the loop: AI in open source and industry
Humans in the loop: AI in open source and industryHumans in the loop: AI in open source and industry
Humans in the loop: AI in open source and industry
 
Computable Content
Computable ContentComputable Content
Computable Content
 
Computable Content: Lessons Learned
Computable Content: Lessons LearnedComputable Content: Lessons Learned
Computable Content: Lessons Learned
 
SF Python Meetup: TextRank in Python
SF Python Meetup: TextRank in PythonSF Python Meetup: TextRank in Python
SF Python Meetup: TextRank in Python
 
Use of standards and related issues in predictive analytics
Use of standards and related issues in predictive analyticsUse of standards and related issues in predictive analytics
Use of standards and related issues in predictive analytics
 
Data Science in 2016: Moving Up
Data Science in 2016: Moving UpData Science in 2016: Moving Up
Data Science in 2016: Moving Up
 
Data Science Reinvents Learning?
Data Science Reinvents Learning?Data Science Reinvents Learning?
Data Science Reinvents Learning?
 
Jupyter for Education: Beyond Gutenberg and Erasmus
Jupyter for Education: Beyond Gutenberg and ErasmusJupyter for Education: Beyond Gutenberg and Erasmus
Jupyter for Education: Beyond Gutenberg and Erasmus
 
GalvanizeU Seattle: Eleven Almost-Truisms About Data
GalvanizeU Seattle: Eleven Almost-Truisms About DataGalvanizeU Seattle: Eleven Almost-Truisms About Data
GalvanizeU Seattle: Eleven Almost-Truisms About Data
 
Microservices, containers, and machine learning
Microservices, containers, and machine learningMicroservices, containers, and machine learning
Microservices, containers, and machine learning
 
GraphX: Graph analytics for insights about developer communities
GraphX: Graph analytics for insights about developer communitiesGraphX: Graph analytics for insights about developer communities
GraphX: Graph analytics for insights about developer communities
 
Graph Analytics in Spark
Graph Analytics in SparkGraph Analytics in Spark
Graph Analytics in Spark
 
Apache Spark and the Emerging Technology Landscape for Big Data
Apache Spark and the Emerging Technology Landscape for Big DataApache Spark and the Emerging Technology Landscape for Big Data
Apache Spark and the Emerging Technology Landscape for Big Data
 
QCon São Paulo: Real-Time Analytics with Spark Streaming
QCon São Paulo: Real-Time Analytics with Spark StreamingQCon São Paulo: Real-Time Analytics with Spark Streaming
QCon São Paulo: Real-Time Analytics with Spark Streaming
 
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and More
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and MoreStrata 2015 Data Preview: Spark, Data Visualization, YARN, and More
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and More
 
A New Year in Data Science: ML Unpaused
A New Year in Data Science: ML UnpausedA New Year in Data Science: ML Unpaused
A New Year in Data Science: ML Unpaused
 
Microservices, Containers, and Machine Learning
Microservices, Containers, and Machine LearningMicroservices, Containers, and Machine Learning
Microservices, Containers, and Machine Learning
 
Databricks Meetup @ Los Angeles Apache Spark User Group
Databricks Meetup @ Los Angeles Apache Spark User GroupDatabricks Meetup @ Los Angeles Apache Spark User Group
Databricks Meetup @ Los Angeles Apache Spark User Group
 
How Apache Spark fits into the Big Data landscape
How Apache Spark fits into the Big Data landscapeHow Apache Spark fits into the Big Data landscape
How Apache Spark fits into the Big Data landscape
 
What's new with Apache Spark?
What's new with Apache Spark?What's new with Apache Spark?
What's new with Apache Spark?
 

Dernier

08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 

Dernier (20)

08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

CHUG: Datacenter Computing with Apache Mesos

  • 1. Datacenter Computing with Apache Mesos CHUG @ Sears Holdings Corp, 2013-11-06 Paco Nathan, chief scientist http://Mesosphere.io @pacoid
  • 2. The Modern Kernel: Top Linux Contributors…
  • 3. Beyond Hadoop Google has been doing datacenter computing for years, to address the complexities of large-scale data workflows: • leveraging the modern kernel: isolation in lieu of VMs • “most (>80%) jobs are batch jobs, but the majority of resources (55–80%) are allocated to service jobs” • mixed workloads, multi-tenancy • relatively high utilization rates • JVM? not so much… • reality: scheduling batch is simple; scheduling services is hard/expensive
  • 4. Beyond Hadoop Hadoop – an open source solution for fault-tolerant parallel processing of batch jobs at scale, based on commodity hardware… however, other priorities have emerged for the analytics lifecycle: • • • apps require integration beyond Hadoop • • • • higher utilization multiple topologies, mixed workloads, multi-tenancy significant disruptions in h/w cost/performance curves lower latency highly-available, long running services more than “Just JVM” – e.g., Python growth keep in mind the priority for multi-disciplinary efforts, to break down silos – extending beyond a de facto “priesthood” of data engineering
  • 5. “Return of the Borg” Return of the Borg: How Twitter Rebuilt Google’s Secret Weapon Cade Metz wired.com/wiredenterprise/2013/03/googleborg-twitter-mesos The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines Luiz André Barroso, Urs Hölzle research.google.com/pubs/pub35290.html 2011 GAFS Omega John Wilkes, et al. youtu.be/0ZFMlO98Jkc
  • 6. “Return of the Borg” Omega: flexible, scalable schedulers for large compute clusters Malte Schwarzkopf, Andy Konwinski, Michael Abd-El-Malek, John Wilkes eurosys2013.tudos.org/wp-content/uploads/2013/paper/ Schwarzkopf.pdf
  • 7. Google describes the business case… Taming Latency Variability Jeff Dean plus.google.com/u/0/+ResearchatGoogle/posts/C1dPhQhcDRv
  • 8. Mesos – definitions a common substrate for cluster computing mesos.apache.org heterogenous assets in your datacenter or cloud made available as a homogenous set of resources • • • • • • • • top-level Apache project scalability to 10,000s of nodes obviates the need for virtual machines isolation (pluggable) for CPU, RAM, I/O, FS, etc. fault-tolerant leader election based on ZooKeeper APIs in C++, Java, Python web UI for inspecting cluster state available for Linux, OpenSolaris, Mac OSX
  • 9. Mesos – architecture given the use of Mesos as a Datacenter OS kernel… • Chronos provides complex scheduling capabilities, much like a distributed Unix “cron” • Marathon provides highly-available long-running services, much like a distributed Unix “init.d” • next time you need to build a distributed app, consider using these as building blocks a major lesson learned from Spark: • leveraging these kinds of building blocks, one can rebuild Hadoop 100x faster, in much less code
  • 11. Mesos – datacenter OS stack HADOOP TELEMETRY STORM CHRONOS RAILS CAPACITY PLANNING SMARTER SCHEDULING MESOS SECURITY JBOSS GUI Apps OS Kernel
  • 12. Prior Practice: Dedicated Servers DATACENTER • • low utilization rates longer time to ramp up new services
  • 13. Prior Practice: Virtualization DATACENTER PROVISIONED VMS • • even more machines to manage • VM licensing costs substantial performance decrease due to virtualization
  • 14. Prior Practice: Static Partitioning DATACENTER STATIC PARTITIONING • • even more machines to manage • • VM licensing costs substantial performance decrease due to virtualization static partitioning limits elasticity
  • 15. Mesos: One Large Pool Of Resources DATACENTER MESOS “We wanted people to be able to program for the datacenter just like they program for their laptop." Ben Hindman
  • 16. Arguments for Datacenter Computing rather than running several specialized clusters, each at relatively low utilization rates, instead run many mixed workloads obvious benefits are realized in terms of: • • • scalability, elasticity, fault tolerance, performance, utilization reduced equipment cap­ex, Ops overhead, etc. reduced licensing, eliminating need for VMs or potential vendor lock­in subtle benefits – arguably, more important for Enterprise IT: • • reduced time for engineers to ramp­up new services at scale • enables Dev/Test apps to run safely on a Production cluster reduced latency between batch and services, enabling new high­ROI use cases
  • 17. What are the costs of Virtualization? benchmark type OpenVZ improvement mixed workloads 210%-300% LAMP (related) 38%-200% I/O throughput 200%-500% response time order magnitude more pronounced at higher loads
  • 18. What are the costs of Single Tenancy? MEMCACHED CPU LOAD RAILS CPU LOAD HADOOP CPU LOAD 100% 100% 100% 75% 75% 75% 50% 50% 50% 25% 25% 25% 0% 0% 0% t t COMBINED CPU LOAD (RAILS, MEMCACHED, HADOOP) 100% 75% 50% 25% 0% Hadoop Memcached Rails
  • 19. Example: Resource Offer in a Two-Level Scheduler mesos.apache.org/documentation/latest/mesos-architecture/
  • 20. Example: Docker on Mesos Mesos master servers Master Docker Registry Mesos slave servers S M S docker S … index.docker.io S Marathon can launch and monitor service containers from one or more Docker registries, using the Docker executor for Mesos M S docker S … marathon S M … … S S … docker … Local Docker Registry S S S … ( optional ) mesosphere.io/2013/09/26/docker-on-mesos/
  • 21. Example: Docker on Mesos Mesos Master Server init | + mesos-master | + marathon | The executor, monitored by the Mesos slave, delegates to the local Docker daemon for image discovery and management. The executor communicates with Marathon via the Mesos master and ensures that Docker enforces the specified resource limitations. Mesos Slave Server init | + docker | | | + lxc | | | + (user task, under container init system) | | | + mesos-slave | | | + /var/lib/mesos/executors/docker | | | | | + docker run … | | | mesosphere.io/2013/09/26/docker-on-mesos/
  • 22. Example: Docker on Mesos Mesos Master Server init | + mesos-master | + marathon | Mesos, LXC, and Docker are tied together for launch Mesos Slave Server 3 2 1 6 When a user requests a container… Docker Registry 7 5 init | + docker 8 | | | + lxc | | | + (user task, under container init system) | | | 4 + mesos-slave | | | + /var/lib/mesos/executors/docker | | | | | + docker run … | | | mesosphere.io/2013/09/26/docker-on-mesos/
  • 24. Opposite Ends of the Spectrum, One Common Substrate Solaris Zones Built-in / bare metal Linux CGroups Hypervisors
  • 25. Opposite Ends of the Spectrum, One Common Substrate Request / Response Batch
  • 26. Case Study: Twitter (bare metal / on premise) “Mesos is the cornerstone of our elastic compute infrastructure – it’s how we build all our new services and is critical for Twitter’s continued success at scale. It's one of the primary keys to our data center efficiency." Chris Fry, SVP Engineering blog.twitter.com/2013/mesos-graduates-from-apache-incubation • key services run in production: analytics, typeahead, ads • Twitter engineers rely on Mesos to build all new services • instead of thinking about static machines, engineers think about resources like CPU, memory and disk • allows services to scale and leverage a shared pool of servers across datacenters efficiently • reduces the time between prototyping and launching
  • 27. Case Study: Airbnb (fungible cloud infrastructure) “We think we might be pushing data science in the field of travel more so than anyone has ever done before… a smaller number of engineers can have higher impact through automation on Mesos." Mike Curtis,VP Engineering gigaom.com/2013/07/29/airbnb-is-engineering-itself-into-a-data... • improves resource management and efficiency • helps advance engineering strategy of building small teams that can move fast • key to letting engineers make the most of AWS-based infrastructure beyond just Hadoop • allowed company to migrate off Elastic MapReduce • enables use of Hadoop along with Chronos, Spark, Storm, etc.
  • 28. Media Coverage Mesosphere Adds Docker Support To Its Mesos-Based Operating System For The Data Center Frederic Lardinois TechCrunch (2013-09-26) techcrunch.com/2013/09/26/mesosphere... Play Framework Grid Deployment with Mesos James Ward, Flo Leibert, et al. Typesafe blog (2013-09-19) typesafe.com/blog/play-framework-grid... Mesosphere Launches Marathon Framework Adrian Bridgwater Dr. Dobbs (2013-09-18) drdobbs.com/open-source/mesosphere... New open source tech Marathon wants to make your data center run like Google’s Derrick Harris GigaOM (2013-09-04) gigaom.com/2013/09/04/new-open-source... Running batch and long-running, highly available service jobs on the same cluster Ben Lorica O’Reilly (2013-09-01) strata.oreilly.com/2013/09/running-batch...
  • 29. Resources Apache Mesos Project mesos.apache.org Mesosphere mesosphere.io Tutorials mesosphere.io/learn Documentation mesos.apache.org/documentation 2011 USENIX Research Paper usenix.org/legacy/event/nsdi11/tech/full_papers/ Hindman_new.pdf Collected Notes/Archives goo.gl/jPtTP
  • 30. Join us! Mesos Unconference @Twitter HQ, SF Nov 19 eventbrite.com/event/9104464699 Data Day Texas, Austin Jan 11 2014.datadaytexas.com (incl. Mesos talk+BOF) O’Reilly Strata, Santa Clara Feb 11-13 strataconf.com/strata2014 (incl. Mesos talk+BOF+workshop)
  • 31. Enterprise Data Workflows with Cascading O’Reilly, 2013 shop.oreilly.com/product/ 0636920028536.do monthly newsletter for updates, events, conference summaries, etc.: liber118.com/pxn/