2. What is DCOS?
• Some have declared that “the datacenter is the new computer”
• Claim: this new computer increasingly needs an operating system
• Not necessarily a new host OS, but a common software layer that manages resources and provides shared services for the whole datacenter, like an OS does for one host
3. Why Datacenters Need an OS
• Growing number of applications
– Parallel processing systems: MapReduce, Dryad, Pregel, Percolator, Dremel, MR Online
– Storage systems: GFS, BigTable, Dynamo, SCADS
– Web apps and supporting services
• Growing number of users
– 200+ users of Facebook’s Hadoop data warehouse, running near-interactive ad hoc queries
4. What Operating Systems Provide
• Resource sharing across applications & users
• Data sharing between programs
• Programming abstractions (e.g. threads, IPC)
• Debugging facilities (e.g. ptrace, gdb)
Result: OSes enable a highly interoperable software ecosystem that we now take for granted
5. Today’s Datacenter OS
• Hadoop MapReduce as common execution and resource sharing platform
• Hadoop InputFormat API for data sharing
• Abstractions for productivity programmers, but not for system builders
• Very challenging to debug across all the layers
6. Tomorrow’s Datacenter OS
• Resource sharing:
– Lower-level interfaces for fine-grained sharing (Mesos is a first step in this direction)
– Optimization for a variety of metrics (e.g. energy)
– Integration with network scheduling mechanisms (e.g. Seawall [NSDI ’11], NOX, Orchestra)
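Mesos-style fine-grained sharing works through two-level scheduling: the master decides which framework sees each resource offer, and the framework decides whether to use it. The sketch below is a toy illustration of that idea; the class and method names (`Master`, `Framework`, `Offer`, `on_offer`) are hypothetical and not the real Mesos API.

```python
# Toy sketch of two-level scheduling via resource offers (Mesos-like idea).
# All names here are illustrative, not the real Mesos interfaces.
from dataclasses import dataclass

@dataclass
class Offer:
    node: str
    cpus: int
    mem_gb: int

class Framework:
    def __init__(self, name, cpus_needed):
        self.name = name
        self.cpus_needed = cpus_needed
        self.launched = []

    def on_offer(self, offer):
        # Second-level decision: the framework, not the master,
        # decides which offered resources to accept.
        if offer.cpus >= self.cpus_needed:
            self.launched.append(offer.node)
            return True   # accept the offer
        return False      # decline; the master can re-offer elsewhere

class Master:
    def __init__(self, offers):
        self.offers = offers

    def schedule(self, frameworks):
        # First-level decision: the master chooses which framework
        # sees each offer (simple round-robin here).
        for i, offer in enumerate(self.offers):
            frameworks[i % len(frameworks)].on_offer(offer)

offers = [Offer("node1", 4, 8), Offer("node2", 2, 4)]
spark, mr = Framework("spark", 4), Framework("mapreduce", 2)
Master(offers).schedule([spark, mr])
print(spark.launched, mr.launched)  # ['node1'] ['node2']
```

The point of the split is that the master stays simple and framework-agnostic, while each framework keeps its own scheduling logic (data locality, gang scheduling, etc.).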
7. Tomorrow’s Datacenter OS
• Data sharing:
– Standard interfaces for cluster file systems, key-value stores, etc.
– In-memory data sharing (e.g. Spark, DFS cache), and a unified system to manage this memory
– Streaming data abstractions (analogous to pipes)
– Lineage instead of replication for reliability (RDDs)
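The lineage idea behind RDDs: instead of replicating every partition, remember how it was derived and recompute it on loss. A minimal sketch, where the `Dataset` class is illustrative and not Spark's actual API:

```python
# Minimal sketch of lineage-based recovery in the spirit of RDDs:
# a lost or unmaterialized partition is rebuilt from its parent,
# rather than fetched from a replica.
class Dataset:
    def __init__(self, partitions=None, parent=None, fn=None):
        self._parts = partitions   # cached partition data (may be absent)
        self.parent = parent       # lineage: where this data came from
        self.fn = fn               # lineage: how to derive it

    def map(self, fn):
        # Transformations are lazy: we record lineage, not data.
        return Dataset(parent=self, fn=fn)

    def partition(self, i):
        if self._parts is not None:
            return self._parts[i]
        # Not materialized: recompute this partition from lineage.
        return [self.fn(x) for x in self.parent.partition(i)]

base = Dataset(partitions=[[1, 2], [3, 4]])
squared = base.map(lambda x: x * x)
print(squared.partition(1))  # recomputed from lineage -> [9, 16]
```

Because each partition records a deterministic derivation, recovery costs recomputation time instead of the storage and bandwidth of replication.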
8. Tomorrow’s Datacenter OS
• Programming abstractions:
– Tools that can be used to build the next MapReduce / BigTable in a week (e.g. BOOM)
– Efficient implementations of communication primitives (e.g. shuffle, broadcast)
– New distributed programming models
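As a concrete example of a communication primitive, a shuffle redistributes key-value pairs from map outputs into reduce partitions so that all values for a key land together. The sketch below shows the logic only; real implementations are about making the transfer itself efficient (`shuffle` and its arguments here are hypothetical names):

```python
# Toy shuffle primitive: hash-partition key-value pairs from M map
# outputs into R reduce partitions, grouping values by key.
from collections import defaultdict

def shuffle(map_outputs, num_reducers):
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for output in map_outputs:
        for key, value in output:
            # Hash partitioning: a given key always lands on the
            # same reducer, so its values end up grouped together.
            partitions[hash(key) % num_reducers][key].append(value)
    return partitions

map_outputs = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]
parts = shuffle(map_outputs, num_reducers=2)
# every key appears in exactly one partition; "a" carries values [1, 3]
```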
9. Tomorrow’s Datacenter OS
• Debugging facilities:
– Tracing and debugging tools that work across the cluster software stack (e.g. X-Trace, Dapper)
– Replay debugging that takes advantage of limited languages / computational models
– Unified monitoring infrastructure and APIs
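The core mechanism behind X-Trace/Dapper-style cross-stack tracing is simple: mint a trace id at the top of the stack, propagate it across every layer boundary, and record each layer's events against it. A minimal sketch with hypothetical layer names (`frontend`, `query`, `storage`):

```python
# Minimal sketch of Dapper/X-Trace-style tracing: one trace id
# follows a request through every layer of the stack.
import uuid

TRACE_LOG = []   # in a real system, a separate collector service

def record(trace_id, layer, event):
    TRACE_LOG.append((trace_id, layer, event))

def storage_read(trace_id, key):
    record(trace_id, "storage", f"read {key}")
    return f"value-of-{key}"

def query_engine(trace_id, key):
    record(trace_id, "query", f"lookup {key}")
    # The trace id crosses the layer boundary with the call.
    return storage_read(trace_id, key)

def handle_request(key):
    trace_id = uuid.uuid4().hex   # minted once, at the top of the stack
    record(trace_id, "frontend", f"request {key}")
    return trace_id, query_engine(trace_id, key)

tid, result = handle_request("user42")
spans = [(layer, ev) for t, layer, ev in TRACE_LOG if t == tid]
print(spans)  # frontend -> query -> storage, all under one trace id
```

With the id threaded through, one query can be followed end to end without logging into individual nodes, which is exactly the debugging gap the slide points at.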
10. Putting It All Together
• A successful datacenter OS might let users:
– Build a Hadoop-like software stack in a week using the OS’s abstractions, while gaining other benefits (e.g. cross-stack replay debugging)
– Share data efficiently between independently developed programming models and applications
– Understand cluster behavior without having to log into individual nodes
– Dynamically share the cluster with other users
11. Future of DCOS
• Focus on paradigms, not only performance
– Industry is spending a lot of time on performance
• Explore clean-slate approaches
– Much datacenter software is written from scratch
– People using Erlang, Scala, functional models (MR)
• Bring cluster computing to non-experts
– Most impactful (datacenter as the new workstation)
– Hard to make a Google-scale stack usable without a Google-scale ops team
12. Conclusion
• Datacenters need an OS-like software stack for the same reasons single computers did: manageability, efficiency & programmability
• An OS is already emerging in an ad hoc way
• Researchers can help by taking a long-term approach to these problems
Editor’s notes
Doesn’t have to be a host OS, but rather a software stack that performs the same functions as the host OS on a single computer
Point out that apps are developed independently and assume they have dedicated (slices of) machines
Go back to DC being the new computer
Mention lower level storage interfaces such as block store
Note about how it may be easier to have impact here than in a “real” OS