Datacenter Computing
Trends and Problems: A Survey


       Partha Kundu
   Sr. Distinguished Engineer
      Corporate CTO Office




       Special Session, May 3
             NOCS 2011
        Pittsburgh, PA, USA
Data center computing is a new paradigm!

               Partha Kundu   Special Session NOCS 2011   2
Outline of talk

 Power & Energy in Data Centers

 Network architecture

 Protocol interactions

 Conclusions

Power & Energy in the Data Center

Data Center Energy Breakdown and Server Peak Power Usage Profile

[Figure: data center energy breakdown. Source: ASHRAE]
[Figure: server peak power usage profile. Source: Google 2007]

• Power delivery and cooling overheads are quantified in the PUE metric
• Cooling is the most significant source of energy inefficiency
• CPU power contribution is less than 1/3 of server power

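The PUE metric mentioned above is the ratio of total facility power to the power that reaches the IT equipment. The following is a minimal sketch of that ratio; the function names and the sample wattages are illustrative assumptions, not figures from the talk.

```python
def pue(total_facility_kw, it_equipment_kw):
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def overhead_fraction(pue_value):
    """Share of facility power spent on delivery and cooling rather than IT."""
    return 1.0 - 1.0 / pue_value


# Hypothetical facility: 2,000 kW drawn from the grid, 1,000 kW reaching servers.
example = pue(2000.0, 1000.0)
print(f"PUE = {example:.2f}, overhead share = {overhead_fraction(example):.0%}")
```

A PUE of 1.0 would mean no delivery or cooling overhead at all; anything above it is overhead.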
Energy Efficiency

[Figure. Source: Barroso, Hölzle: The Datacenter as a Computer, Morgan & Claypool, 2009]

• Servers are never completely idle
• Most of the time, server load is around 30%
• But a server is least energy efficient in its most common operating region!

Dynamic Power Range

CPU power component (peak & idle) in servers has reduced over the years
[Figure. Source: Barroso, Hölzle: The Datacenter as a Computer, Morgan & Claypool, 2009]

Dynamic power range:
• CPU power range is 3x for servers
• DRAM range is 2x
• Disk and networking is < 1.2x

Disk and network switches need to learn from the CPU's power proportionality gains

Energy Proportionality

Goal: achieve the best energy efficiency (~80%) in the common operating regions (20-30% load)

Challenges to proportionality:
• Most proportionality tricks from embedded/mobile devices are not usable in the DC due to huge activation penalties
• The distributed structure of data and applications doesn't allow powering down during low use
• Disk drives spin >50% of the time even when there is no activity
      [Sankar et al, ISCA '08]: lower rotational speeds, multiple heads

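The goal above (~80% efficiency at 20-30% load) can be illustrated with a simple linear power model. This is a sketch under stated assumptions: power is assumed to grow linearly from an idle floor to peak, and the idle fractions used (50% and 10% of peak) are illustrative values, not numbers from the slides.

```python
def relative_power(load, idle_fraction):
    """Power as a fraction of peak, assuming a linear rise from idle to peak."""
    return idle_fraction + (1.0 - idle_fraction) * load


def energy_efficiency(load, idle_fraction):
    """Useful work per unit of power, normalized so that full load scores 1.0."""
    return load / relative_power(load, idle_fraction)


# Compare an assumed poorly proportional server with a more proportional one
# at the ~30% load the talk calls the common operating region.
for idle in (0.5, 0.1):
    eff = energy_efficiency(0.3, idle)
    print(f"idle at {idle:.0%} of peak -> efficiency at 30% load: {eff:.0%}")
```

Under this model, reaching the slide's target in the common operating region requires pushing idle power down to roughly a tenth of peak.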
Application Behavior in Data Centers

[Figure. Source: Kozyrakis et al, IEEE Micro 2010]

• Cosmos is similar to a data mining workload
• Bing preloads the web index in memory
• But peak disk bandwidth can be high

Significant variation in disk, memory and network capacity and bandwidth usage across apps

Dynamic Resource Requirements in the Data Center

[Figure, left panel: intra-server variation (TPC-H, log scale). Y axis: server memory allocation, 0.1MB to 100GB. X axis: query, Q1 to Q12]
[Figure, right panel: inter-server variation (rendering farm). X axis: time]

Huge variations even within a single application running in a large cluster

Motivating Disaggregated Memory*
*Lim et al: Disaggregated Memory for Expansion and Sharing in Blade Servers, ISCA 2009

[Diagram: conventional blade systems. Four blades, each with CPUs and four local DIMMs, connected by a backplane]

Disaggregated Memory*
*Lim et al: Disaggregated Memory for Expansion and Sharing in Blade Servers, ISCA 2009

[Diagram: blade systems with disaggregated memory. Four blades, each with CPUs and two local DIMMs, connected by a backplane to a shared memory blade holding eight DIMMs]

• Break CPU-memory co-location
• Leverage fast, shared communication fabrics

Disaggregated Memory*
*Lim et al: Disaggregated Memory for Expansion and Sharing in Blade Servers, ISCA 2009

[Diagram: blade systems with disaggregated memory, as on the previous slide]

Authors claim:
 8x improvement on memory-constrained environments
 80+% improvement in performance per $
 3x consolidation

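One way to see why sharing memory across blades can help is to compare provisioning every blade for its own peak against provisioning a pool for the peak of the combined demand. The sketch below is a simplified illustration, not the design from Lim et al.; the demand traces are invented, and it ignores the local DIMMs each blade keeps in the diagrams.

```python
def provisioned_memory_gb(per_blade_demand_gb):
    """Return (per-blade provisioning, pooled provisioning) in GB.

    per_blade_demand_gb: one equal-length demand trace per blade.
    """
    # Conventional blades: each one is sized for its own worst case.
    per_blade = sum(max(trace) for trace in per_blade_demand_gb)
    # Pooled memory: sized for the worst case of the summed demand.
    pooled = max(sum(step) for step in zip(*per_blade_demand_gb))
    return per_blade, pooled


# Hypothetical traces: each blade peaks at 16 GB, but at a different time.
traces = [
    [4, 16, 4, 4],
    [4, 4, 16, 4],
    [16, 4, 4, 4],
    [4, 4, 4, 16],
]
conventional, pooled = provisioned_memory_gb(traces)
print(f"per-blade: {conventional} GB, pooled: {pooled} GB")
```

The gap between the two numbers shrinks as the blades' peaks become more correlated, so the benefit depends on the workload mix.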
Disaggregated Server

Servers with consolidated DRAM, disk drives, power supply and fabric connectivity

High Density, Low Power SM10000 Servers*
• Designed to replace 40 1 RU servers in a single 10 RU system
• 512 1.66 GHz 64-bit x86 Intel Atom cores in 10 RU; 2,048 CPUs/rack
• 1.28 Terabit interconnect fabric
• Up to 64 1 Gbps or 16 10 Gbps uplinks
• 0-64 SATA SSD/hard disks
• Integrated load balancing, Ethernet switching, and server management
• Uses less than 2.5 kW of power

Claim: achieves 4x space & power consolidation

[Image: SeaMicro SM10000 server*]
*Source: SeaMicro URL http://www.seamicro.com/?q=node/102

Network Architecture

Requirements of a Cloud-enabled Data Center

Economic & Technical Motivations:

 Use commodity hardware & components

 Dynamically distribute compute resources

[Figure labels: Capacity re-allocation; Economies of Scale]

Status Quo: Conventional DC Network

[Diagram: the Internet connects to Core Routers (DC Layer 3), which connect to Access Routers; below them, a tree of Ethernet switches (DC Layer 2) leads to racks of application servers. ~1,000 servers/pod == IP subnet]

Key:
• CR = Core Router (L3)
• AR = Access Router (L3)
• S = Ethernet Switch (L2)
• A = Rack of app. servers

Ref: “Data Center: Load balancing Data Center Services”, Cisco 2004

Conventional DC Network Problems

[Diagram: the same CR/AR/S tree, annotated with oversubscription ratios of ~5:1, ~40:1 and ~200:1, increasing toward the core routers]

• Cost of network equipment is prohibitive
• Limited server-to-server capacity

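To make "limited server-to-server capacity" concrete, the sketch below converts the oversubscription ratios from the diagram into worst-case per-server bandwidth. It assumes a 1 Gbps server NIC and treats a ratio of N:1 as meaning that, when every server sends across that layer at once, each gets 1/N of its NIC rate; both are illustrative assumptions rather than figures from the talk.

```python
def worst_case_bandwidth_mbps(nic_mbps, oversubscription):
    """Per-server bandwidth when all servers send across an N:1 layer at once."""
    if oversubscription < 1:
        raise ValueError("oversubscription ratio must be at least 1")
    return nic_mbps / oversubscription


NIC_MBPS = 1000.0  # assumed 1 Gbps server NIC

# Ratios as annotated on the slide's diagram.
for ratio in (5, 40, 200):
    mbps = worst_case_bandwidth_mbps(NIC_MBPS, ratio)
    print(f"{ratio:>3}:1 -> {mbps:g} Mbps per server")
```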
And More Problems …

[Diagram: the same tree (~200:1 at the top), with servers split into IP subnet (VLAN) #1 and IP subnet (VLAN) #2]

• Resource fragmentation, significantly lowering cloud utilization (and cost-efficiency)

And More Problems …

[Diagram: the same tree with IP subnet (VLAN) #1 and IP subnet (VLAN) #2, annotated "Complicated manual L2/L3 re-configuration"]

• Server IP address assignments are topological
• Moving an IP address out of its containing VLAN is hard

What We Need is…

1. L2 semantics
2. Uniform high capacity
3. Performance isolation

Achieve Uniform High Capacity: Clos Network Topology*
*Ref: A Scalable, Commodity Data Center Network Architecture, Al-Fares et al, SIGCOMM 2008

[Diagram: three-tier Clos network. Intermediate (Int) switches at the top; K aggregation (Aggr) switches with D ports each in the middle; top-of-rack (TOR) switches with 20 servers each at the bottom; 20*(DK/4) servers in total]

• Large bisection BW
• Multiple paths at modest cost
• Tolerates fabric failure

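The server count 20*(DK/4) in the diagram can be checked with a few lines of arithmetic. The sketch below assumes one common reading of the formula: each of the K aggregation switches uses half of its D ports to face ToR switches, and each ToR has two uplinks, giving DK/4 ToRs. That wiring is an assumption made for illustration; the slide itself only gives the final expression.

```python
def clos_server_count(d_ports, k_aggr, servers_per_tor=20, tor_uplinks=2):
    """Servers supported by K aggregation switches with D ports each."""
    downlinks = k_aggr * (d_ports // 2)   # assumed: half of each switch faces ToRs
    tors = downlinks // tor_uplinks       # assumed: two uplinks per ToR
    return servers_per_tor * tors


# Hypothetical switch sizes, chosen only to show how capacity scales.
for d, k in [(24, 24), (48, 48), (144, 144)]:
    print(f"D={d}, K={k}: {clos_server_count(d, k):,} servers")
```

Because capacity grows with the product DK, adding ports and adding switches both scale the fabric.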
Addressing and Routing: Name-Location Separation*
*VL2: A Scalable and Flexible Data Center Network, Greenberg et al, SIGCOMM 2009

• Switches run link-state routing and maintain only switch-level topology
• Servers use flat names
• Directory Service (lookup & response) maps names to ToR switches: x → ToR2, y → ToR3, z → ToR4

[Diagram: switches ToR1 ... ToR4 above servers x, y and z; packets labeled "ToR3 | y | payload" and "ToR3 | z | payload"]

Addressing and Routing: Name-Location Separation* (continued)
*VL2: A Scalable and Flexible Data Center Network, Greenberg et al, SIGCOMM 2009

[Diagram: next build of the previous slide, with the same labels and directory entries; z is now drawn next to y]

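The name-location separation on these slides can be sketched as a directory that maps flat server names to ToR switches, plus an encapsulation step that prepends the looked-up ToR to the packet. The class and function names below are hypothetical, and the migration of z is an invented scenario; this is not the VL2 implementation.

```python
class DirectoryService:
    """Maps flat server names to the ToR switch they currently sit behind."""

    def __init__(self):
        self._location = {}

    def update(self, name, tor):
        self._location[name] = tor

    def lookup(self, name):
        return self._location[name]


def encapsulate(directory, dst_name, payload):
    """Return (ToR, name, payload): the outer header carries the location."""
    return (directory.lookup(dst_name), dst_name, payload)


directory = DirectoryService()
for name, tor in [("x", "ToR2"), ("y", "ToR3"), ("z", "ToR4")]:
    directory.update(name, tor)

before = encapsulate(directory, "z", b"payload")
directory.update("z", "ToR3")  # hypothetical: z moves to a rack behind ToR3
after = encapsulate(directory, "z", b"payload")
print(before, after)
```

The server's name stays "z" throughout; only the directory entry changes, which is what lets switches keep switch-level topology only.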
VL2 Fabric: Objectives and Solutions

Objective                    Approach                          Solution
1. Layer-2 semantics         Employ flat addressing            Name-location separation &
                                                               resolution service
2. Uniform high capacity     Guarantee bandwidth for           Clos based network,
   between servers           hose-model traffic                Valiant LB flow routing
3. Performance isolation     Enforce hose model using          TCP
                             existing mechanisms only

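Valiant load balancing, named in the table above, bounces each flow off a randomly chosen intermediate switch so that traffic spreads evenly regardless of the traffic matrix. The sketch below picks the intermediate by hashing the flow identifier, which keeps all packets of a flow on one path; the hashing scheme, the function name and the addresses are assumptions made for illustration, not the mechanism described in the talk.

```python
import hashlib


def pick_intermediate(flow, intermediates):
    """Choose one intermediate switch per flow, stable across its packets."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(intermediates)
    return intermediates[index]


# Hypothetical intermediate switches and a hypothetical 5-tuple flow.
intermediates = ["Int1", "Int2", "Int3", "Int4"]
flow = ("10.0.0.1", 40000, "10.0.1.7", 80, "tcp")

print(pick_intermediate(flow, intermediates))
```

Hashing per flow rather than per packet avoids reordering within a TCP connection, at the cost of coarser balancing when a few flows dominate.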
Protocol Interactions

TCP Incast Collapse: Problem

[Figure. Source: Nagle et al, The Panasas ActiveScale Storage Cluster – Delivering Scalable High Bandwidth Storage, SC2004]

Affects key datacenter applications with barrier synchronization boundaries, e.g. DFS, web search, MapReduce

New Cluster Based Storage System




Incast Application overfills Buffers




Solution: TCP with ms-RTO*
*Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communication, Vasudevan et al, SIGCOMM 2009

 • Little adverse effect on WAN traffic

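A back-of-the-envelope model shows why a millisecond-scale retransmission timeout (RTO) matters for synchronized reads. The sketch assumes that one timeout stalls the whole block transfer, a 1 Gbps client link, a 1 MB block, and a 200 ms coarse-grained minimum RTO; all four are illustrative assumptions, and the model ignores congestion-window dynamics.

```python
def incast_goodput_bps(block_bytes, link_bps, rto_s, timeouts=1):
    """Goodput of one synchronized block read that suffers `timeouts` RTO stalls."""
    block_bits = block_bytes * 8
    transfer_s = block_bits / link_bps          # time on the wire at line rate
    return block_bits / (transfer_s + timeouts * rto_s)


LINK_BPS = 1e9           # assumed 1 Gbps client link
BLOCK_BYTES = 1_000_000  # assumed 1 MB block striped across servers

coarse = incast_goodput_bps(BLOCK_BYTES, LINK_BPS, rto_s=0.200)
fine = incast_goodput_bps(BLOCK_BYTES, LINK_BPS, rto_s=0.001)
print(f"200 ms RTO: {coarse / 1e6:.1f} Mbps")
print(f"  1 ms RTO: {fine / 1e6:.1f} Mbps")
```

In this model the stall dominates whenever the RTO is much longer than the block's wire time, which is why shrinking the RTO toward datacenter round-trip times recovers most of the link rate.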
Incast Collapse: an unsolved problem at scale*
*Understanding TCP Incast Throughput Collapse in Datacenter Networks, Griffith et al, WREN 2009

Solution space is complex:
• Network conditions can impact RTT
• Switch buffer management strategies
• Goodput can be unstable with load/num. senders

Conclusions




Data Center Computing

• Opportunities to realize energy efficiency
  particularly in IO sub-systems

• Data Center fabrics need to be re-architected
  for application scalability and cost

• WAN artifacts can create bottlenecks


NOCs in the Data Center
• Energy Efficiency:
  Local (distributed) energy management decision
  & coordination by NOC

• Fabric communication:
  NOC can reduce intra-chip/socket communication
  latencies between VMs

• Congestion Mgt:
  NOC can assist in traffic orchestration across VMs

Thank you!




 
Infoboom future-storage-aug2011-v3
Infoboom future-storage-aug2011-v3Infoboom future-storage-aug2011-v3
Infoboom future-storage-aug2011-v3
 
An energy, memory, and performance analysis
An energy, memory, and performance analysisAn energy, memory, and performance analysis
An energy, memory, and performance analysis
 
Lego Cloud SAP Virtualization Week 2012
Lego Cloud SAP Virtualization Week 2012Lego Cloud SAP Virtualization Week 2012
Lego Cloud SAP Virtualization Week 2012
 
Multi core processors
Multi core processorsMulti core processors
Multi core processors
 
Lessons Learned: AMD’S Private Cloud
Lessons Learned: AMD’S Private CloudLessons Learned: AMD’S Private Cloud
Lessons Learned: AMD’S Private Cloud
 
Disaggregation a Primer: Optimizing design for Edge Cloud & Bare Metal applic...
Disaggregation a Primer: Optimizing design for Edge Cloud & Bare Metal applic...Disaggregation a Primer: Optimizing design for Edge Cloud & Bare Metal applic...
Disaggregation a Primer: Optimizing design for Edge Cloud & Bare Metal applic...
 
Exaflop In 2018 Hardware
Exaflop In 2018   HardwareExaflop In 2018   Hardware
Exaflop In 2018 Hardware
 
Sun Storage F5100 Flash Array, Redefining Storage Performance and Efficiency-...
Sun Storage F5100 Flash Array, Redefining Storage Performance and Efficiency-...Sun Storage F5100 Flash Array, Redefining Storage Performance and Efficiency-...
Sun Storage F5100 Flash Array, Redefining Storage Performance and Efficiency-...
 
Accelerate and Scale Big Data Analytics with Disaggregated Compute and Storage
Accelerate and Scale Big Data Analytics with Disaggregated Compute and StorageAccelerate and Scale Big Data Analytics with Disaggregated Compute and Storage
Accelerate and Scale Big Data Analytics with Disaggregated Compute and Storage
 
COMPARATIVE ANALYSIS OF SINGLE-CORE AND MULTI-CORE SYSTEMS
COMPARATIVE ANALYSIS OF SINGLE-CORE AND MULTI-CORE SYSTEMSCOMPARATIVE ANALYSIS OF SINGLE-CORE AND MULTI-CORE SYSTEMS
COMPARATIVE ANALYSIS OF SINGLE-CORE AND MULTI-CORE SYSTEMS
 
What is the future of disk drives?
What is the future of disk drives?What is the future of disk drives?
What is the future of disk drives?
 

Dernier

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 

Dernier (20)

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 

Data center computing trends a survey

  • 1. Datacenter Computing Trends and Problems: A Survey. Partha Kundu, Sr. Distinguished Engineer, Corporate CTO Office. Special Session, May 3, NOCS 2011, Pittsburgh, PA, USA.
  • 2. Data center computing is a new paradigm!
  • 3. Outline of talk: Power & Energy in Data Centers; Network architecture; Protocol interactions; Conclusions.
  • 4. Power & Energy in the Data Center
  • 5. Data Center Energy breakdown (Source: ASHRAE): Power delivery and cooling overheads are quantified in the PUE metric. Cooling is the most significant source of energy inefficiency. Server peak power usage profile (Source: Google 2007): CPU power contribution is less than 1/3 of server power.
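As a small illustration of the PUE metric named on slide 5: PUE is commonly defined as total facility energy divided by the energy delivered to IT equipment, so the share of facility energy lost to power delivery and cooling is 1 - 1/PUE. The sketch below assumes that definition; the function names and the sample energy figures are hypothetical and are not taken from the slides.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_fraction(pue_value: float) -> float:
    """Share of total facility energy that does not reach IT equipment."""
    return 1.0 - 1.0 / pue_value


if __name__ == "__main__":
    # Hypothetical figures, chosen only to make the arithmetic easy to follow.
    value = pue(total_facility_kwh=2000.0, it_equipment_kwh=1000.0)
    print(f"PUE = {value:.2f}, overhead share = {overhead_fraction(value):.0%}")
```

A PUE of 1.0 would mean every unit of energy reaches the IT equipment; larger values mean more delivery and cooling overhead.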
  • 6. Energy Efficiency (Source: Barroso, Holzle: Data Center as a Computer, Morgan Claypool (publishers), 2009): Servers are never completely idle. Most of the time server load is around 30%. But the server is least energy efficient in its most common operating region!
  • 7. Dynamic Power Range (Source: Barroso, Holzle: Data Center as a Computer, Morgan Claypool (publishers), 2009): CPU power component (peak & idle) in servers has reduced over the years. Dynamic power range: CPU power range is 3x for servers; DRAM range is 2x; disk and networking is < 1.2x. Disks and network switches need to learn from the CPU's power proportionality gains.
  • 8. Energy Proportionality. Goal: achieve best energy efficiency (~80%) in the common operating regions (20-30% load). Challenges to proportionality: most proportionality tricks in embedded/mobile devices are not usable in the DC due to huge activation penalties; the distributed structure of data and applications doesn't allow powering down during low use; disk drives spin >50% of the time even when there is no activity. [Sankar et al, ISCA '08]: smaller rotational speeds, multiple heads.
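The proportionality goal on slide 8 can be made concrete with a rough sketch. It assumes a linear power model (power rises linearly from an idle floor to peak as load goes from 0 to 1), which is a simplification and not something the slides state; the 50% idle floor used in the example is likewise an illustrative assumption. Efficiency is expressed relative to efficiency at full load.

```python
def relative_power(load: float, idle_fraction: float) -> float:
    """Power draw as a fraction of peak, under an assumed linear model."""
    return idle_fraction + (1.0 - idle_fraction) * load


def relative_efficiency(load: float, idle_fraction: float) -> float:
    """Work per unit energy, normalized so that full load gives 1.0."""
    return load / relative_power(load, idle_fraction)


def idle_fraction_for_target(load: float, target_efficiency: float) -> float:
    """Idle floor needed to hit a target efficiency at a given load.

    Solves load / (f + (1 - f) * load) = target_efficiency for f.
    """
    return load * (1.0 / target_efficiency - 1.0) / (1.0 - load)


if __name__ == "__main__":
    # Assumed idle floor of 50% of peak, evaluated at 30% load.
    print(f"efficiency at 30% load: {relative_efficiency(0.3, 0.5):.2f}")
    # Idle floor that would be needed for ~80% efficiency at 30% load.
    print(f"required idle floor: {idle_fraction_for_target(0.3, 0.8):.3f}")
```

Under these assumptions, reaching about 80% efficiency at 30% load needs an idle floor close to 10% of peak, which shows why components with a dynamic range below 1.2x are the bottleneck.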
  • 9. Application Behavior in Data Centers (Source: Kozyrakis et al, IEEE Micro 2010): Cosmos is similar to a data mining workload. Bing preloads the web index in memory, but peak disk bandwidth can be high. Significant variation in disk, memory and network capacity and bandwidth usage across apps.
  • 10. Dynamic Resource Requirements in the Data Center. [Figure: intra-server variation (TPC-H, log scale), server memory allocation from 0.1MB to 100GB per query Q1-Q12; inter-server variation (rendering farm) over time.] Huge variations even within a single application running in a large cluster.
  • 11. Motivating Disaggregated Memory*. [Figure: conventional blade systems, each blade with CPUs and local DIMMs, connected by a backplane.] *Lim et al: Disaggregated Memory for expansion and sharing in Blade Servers, ISCA 2009
  • 12. Disaggregated Memory*. [Figure: blade systems with disaggregated memory; compute blades with CPUs and fewer local DIMMs connect over the backplane to a memory blade of DIMMs.] Leverage fast, shared communication fabrics. Break CPU-memory co-location. *Lim et al: Disaggregated Memory for expansion and sharing in Blade Servers, ISCA 2009
  • 13. Disaggregated Memory*. [Figure: blade systems with disaggregated memory, as on slide 12.] Authors claim: 8x improvement on memory-constrained blade environments; 80+% improvement in performance per $; 3x consolidation. *Lim et al: Disaggregated Memory for expansion and sharing in Blade Servers, ISCA 2009
  • 14. Disaggregated Server. Servers with consolidated power supply, fabric connectivity, DRAM and disk drives. High Density, Low Power SeaMicro SM10000 Servers*: designed to replace 40 1 RU servers in a single 10 RU system; 512 1.66 GHz 64-bit x86 Intel Atom cores in 10 RU, 2,048 CPUs/rack; 1.28 Terabit interconnect fabric; up to 64 1 Gbps or 16 10 Gbps uplinks; 0-64 SATA SSDs/hard disks; integrated load balancing, Ethernet switching, and server management; uses less than 2.5 kW of power. Claim: achieves 4x space & power consolidation. *Source: SeaMicro URL http://www.seamicro.com/?q=node/102
  • 15. Network Architecture
  • 16. Requirements of a Cloud-enabled Data Center. Economic & technical motivations: use commodity hardware & components; dynamically distribute compute resources. Goals: economies of scale; capacity re-allocation.
  • 17. Status Quo: Conventional DC Network. [Figure: Internet at the top; core routers (CR) and access routers (AR) at DC Layer 3; Ethernet switches (S) at DC Layer 2; racks of app servers (A) below; ~1,000 servers/pod == IP subnet.] Key: CR = Core Router (L3); AR = Access Router (L3); S = Ethernet Switch (L2); A = Rack of app. servers. Ref: "Data Center: Load balancing Data Center Services", Cisco 2004
  • 18. Conventional DC Network Problems. [Figure: the CR / AR / S tree annotated with ratios of ~5:1 and ~40:1 at the switch (S) layers and ~200:1 between the AR and CR layers.] Cost of network equipment is prohibitive. Limited server-to-server capacity.
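To see why the ratios on slide 18 imply limited server-to-server capacity, the sketch below reads the ~5:1, ~40:1 and ~200:1 labels as oversubscription ratios (that reading, the 1 Gbps server NIC, and the function name are assumptions made for illustration; the slide gives only the ratios). With an oversubscription of N:1, a server can count on only 1/N of its NIC rate when all servers send across that layer at once.

```python
def worst_case_mbps(nic_mbps: float, oversubscription: float) -> float:
    """Per-server bandwidth when every server sends across an N:1 layer."""
    if oversubscription < 1:
        raise ValueError("oversubscription ratio must be at least 1")
    return nic_mbps / oversubscription


if __name__ == "__main__":
    NIC_MBPS = 1000.0  # assumed 1 Gbps server NIC, not stated on the slide
    for ratio in (5, 40, 200):
        share = worst_case_mbps(NIC_MBPS, ratio)
        print(f"{ratio:>3}:1 -> {share:6.1f} Mbps per server")
```

Under these assumptions, traffic that has to cross the top of the tree is left with a few Mbps per server, which is the capacity problem the slide points at.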
  • 19. And More Problems … [Figure: the same CR / AR / S tree (~200:1), with servers split into IP subnet (VLAN) #1 and IP subnet (VLAN) #2.] Resource fragmentation, significantly lowering cloud utilization (and cost-efficiency).
  • 20. And More Problems … [Figure: the same tree with IP subnet (VLAN) #1 and IP subnet (VLAN) #2, annotated "Complicated manual L2/L3 re-configuration".] Server IP address assignments are topological. IP movement from a contained VLAN is hard.
  • 21. What We Need is… 1. L2 semantics. 2. Uniform high capacity. 3. Performance isolation.
  • 22. Achieve Uniform High Capacity: Clos Network Topology*. [Figure: intermediate (Int) switches at the top; K aggregation (Aggr) switches with D ports; ToR switches with 20 servers each; 20*(DK/4) servers in total.] Large bisection BW. Multiple paths at modest cost. Tolerates fabric failure. *Ref: A Scalable, Commodity, Data Center architecture, Al-Fares et al, SIGCOMM 2008
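The server count 20*(DK/4) on slide 22 can be reproduced with a short sketch. It assumes one reading that yields the slide's formula: half of each aggregation switch's D ports face down toward the ToR layer, and each ToR switch has two uplinks. Those two assumptions and the function names are not stated on the slide.

```python
def clos_tor_count(d_ports: int, k_aggr: int, tor_uplinks: int = 2) -> int:
    """Number of ToR switches supported by K aggregation switches with D ports.

    Assumes half of each aggregation switch's ports face the ToR layer
    and each ToR uses `tor_uplinks` of those ports.
    """
    if d_ports % 2 != 0:
        raise ValueError("port count is assumed to be even")
    down_ports = k_aggr * (d_ports // 2)
    return down_ports // tor_uplinks


def clos_server_count(d_ports: int, k_aggr: int, servers_per_tor: int = 20) -> int:
    """Total servers: servers per ToR times the number of ToR switches."""
    return servers_per_tor * clos_tor_count(d_ports, k_aggr)


if __name__ == "__main__":
    # Example sizes chosen for illustration only.
    for d, k in ((24, 24), (48, 48), (144, 144)):
        print(f"D={d:>3}, K={k:>3}: {clos_server_count(d, k):>7} servers")
```

With the default of two uplinks per ToR, `clos_server_count(d, k)` equals 20 * (d * k / 4), matching the expression on the slide.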
  • 23. Addressing and Routing: Name-Location Separation*. Switches run link-state routing and maintain only switch-level topology. Servers use flat names. [Figure: a directory service holds the mappings x → ToR2, y → ToR3, z → ToR4; servers x, y, z sit under ToR1 ... ToR4; after a "Lookup & Response" exchange with the directory, packets are shown as "ToR3 | y | payload" and "ToR3 | z | payload".] *VL2: A Scalable and Flexible Data Center Network, Greenberg et al, SIGCOMM 2009
  • 24. Addressing and Routing: Name-Location Separation*. Switches run link-state routing and maintain only switch-level topology. Servers use flat names. [Figure: next animation step of slide 23, with the same directory mappings and packets.] *VL2: A Scalable and Flexible Data Center Network, Greenberg et al, SIGCOMM 2009
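The name-location separation on slides 23 and 24 can be sketched as a lookup followed by encapsulation: a sender asks a directory for the ToR switch currently hosting a flat server name, then wraps the packet with that ToR as the outer destination. The `Directory` and `Packet` classes below are hypothetical illustrations, not the VL2 implementation, and the move of `z` to ToR3 is only one possible reading of the "ToR3 | z | payload" packet in the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    tor: str        # outer destination: the ToR switch (location)
    dest: str       # inner destination: the flat server name
    payload: bytes


class Directory:
    """Toy directory service mapping flat server names to ToR switches."""

    def __init__(self) -> None:
        self._location: dict[str, str] = {}

    def update(self, name: str, tor: str) -> None:
        self._location[name] = tor

    def lookup(self, name: str) -> str:
        return self._location[name]


def encapsulate(directory: Directory, dest: str, payload: bytes) -> Packet:
    """Resolve the destination's current ToR and wrap the payload."""
    return Packet(tor=directory.lookup(dest), dest=dest, payload=payload)


if __name__ == "__main__":
    directory = Directory()
    for name, tor in (("x", "ToR2"), ("y", "ToR3"), ("z", "ToR4")):
        directory.update(name, tor)

    print(encapsulate(directory, "y", b"payload"))
    # If z moves, only the directory entry changes; z keeps its flat name.
    directory.update("z", "ToR3")
    print(encapsulate(directory, "z", b"payload"))
```

The design point is that switches never need to learn server names: they route only on ToR addresses, and mobility is handled by updating one directory entry.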
  • 25. VL2 Fabric Objectives and Solutions
      Objective | Approach | Solution
      1. Layer-2 semantics | Employ flat addressing | Name-location separation & resolution service
      2. Uniform high capacity between servers | Guarantee bandwidth for hose-model traffic | Clos based network, Valiant LB flow routing
      3. Performance isolation | Enforce hose model using existing mechanisms only | TCP
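Valiant load balancing, named in the table on slide 25, sends each flow through an intermediate switch chosen independently of the destination, so that load spreads evenly over the Clos fabric. The sketch below is a simplified stand-in rather than the VL2 mechanism: it picks the intermediate by hashing the flow's identifying tuple, which keeps all packets of one flow on the same path. The tuple layout, switch names and function name are assumptions for illustration.

```python
import hashlib
from typing import Sequence, Tuple

# Hypothetical flow identifier: (src, dst, src_port, dst_port, protocol).
Flow = Tuple[str, str, int, int, str]


def pick_intermediate(flow: Flow, intermediates: Sequence[str]) -> str:
    """Choose an intermediate switch for a flow by hashing its identifier.

    The hash acts as a pseudo-random but repeatable choice, so packets of
    the same flow always take the same intermediate switch.
    """
    if not intermediates:
        raise ValueError("at least one intermediate switch is required")
    key = "|".join(str(field) for field in flow).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(intermediates)
    return intermediates[index]


if __name__ == "__main__":
    switches = ["Int1", "Int2", "Int3", "Int4"]
    for port in range(5000, 5004):
        flow = ("10.0.0.1", "10.0.1.7", port, 80, "tcp")
        print(port, "->", pick_intermediate(flow, switches))
```

Choosing per flow rather than per packet avoids reordering within a TCP connection, at the cost of coarser balancing when a few flows are very large.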
  • 26. Protocol Interactions
  • 27. TCP Incast Collapse: Problem. Affects key datacenter applications with barrier synchronization boundaries, e.g. DFS, web search, MapReduce. Source: Nagle et al, The Panasas ActiveScale Storage Cluster – Delivering Scalable High Bandwidth Storage, SC2004
  • 28. [No text on this slide]
  • 29. New Cluster Based Storage System
  • 30. Incast: Application overfills Buffers
  • 31. Solution: TCP with ms-RTO*. Little adverse effect on WAN traffic. *Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communication, Vasudevan et al, SIGCOMM 2009
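A back-of-the-envelope sketch shows why a millisecond-scale retransmission timeout (RTO) helps with incast on slide 31. It assumes a deliberately crude model: a synchronized block transfer takes its ideal wire time plus one full RTO of idle waiting for each timeout. The block size, link speed, timeout count and function name are illustrative assumptions; the 200 ms value is the conventional default minimum RTO in Linux TCP.

```python
def goodput_mbps(block_bytes: int, link_mbps: float, timeouts: int, rto_s: float) -> float:
    """Goodput of one block transfer that stalls for `timeouts` full RTOs.

    Crude model: total time = ideal transfer time + timeouts * RTO.
    """
    bits = block_bytes * 8
    transfer_s = bits / (link_mbps * 1e6)
    total_s = transfer_s + timeouts * rto_s
    return bits / total_s / 1e6


if __name__ == "__main__":
    BLOCK = 1_000_000   # assumed 1 MB block
    LINK = 1000.0       # assumed 1 Gbps link
    for label, rto in (("200 ms RTO", 0.200), ("1 ms RTO", 0.001)):
        rate = goodput_mbps(BLOCK, LINK, timeouts=1, rto_s=rto)
        print(f"{label:>10}: {rate:7.1f} Mbps")
```

In this model a single 200 ms timeout dominates an 8 ms transfer, while a 1 ms timeout costs only a small fraction of the transfer time.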
  • 32. Incast Collapse: an unsolved problem at scale*. Solution space is complex: network conditions can impact RTT; switch buffer management strategies; goodput can be unstable with load/number of senders. *Understanding TCP Incast Throughput Collapse in Datacenter Networks, Griffith et al, WREN 2009
  • 33. Conclusions
  • 34. Data Center Computing. Opportunities to realize energy efficiency, particularly in I/O sub-systems. Data center fabrics need to be re-architected for application scalability and cost. WAN artifacts can create bottlenecks.
  • 35. NOCs in the Data Center. Energy efficiency: local (distributed) energy management decision & coordination by NOC. Fabric communication: NOC can reduce intra-chip/socket communication latencies between VMs. Congestion management: NOC can assist in traffic orchestration across VMs.
  • 36. Thank you!