Data Center Network Research: Opportunities and Challenges

       Chuanxiong Guo

   Microsoft Research Asia (MSRA)
      2011.04.15
                    1
Outline
•   DCN background
•   Opportunities
•   Research challenges
•   A modular DCN design




                             2
Background: personal experience
• Bandwidth is a scarce resource
 Network   Memory    Disk      CPU                   Year

 10Mb/s    2MB       10MB      386/20MHz             1994
 100Mb/s   128MB     2GB       Pentium II/233MHz     1998
 100Mb/s   256MB     40GB      Pentium III/800MHz    2002
 1Gb/s     2GB       160GB     Core 2/2GHz           2007
 1Gb/s     4GB       500GB     Core 2 Quad/3GHz      2011

 x100      x2000     x50000    x150 x4               17 years

 Memory: x2000, but access is slow
 CPU: x150 x4, but with multi-core and instruction-level progress
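
The growth factors in the last row can be checked with a little arithmetic. The sketch below assumes decimal unit prefixes (1 GB = 1000 MB) and reads the 1994 CPU entry as a 20 MHz 386; both are assumptions made for illustration.

```python
# Growth between the 1994 and 2011 rows of the table.
# Assumptions: decimal prefixes, and a 20 MHz clock for the 1994 CPU.
first = {"network_mbps": 10, "memory_mb": 2, "disk_mb": 10, "cpu_mhz": 20, "cores": 1}
last = {"network_mbps": 1000, "memory_mb": 4000, "disk_mb": 500_000, "cpu_mhz": 3000, "cores": 4}

# Ratio of the 2011 value to the 1994 value for every resource.
growth = {name: last[name] / first[name] for name in first}

for name, factor in growth.items():
    print(f"{name}: x{factor:g}")
```

Network bandwidth grows by two orders of magnitude over the period, while disk capacity grows by almost five, which is the imbalance the slide points at.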

                                                                5
Background: technology trends
– Disk is cheap (TB and PB are common)
   • 500RMB for 1TB
– Memory is cheap (32GB per PC is not uncommon)
   • 150RMB for 2GB DRAM
– CPU is powerful yet inexpensive (multi-core)
   • 2000RMB for Intel Core i7 with 4 cores
– But network bandwidth is a scarce resource
   • Intra-DC: replication everywhere for fault tolerance
   • Inter-DC: input and output need bandwidth
   • $50 per 1G port, $500 per 10G port
– $0.10 = 1GB of bandwidth = 1 CPU hour = 1GB of storage per
  month
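
Taking the slide's rule of thumb at face value, a short calculation shows how the three resources compare for one job. The job sizes below are made-up inputs for illustration, not figures from the talk.

```python
# Rule of thumb from the slide: 0.10 USD buys 1 GB of transfer, 1 CPU hour,
# or 1 GB of storage for one month.
UNIT_PRICE_USD = 0.10

def job_cost(transfer_gb, cpu_hours, storage_gb_months):
    """Return the cost in USD of each resource under the rule of thumb."""
    return {
        "bandwidth": transfer_gb * UNIT_PRICE_USD,
        "cpu": cpu_hours * UNIT_PRICE_USD,
        "storage": storage_gb_months * UNIT_PRICE_USD,
    }

# Hypothetical job: move 1 TB, compute for 100 CPU hours, keep 1 TB for a month.
cost = job_cost(transfer_gb=1000, cpu_hours=100, storage_gb_months=1000)
print(cost)
```

Under this rule, moving a terabyte once costs as much as storing it for a month, and ten times as much as the 100 CPU hours in this hypothetical job.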
                                                            6
DCN building blocks




Server   Rack   Container   Data Center   7
DCN reference design
              •   Does not scale
              •   Low bandwidth
              •   Single point of failure
              •   High cost




                                       8
Outline
•   DCN background
•   Opportunities
•   Research challenges
•   A modular DCN design




                             9
Right time for DCN research
• It is a real problem
• It is an important problem
  – DCN as the infrastructure for cloud computing
• The assumptions are different
  – Data centers are owned by a single organization
  – We can innovate at both end-hosts and network
    devices
  – Security is easier (closed environment and trusted
    people)

                                                     10
DCN research: opportunities
• Full of research problems
  – Scalability: tens of thousands to millions of servers
  – Performance
  – Fault tolerance
  – Cost saving
  – Feel free to suggest new “TCP” protocols
• You can invent your own DCN!


                                                         11
Outline
•   DCN background
•   Opportunities
•   Research challenges
•   A modular DCN design




                             12
Research challenges
Applications                       Architectures

•   Search                         •   Topology design
•   Distributed execution engine   •   Network virtualization
•   Distributed file systems       •   Electrical/optical switching
•   Online social networking       •   Commodity vs. special system
•   HPC applications



Technologies                       Protocols

• DCN management                   • DCN routing
• DCN platform                     • TCP incast congestion control
• Energy efficiency                • Multicast




                                                                      13
Architecture design
•   Scaling: from thousands to millions of servers
•   High capacity: support various traffic patterns
•   Fault tolerance
•   Cost efficient
•   Easy to deploy and manage




                                                      14
Fat-tree (ucsd-sigcomm08)
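
As a reminder of how this topology scales: a k-ary fat-tree built from identical k-port switches has k pods, each with k/2 edge and k/2 aggregation switches, plus (k/2)^2 core switches, and supports k^3/4 servers. The helper below is an illustrative sketch; the function name is ours and does not come from the talk.

```python
def fat_tree_size(k):
    """Element counts for a k-ary fat-tree built from k-port switches (k even)."""
    if k < 2 or k % 2:
        raise ValueError("k must be an even number >= 2")
    half = k // 2
    edge = k * half          # k pods, k/2 edge switches each
    aggregation = k * half   # k pods, k/2 aggregation switches each
    core = half * half       # (k/2)^2 core switches
    servers = edge * half    # each edge switch serves k/2 servers
    return {"servers": servers, "edge": edge, "aggregation": aggregation,
            "core": core, "switches": edge + aggregation + core}

for k in (4, 24, 48):
    print(k, fat_tree_size(k))
```

With 48-port switches this gives 27,648 servers from 2,880 switches.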




                            15
VL2 (msrr-sigcomm09)

               OSPF+ECMP


                           10G


                           10G

                           1G
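
ECMP spreads traffic over equal-cost paths, commonly by hashing a flow's 5-tuple so that all packets of one flow follow the same path. The sketch below illustrates only that idea; the hash function, the tuple layout, and the uplink names are assumptions, and real switches use their own hardware hash functions.

```python
import hashlib

def ecmp_next_hop(flow, next_hops):
    """Pick one of several equal-cost next hops from a hash of the 5-tuple."""
    src_ip, dst_ip, proto, src_port, dst_port = flow
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 4 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]

# Hypothetical uplinks and flow, for illustration only.
uplinks = ["agg-1", "agg-2", "agg-3", "agg-4"]
flow = ("10.0.0.1", "10.0.1.9", "tcp", 40000, 80)
print(ecmp_next_hop(flow, uplinks))
```

Because the choice depends only on the 5-tuple, packets of one flow stay in order on one path, while different flows are spread across the uplinks.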




                                16
DCell/BCube (msra-sigcomm08, 09)

             • Put intelligence at servers
             • Use Ethernet switches as crossbar
             • Innovations in topology design and routing




  DCell                          BCube
                                                      17
Architecture: optical/electrical switching
(ucsd-sigcomm10, rice-sigcomm10)
                    • A hybrid architecture
                       • Optical circuit switching
                       • Electrical packet switching




                                              18
Protocols: TCP incast congestion control

[Figure: senders S1, S2, ..., Sn and receiver R]


cmu-sigcomm09, msra-conext10
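
A deliberately simplified model helps show why incast hurts: when many senders answer one receiver at the same moment, their combined bursts can exceed the buffer of the switch port facing the receiver. The sketch below is a toy model with made-up numbers, not the analysis from the cited papers; it assumes every sender emits a full window at once and the buffer does not drain during the burst.

```python
def dropped_packets(num_senders, window_pkts, buffer_pkts):
    """Packets that overflow the bottleneck buffer in one synchronized burst.

    Toy model: every sender emits a full window at the same instant and the
    buffer does not drain during the burst.
    """
    offered = num_senders * window_pkts
    return max(0, offered - buffer_pkts)

# Hypothetical numbers: 10-packet windows and a 64-packet port buffer.
for n in (4, 8, 16, 32):
    print(n, dropped_packets(n, window_pkts=10, buffer_pkts=64))
```

In this model, loss starts once the number of senders times the window exceeds the buffer, so the problem grows with the fan-in of a request.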


                                       19
Technologies: research platform
• A DCN research platform
  – High performance: comparable to ASIC
  – Easy to program: comparable to commodity server
  – Rich functions
     • Programmable packet forwarding
     • Experiment with various control/management functions
     • Implement various routing/congestion control designs
• ServerSwitch (msra-nsdi11)
                                                          20
Applications
• A unified network for both data center and
  HPC applications?
                      Data center                HPC
Topology              Tree-based                 Torus/mesh, fat-tree
Routing               Single path routing        Deterministic routing
                      L2 spanning tree           Per-packet adaptive
                      L3 shortest path routing   routing to exploit path
                                                 diversity
Flow control          Packets can be dropped     No packet drop
                      End-to-end                 Hop by hop
Application support   Search, e-commerce,        Scientific applications
                      cloud computing
Programming API       TCP/IP socket              MPI/RDMA
                                                                           21
Outline
•   DCN background
•   Opportunities
•   Research challenges
•   A modular DCN design




                             22
Team
• Chuanxiong Guo, Guohan Lu, Haitao Wu,
  Yongqiang Xiong
• Interns: Zhiqiang Zhou, Jiaxin Cao, Jiabo Ju, Qin
  Jia, Jun Li
• Alumni
  – members: Songwu Lu, Dan Li
  – interns: Lei Shi, Yunfeng Shi, Danfeng Zhang, Xuan Zhang,
    Byunchul Park, Nan Hua, Chen Tian, Min-Chen Zhao, Chao
    Kong, Kai Chen, Wenfei Wu, Shuang Yang, Peng Su, Bruce
    Chen, Zhenqian Feng, Min-Jeong Shi, Yibo Zhu…
                                                                23
Modular, mega-data center networking




                            24
Modular, mega-data center networking

BCube       BCube        BCube


BCube      MDCube        BCube


BCube       BCube        BCube
                                 25
BCube: Server centric network
BCube1


      <1,0>               <1,1>               <1,2>               <1,3>



BCube0
      <0,0>               <0,1>               <0,2>               <0,3>



 00   01   02   03   10   11   12   13   20   21   22   23   30   31   32        33
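
The wiring in the figure can be reproduced in a few lines. In this sketch a server address is a tuple of digits (a1, a0), and a level-l switch connects the servers that differ only in digit l, so switch <0,0> connects 00, 01, 02, 03 and switch <1,0> connects 00, 10, 20, 30. The function name and data layout are illustrative assumptions.

```python
from itertools import product

def bcube_wiring(n, k):
    """Map each switch (level, index digits) to the servers attached to it.

    Servers are (k+1)-digit base-n addresses, most significant digit first.
    A server attaches to the level-l switch named by its address with
    digit l removed.
    """
    wiring = {}
    for addr in product(range(n), repeat=k + 1):
        for level in range(k + 1):
            pos = k - level  # position of digit `level` inside the tuple
            switch = (level, addr[:pos] + addr[pos + 1:])
            wiring.setdefault(switch, []).append(addr)
    return wiring

wiring = bcube_wiring(n=4, k=1)
for (level, index), servers in sorted(wiring.items()):
    name = "".join(map(str, index))
    print(f"<{level},{name}>:", ["".join(map(str, s)) for s in servers])
```

Every server appears under exactly one switch per level, which is why each server needs k+1 network ports.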




                                                                            26
2-D MDCube
             MDCube structure




                                27
Problem: servers for packet forwarding?
BCube1


      <1,0>                <1,1>               <1,2>               <1,3>



BCube0
      <0,0>                <0,1>               <0,2>               <0,3>



 00   01    02   03   10   11   12   13   20   21   22   23   30   31   32        33



                                      Forwarding node
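
Servers have to forward packets because two servers whose addresses differ in more than one digit share no switch, so traffic between them must pass through an intermediate server. A minimal sketch of digit-correcting routing, in the spirit of BCube's single-path routing, is shown below; the function name and the fixed most-significant-digit-first order are assumptions for illustration.

```python
def bcube_route(src, dst):
    """Return the list of servers visited from src to dst.

    Addresses are equal-length digit tuples. Each hop fixes one differing
    digit, which corresponds to crossing one switch.
    """
    if len(src) != len(dst):
        raise ValueError("addresses must have the same length")
    path = [tuple(src)]
    current = list(src)
    for pos in range(len(src)):  # most significant digit first
        if current[pos] != dst[pos]:
            current[pos] = dst[pos]
            path.append(tuple(current))
    return path

path = bcube_route((0, 0), (1, 3))
print(path)  # [(0, 0), (1, 0), (1, 3)]: server 10 is the forwarding node
```

Every server strictly between the first and last entry of the path relays traffic for others, which is the load that motivates offloading forwarding from the server CPU.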
                                                                             28
Solution: ServerSwitch

Software
   • Full programmability at server CPU
      – Kernel module for low latency processing
      – User space for easy-to-use programmability

PCI-E
   • Low latency and high throughput interconnection

Hardware
   • Packet forwarding in commodity switching ASIC
      – High performance and limited programmability
                                                           29
Testbed
• A BCube testbed
  – 16 servers (Dell Precision 490 workstations, each with an
    Intel 2.00GHz dual-core CPU, 4GB DRAM, 160GB disk)
  – Eight 8-port mini-switches (D-Link DGS-1008D 8-port
    Gigabit switch)
• NIC
  – Intel Pro/1000 PT quad-port Ethernet NIC
  – NetFPGA
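
The testbed size is consistent with a two-level BCube. Assuming the standard BCube counts of n^(k+1) servers and (k+1)·n^k switches, where n is the number of servers per lowest-level switch and the levels run from 0 to k, a BCube1 with n = 4 needs exactly 16 servers and 8 switches, with two network ports used per server. That this is the testbed's configuration is an inference from the numbers on the slide.

```python
def bcube_size(n, k):
    """Server, switch, and per-server port counts for a BCube with levels 0..k."""
    return {"servers": n ** (k + 1),
            "switches": (k + 1) * n ** k,
            "ports_per_server": k + 1}

print(bcube_size(n=4, k=1))
```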
                                                      30
Summary
• DCN is an area full of opportunities and
  challenges
• The best is yet to come!
• Further information
  • http://research.microsoft.com/en-us/projects/msradcn/default.aspx




                                             31
