DevoFlow: Scaling Flow
Management for High-Performance
          Networks
   Andrew R. Curtis (University of Waterloo); Jeffrey C.
  Mogul, Jean Tourrilhes, Praveen Yalagandula, Puneet
  Sharma, Sujata Banerjee (HP Labs), SIGCOMM 2011




  Presenter: Jason, Tsung-Cheng, HOU
  Advisor: Wanjiun Liao
                                         Mar. 22nd, 2012   1
Motivation
• SDN / OpenFlow can enable per-flow
  management… However…
• What are the costs and limitations?
• Network-wide logical graph
  = always collecting all flows’ stat.s?
• Any more problems beyond controller’s
  scalability?
• Does enhancing controller performance /
  scalability solve all problems?
                                           2
DevoFlow Contributions
• Characterize overheads of implementing
  OpenFlow on switches
• Evaluate flow mgmt capability within data
  center network environment
• Propose DevoFlow to enable scalable flow
  mgmt by balancing
  – Network control
  – Statistics collection
  – Overheads
  – Switch functions and controller loads
                                              3
Agenda
•   OF Benefits, Bottlenecks, and Dilemmas
•   Evaluation of Overheads
•   DevoFlow
•   Simulation Results




                                             4
Benefits
•   Flexible policies w/o switch-by-switch config.
•   Network graph and visibility, stat.s collection
•   Enable traffic engineering and network mgmt
•   OpenFlow switches are relatively simple
•   Accelerate innovation:
    – VL2, PortLand: new architecture, virtualized addr
    – Hedera: flow scheduling
    – ElasticTree: energy-proportional networking
• However, no thorough estimate of the overheads
                                                     5
Bottlenecks
• Root cause: excessive coupling of
  – central control and complete visibility
• Controller bottleneck: scale by dist. sys.
• Switch bottleneck:
  – Data- to control-plane: limited BW
  – Enormous flow tables, too many entries
  – Control and stat.s pkts compete for BW
  – Introduce extra delays and latencies
• Switch bottleneck was not well studied
                                               6
Dilemma
• Control dilemma:
  – Role of controller: visibility and mgmt capability
    however, per-flow setup too costly
  – Wildcard or hash-based flow matching:
    much less load, but no effective control
• Statistics-gathering dilemma:
  – Pull-based mechanism: counters of all flows
    full visibility but demand high BW
  – Wildcard counter aggregation: far fewer entries,
    but loses track of elephant flows
• Aim to strike a balance in between                      7
Main Concept of DevoFlow
•   Devolving most flow controls to switches
•   Maintain partial visibility
•   Keep track of significant flows
•   Default vs. special actions:
    – Security-sensitive flows: categorically inspect
    – Normal flows: may evolve into, or cover, flows that
      become security-sensitive or significant
    – Significant flows: special attention
• Collect stat.s by sampling, triggering, and
  approximating
                                                        8
Design Principles of DevoFlow
• Try to stay in data-plane, by default
• Provide enough visibility:
  – Esp. for significant flows & sec-sensitive flows
  – Otherwise, aggregate or approximate stat.s
• Maintain simplicity of switches




                                                       9
Agenda
•   OF Benefits, Bottlenecks, and Dilemmas
•   Evaluation of Overheads
•   DevoFlow
•   Simulation Results




                                             10
Overheads: Control PKTs
                               An N-switch path



For a path with N switches: N+1 control pkts
• First flow pkt to controller
• N control messages to N switches
Average flow length (1997 measurement): 20 pkts
In a Clos / fat-tree DCN topology: 5 switches
→ 6 control pkts per flow
→ The smaller the flow, the higher the relative BW cost
   (see the quick calculation below)
                                                 11
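
As a quick sanity check, a minimal Python sketch of how much of a short flow ends up as control traffic; it uses only the figures quoted on this slide, not new measurements:

# Control-packet overhead of reactive per-flow setup (figures from the slide).
path_switches = 5            # switches on a Clos / fat-tree path
flow_pkts = 20               # average flow length from the 1997 measurement

control_pkts = path_switches + 1     # 1 packet-in + N flow-mod messages
print(f"{control_pkts} control pkts for a {flow_pkts}-pkt flow "
      f"= {control_pkts / flow_pkts:.0%} extra packets")   # 6 pkts, ~30%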
Overheads: Flow Setup
• Switch w/ finite BW between data / control
  plane, i.e. overheads between ASIC and CPU
• Setup capability: 275~300 flows/sec
• Similar to the rate reported in [30]
• In data centers: mean flow interarrival ≈ 30 ms per server
• Rack w/ 40 servers → ~1,300 flows/sec (see calculation below)
• In whole data center

      [43] R. Sherwood, G. Gibb, K.-K. Yap, G. Appenzeller, M. Casado,
      N. McKeown, and G. Parulkar. Can the production network be
      the testbed? In OSDI, 2010.                                        12
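
The mismatch can be made concrete with a small sketch, again using only the slide's figures (40 servers per rack, ~30 ms mean flow interarrival per server, ~300 setups/sec per switch):

# New-flow arrival rate of a rack vs. one switch's reactive setup capacity.
servers_per_rack = 40
mean_interarrival_s = 0.030          # ~30 ms between new flows per server [30]
setup_capacity = 300                 # upper end of the measured 275~300 setups/sec

rack_flow_rate = servers_per_rack / mean_interarrival_s   # ≈ 1,333 flows/sec
print(f"rack ≈ {rack_flow_rate:.0f} new flows/sec, "
      f"≈ {rack_flow_rate / setup_capacity:.1f}x the switch's setup capacity")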
Overheads: Flow Setup
Experiment: a single switch




                              13
Overheads: Flow Setup

ASIC switching rate
Latency: ~5 µs




                                   14
Overheads: Flow Setup

ASIC → CPU
Latency: 0.5 ms




                                  15
Overheads: Flow Setup
CPU → Controller
Latency: 2 ms
A huge waste
of resources!




                                16
Overheads: Gathering Stat.s
•   [30]: most of the longest-lived flows last only a few sec
•   Counters: (pkts, bytes, duration)
•   Push-based: to controller when flow ends
•   Pull-based: fetch actively by controller
•   88F bytes for F flows
•   In the 5406zl switch:
    Entries: 1.5K wildcard-match / 13K exact-match
    → total ≈ 1.3 MB, at 2 fetches/sec ≈ 17 Mbps
       (see the calculation below)
    → Not fast enough, and consumes a lot of BW!
         [30] S. Kandula, S. Sengupta, A. Greenberg, and P. Patel. The
          Nature of Datacenter Traffic: Measurements & Analysis. In
          Proc. IMC, 2009.                                                17
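
A rough reconstruction of this bandwidth figure in Python; it uses the 88-bytes-per-entry size and the entry counts from the slide, so it lands in the same ballpark as the ~17 Mbps quoted above (the paper's exact accounting differs slightly):

# Bandwidth needed to pull all per-flow counters from one 5406zl-class switch.
bytes_per_entry = 88
entries = 1_500 + 13_000             # wildcard-match + exact-match entries
fetches_per_sec = 2

bytes_per_fetch = bytes_per_entry * entries                # ≈ 1.3 MB
mbps = bytes_per_fetch * fetches_per_sec * 8 / 1e6         # ≈ 20 Mbps
print(f"≈ {bytes_per_fetch / 1e6:.1f} MB per fetch, ≈ {mbps:.0f} Mbps of stats traffic")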
Overheads: Gathering Stat.s

 2.5 sec to pull 13K entries
 1 sec to pull 5,600 entries
 0.5 sec to pull 3,200 entries




                                 18
Overheads: Gathering Stat.s
• Per-flow setup generates too many entries
• The more entries the controller fetches → the longer each fetch takes
• The longer the fetch → the longer the control loop
• In Hedera: 5-sec control loop,
  BUT the workload is too idealized (Pareto distribution)
• With the VL2 workload, a 5-sec loop improves only 1~5%
  over ECMP
• Per [41], the control loop must be under 0.5 sec to do better
       [41] C. Raiciu, C. Pluntke, S. Barre, A. Greenhalgh, D. Wischik,
       and M. Handley. Data center networking with multipath TCP.
        In HotNets, 2010.                                                 19
Overheads: Competition
• Flow setups and stat-pulling compete for BW
• Scheduling needs timely stat.s
• Switch flow entries
  – OpenFlow wildcard rules use TCAMs: lots of
    power & space
  – Rules: 10 header fields, 288 bits each
  – vs. only 60 bits for a trad. Ethernet entry
• Per-flow entries vs. per-host entries


                                                  20
Overheads: Competition




                         21
Agenda
•   OF Benefits, Bottlenecks, and Dilemmas
•   Evaluation of Overheads
•   DevoFlow
•   Simulation Results




                                             22
Mechanisms
• Control
  – Rule cloning
  – Local actions
• Statistics-gathering
  – Sampling
  – Triggers and reports
  – Approximate counters
• Flow scheduler: like Hedera
•   Multipath routing: based on a probability dist.
    → enables oblivious routing
                                                  23
Rule Cloning
• ASIC clones a wildcard rule into an exact-
  match rule for each new microflow (see sketch below)
• Cloned rule: its own timeout; output port may be
  chosen by probability (multipath)




                                            24
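
A minimal Python sketch of the rule-cloning idea. The data structures and the CLONE flag handling here are hypothetical illustrations; in DevoFlow the mechanism lives in the switch ASIC, not in software:

import time

class Rule:
    def __init__(self, match, actions, clone=False, timeout=None):
        self.match = match        # header-field dict; a value of None = wildcard
        self.actions = actions    # e.g. forwarding decisions
        self.clone = clone        # CLONE flag: spawn exact-match microflow rules
        self.timeout = timeout

def lookup(flow_table, pkt):
    """Return the first rule whose (possibly wildcarded) match covers pkt."""
    for rule in flow_table:
        if all(v is None or pkt.get(k) == v for k, v in rule.match.items()):
            return rule
    return None

def handle_packet(flow_table, pkt):
    rule = lookup(flow_table, pkt)
    if rule is not None and rule.clone:
        # Devolved control: install an exact-match copy for this microflow
        # locally, without involving the controller.
        micro = Rule(match=dict(pkt),              # exact match on all header fields
                     actions=list(rule.actions),
                     clone=False,
                     timeout=time.time() + 10)     # hypothetical idle timeout
        flow_table.insert(0, micro)                # exact-match rules take priority
        return micro
    return rule

# Example: one wildcard rule with the CLONE flag set.
table = [Rule(match={"dst_ip": None}, actions=["out:uplink"], clone=True)]
handle_packet(table, {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "dst_port": 80})
print(len(table))   # 2: the wildcard rule plus the cloned microflow rule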
Local Actions
• Rapid re-routing: predefined fallback paths
  → recover almost immediately
• Multipath support: output port chosen from a probability dist.,
  weighted by link capacity or load (see sketch below)




                                              27
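
A small sketch of the local multipath action, assuming a static per-port weight table (e.g. derived from link capacities); it only illustrates the idea of picking the output port from a probability distribution:

import random

def pick_output_port(port_weights):
    """port_weights: {port_id: weight}, e.g. capacity or residual load per uplink."""
    ports, weights = zip(*port_weights.items())
    return random.choices(ports, weights=weights, k=1)[0]

# Example: two 10G uplinks and one lightly weighted 1G uplink.
uplinks = {1: 10.0, 2: 10.0, 3: 1.0}
print(pick_output_port(uplinks))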
Statistics-Gathering
• Sampling
  – Pkt headers sent to controller with 1/1000 prob.
• Triggers and reports
  – Set a threshold per rule
  – When exceeded, report to the controller to trigger
    flow setup (see sketch below)
• Approximate counters
  – Maintain list of top-k largest flows




                                                    28
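
A compact sketch of the first two mechanisms above, using the slide's 1/1000 sampling probability and a hypothetical byte threshold for the per-rule trigger:

import random

SAMPLE_PROB = 1.0 / 1000
ELEPHANT_THRESHOLD_BYTES = 128 * 1024     # hypothetical per-rule trigger threshold

counters = {}      # per-rule byte counters
triggered = set()  # rules that have already reported to the controller

def on_packet(rule_id, pkt_len, report):
    # Sampling: forward roughly 1 in 1000 packet headers to the controller.
    if random.random() < SAMPLE_PROB:
        report("sample", rule_id, pkt_len)

    # Trigger: once a rule's counter crosses its threshold, report it so the
    # controller can treat the flow as an elephant and schedule it.
    counters[rule_id] = counters.get(rule_id, 0) + pkt_len
    if rule_id not in triggered and counters[rule_id] >= ELEPHANT_THRESHOLD_BYTES:
        triggered.add(rule_id)
        report("trigger", rule_id, counters[rule_id])

# Example: feed 100 full-size packets through one rule.
for _ in range(100):
    on_packet("rule-42", 1500, lambda kind, rid, val: print(kind, rid, val))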
Implementation
• Not yet implemented in hardware
• Switch engineers indicate most mechanisms can be
  built from existing functional blocks
• Provides some basic tools for SDN
• However, scaling questions remain
  → what threshold? how to sample? at what rate?
• Default multipath on switches
• Controller samples or sets triggers to detect
  elephants, then schedules them with a bin-packing algo.
  (see sketch below)
                                              29
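
A minimal sketch of the kind of bin-packing placement the controller could run over the detected elephants (first-fit decreasing over a flat set of paths; the path model and numbers are hypothetical, and the paper's Hedera-style scheduler is more elaborate):

def schedule_elephants(elephants, path_loads, capacity):
    """elephants: {flow_id: estimated demand}; path_loads: {path_id: current load}."""
    placement = {}
    # First-fit decreasing: place the largest elephant flows first.
    for flow, demand in sorted(elephants.items(), key=lambda kv: -kv[1]):
        fits = [p for p, load in path_loads.items() if load + demand <= capacity]
        if not fits:
            continue                       # leave the flow on its current path
        best = min(fits, key=lambda p: path_loads[p])   # least-loaded feasible path
        path_loads[best] += demand
        placement[flow] = best
    return placement

# Example with three elephants and two equal-capacity paths.
print(schedule_elephants({"f1": 6, "f2": 4, "f3": 3}, {"p1": 0, "p2": 2}, capacity=10))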
Simulation
• How much flow-scheduling overhead can be
  reduced, while still achieving high performance?
• Custom-built flow-level simulator, based on the
  5406zl experiments
• Workloads generated:
  – Reverse-engineered from [30] (MSR, 1,500-server cluster)
  – MapReduce shuffle stage: 128 MB to each other server
  – A combination of these two

       [30] S. Kandula, S. Sengupta, A. Greenberg, and P. Patel. The
        Nature of Datacenter Traffic: Measurements & Analysis. In
        Proc. IMC, 2009.                                               30
Simulation




             31
Agenda
•   OF Benefits, Bottlenecks, and Dilemmas
•   Evaluation of Overheads
•   DevoFlow
•   Simulation Results




                                             32
Simulation Results
Clos Topology




                                33
Simulation Results
Clos Topology




                                34
Simulation Results
HyperX Topology




                                  35
Simulation Results
HyperX Topology




                                  36
Simulation Results




                     37
Simulation Results




                     38
Simulation Results




                     39
Simulation Results




                     40
Simulation Results




                     41
Conclusion
• Per-flow control imposes too much overhead
• Balance between
  – Overheads and network visibility
  – Effective traffic engineering / network mgmt
  → Could open up various research directions
• Switches w/ limited resources
  – Flow entries / control-plane BW
  – Hardware capability / power consumption



                                                   42
