15: Datacenter Design and Networking
Zubair Nabi
zubair.nabi@itu.edu.pk
April 21, 2013
Zubair Nabi 15: Datacenter Design and Networking April 21, 2013 1 / 27
Outline
1 Datacenter Topologies
2 Transport Protocols
3 Network Sharing
4 Wrapping Up
1 Datacenter Topologies
Introduction
Datacenters are traditionally designed in the form of a 2/3-level tree
Switching elements become more specialized and faster when we go
up the tree structure
A three-level tree has a core switch at the root, aggregation switches
in the middle, and edge switches at the leaves of the tree
Edge switches have a large number of 1Gbps ports and a small
number of 10Gbps ports
The 1Gbps ports connect end-hosts while 10Gbps ports connect to
aggregation switches
Aggregation and core switches have 10Gbps ports
The network is partitioned if switches higher up the tree fail
Oversubscription
Ideal value of 1:1 – All hosts may potentially communicate with others
at full bandwidth of their interface
5:1 – Only 20% of the host bandwidth is available (200Mbps for a 1Gbps interface)
Typical datacenter designs are oversubscribed by a factor of 2.5:1
(400Mbps) to 8:1 (125Mbps)
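The figures above follow from dividing the host link speed by the oversubscription ratio. A minimal sketch of that arithmetic, assuming 1Gbps host interfaces as in the edge switch description (the function name is hypothetical):

```python
def per_host_bandwidth_mbps(link_mbps: float, ratio: float) -> float:
    """Worst-case bandwidth per host when the network is oversubscribed ratio:1."""
    if ratio < 1:
        raise ValueError("oversubscription ratio must be at least 1")
    return link_mbps / ratio


# Assumption: every host has a 1Gbps (1000Mbps) interface.
for ratio in (1, 2.5, 5, 8):
    print(f"{ratio}:1 -> {per_host_bandwidth_mbps(1000, ratio):.0f} Mbps")
```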
Fat-tree Topology
k-ary fat-tree has k pods
Each pod contains two layers of k/2 switches
Each k-port switch in the lower layer is directly connected to k/2 hosts
Each of the remaining k/2 ports is connected to k/2 of the k ports of the
aggregation switches
(k/2)^2 core switches
Each core switch has one port connected to each of the k pods
The ith port of any core switch is connected to pod i
A k-ary fat-tree supports k^3/4 hosts
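The counts on this slide can be checked with a few lines of arithmetic. This is a sketch only; the function and dictionary key names are hypothetical, and the per-layer switch totals are derived from "k pods, two layers of k/2 switches each":

```python
def fat_tree_sizes(k: int) -> dict:
    """Element counts for a k-ary fat-tree built from k-port switches."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even number of ports, at least 2")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,         # k pods, k/2 edge switches each
        "aggregation_switches": k * half,  # k pods, k/2 aggregation switches each
        "core_switches": half ** 2,        # (k/2)^2
        "hosts": k * half * half,          # k pods * k/2 switches * k/2 hosts = k^3/4
    }


print(fat_tree_sizes(4))
print(fat_tree_sizes(48))
```

With 48-port switches, for example, the same formulas give 27,648 hosts and 576 core switches.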
DCell
Uses a recursively defined structure to interconnect servers
Each server connects to different levels of DCells through multiple links
High-level DCells are built recursively from many low-level ones
Fault tolerant as there is no single point of failure
Structure
Uses servers with multiple network ports and mini-switches to
construct its recursive structure
DCell0 is the building block to construct larger DCells
Consists of n servers and a mini-switch
High-level DCells are built recursively from many low-level ones
DCell1 is constructed using n + 1 DCell0s
The same applies to DCellk: it is built from t + 1 DCell(k-1)s, where t is
the number of servers in one DCell(k-1)
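The recursive construction makes the server count grow very quickly with the level. The sketch below assumes the recurrence from the DCell paper (reference 2): a level-k DCell is made of t + 1 level-(k-1) DCells, where t is the number of servers in one level-(k-1) DCell. The function name is hypothetical.

```python
def dcell_servers(n: int, k: int) -> int:
    """Number of servers in a DCell of level k whose DCell0 holds n servers."""
    if n < 1 or k < 0:
        raise ValueError("need n >= 1 and k >= 0")
    t = n  # a DCell0 has n servers
    for _ in range(k):
        # A level-i DCell is built from (t + 1) copies of the level below,
        # each containing t servers.
        t = t * (t + 1)
    return t


for level in range(4):
    print(level, dcell_servers(4, level))
```

With n = 4, three levels of recursion already reach 176,820 servers.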
2 Transport Protocols
TCP and UDP
TCP: Connection-oriented with reliability, ordering, and congestion
control
UDP: Connectionless with no ordering, reliability, or congestion control
TCP and Datacenter Networks
Communication between different nodes is thought of as just opening a
TCP connection between them
Common sockets API
But TCP was designed for a wide-area network
Clearly, a datacenter is not a wide-area network
Significantly different bandwidth-delay product, round-trip time (RTT),
and retransmission timeout (RTO)
For example, due to the low RTT, the congestion window for each flow
is very small
As a result, a loss rarely produces the three duplicate ACKs that TCP fast
retransmit needs, so flows wait for a retransmission timeout, leading to
poor net throughput
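The effect of RTT on window size can be seen from the bandwidth-delay product, which bounds how many packets a flow needs in flight to fill a link. A rough sketch follows; the RTT values (100 microseconds inside a datacenter, 100 milliseconds across a wide-area network) and the 1500-byte packet size are illustrative assumptions, not measurements from the slides.

```python
def bdp_packets(bandwidth_bps: float, rtt_s: float, packet_bytes: int = 1500) -> float:
    """Bandwidth-delay product expressed in full-sized packets."""
    bdp_bytes = bandwidth_bps * rtt_s / 8
    return bdp_bytes / packet_bytes


GBPS = 1e9
datacenter = bdp_packets(GBPS, 100e-6)  # assumed 100 microsecond RTT
wide_area = bdp_packets(GBPS, 100e-3)   # assumed 100 millisecond RTT

print(f"datacenter: {datacenter:.1f} packets in flight")
print(f"wide area:  {wide_area:.1f} packets in flight")
```

Under these assumptions a single datacenter flow fills the link with only a handful of packets in flight, and several flows sharing the link each get an even smaller window.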
More problems for TCP
In production data centers, due to the widely-varying mix of
applications, congestion in the network can last from 10s to 100s of
seconds
In commodity switches the buffer pool is shared by all interfaces
If long flows hog the memory, queues can build up for the short flows
Many-to-one communication patterns can lead to TCP throughput
collapse, known as incast
This can cause overall application throughput to decrease by up to 90%
In virtualized environments, the time sharing of resources increases
the latency faced by the VMs
This latency can be orders of magnitude higher than the RTT between
hosts inside a datacenter, leading to slow progress of TCP connections
Reaction
Some large-scale deployments have abandoned TCP altogether
For instance, Facebook now uses a custom UDP transport
TCP might be a “kitchen-sink” solution but it is sub-optimal in a datacenter
environment
Over the years, a number of alternatives have been proposed
Datacenter TCP (DCTCP)
Uses Explicit Congestion Notifications (ECN) from switches to perform
active queue management-based congestion control
Switches set the congestion experienced flag in packets whenever the
buffer occupancy exceeds a small threshold
DCTCP uses this information to reduce the size of the window in
proportion to the fraction of marked packets
This enables it to react quickly to queue build-up and avoid buffer pressure
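The window reduction can be sketched as follows. This is a simplified illustration of the sender-side update described in the DCTCP paper, not a full implementation: it keeps a running estimate alpha of the fraction of marked packets and cuts the window by alpha/2 once per window of ACKs. The function name is hypothetical, the gain g = 1/16 is an assumed setting, and additive increase and slow start are left out.

```python
def dctcp_update(cwnd: float, alpha: float, marked: int, total: int, g: float = 1 / 16):
    """Process one window's worth of ACKs and return the new (cwnd, alpha)."""
    fraction = marked / total if total else 0.0
    # Exponentially weighted moving average of the marked fraction.
    alpha = (1 - g) * alpha + g * fraction
    if marked:
        # Mild cut under light marking, a TCP-like halving when alpha is 1.
        cwnd = max(1.0, cwnd * (1 - alpha / 2))
    return cwnd, alpha


cwnd, alpha = 100.0, 0.0
for marked in (0, 2, 5, 10, 10):  # marked packets out of 10 ACKs per window
    cwnd, alpha = dctcp_update(cwnd, alpha, marked, 10)
    print(f"marked={marked:2d} alpha={alpha:.3f} cwnd={cwnd:.1f}")
```

Because the cut scales with alpha, a flow that sees only occasional marks gives up a small share of its window instead of halving it.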
Multipath TCP (MPTCP)
Establishes multiple subflows over different paths between a pair of
end-hosts
These subflows operate under a single TCP connection
The fraction of the total congestion window for each subflow is determined
by its speed
Moves traffic away from the most congested paths
tcpcrypt
Backwards compatible enhancement to TCP that aims to efficiently
and transparently provide encrypted communication to applications
Uses a custom key exchange protocol that leverages the TCP options
field
Like SSL, to reduce the cost of connection setup for short-lived flows, it
enables cryptographic state from one TCP connection to bootstrap
subsequent ones
Applications can also be made aware of the presence of tcpcrypt to
avoid redundant encryption
Deadline-Driven Delivery (D^3)
Targets applications with distributed workflow and latency targets
Such applications associate a deadline with each network flow and the
flow is only useful if the deadline is met
Applications expose flow deadline and size information which is
exploited by end hosts to request rates from routers along the data path
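One way to read the last point: given the remaining flow size and the time left before the deadline, an end host can work out the rate it has to ask for. The sketch below shows only that calculation, under the assumption that the requested rate is simply remaining size divided by remaining time; the function name is hypothetical and the router-side allocation is not modeled.

```python
def desired_rate_bps(remaining_bytes: int, seconds_to_deadline: float) -> float:
    """Rate a flow must sustain to finish exactly at its deadline."""
    if seconds_to_deadline <= 0:
        raise ValueError("deadline has already passed")
    return remaining_bytes * 8 / seconds_to_deadline


# A 125,000-byte response with 0.5 seconds left needs 2 Mbps.
print(desired_rate_bps(125_000, 0.5))
```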
3 Network Sharing
Introduction
Network resources are shared amongst the tenants, which can lead to
contention and other undesired behaviour
Network performance isolation between tenants can be an important
tool for:
Minimizing disruption from legitimate tenants that run network-intensive
workloads
Protecting against malicious tenants that launch DoS attacks
The standard methodology to ensure isolation is to use VLANs
Virtual LAN
Acts like an ordinary LAN but end-hosts do not necessarily have to be
physically connected to the same segment
Nodes are grouped together by the VLAN
Broadcasts can also be sent within the same VLAN
VLAN membership information is inserted into Ethernet frames
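A common way to carry this membership information is an IEEE 802.1Q tag, which the slides do not name explicitly, so treat that choice as an assumption. The sketch below builds the 4-byte tag (a 0x8100 type field followed by priority, drop-eligible bit, and a 12-bit VLAN ID) and inserts it after the destination and source MAC addresses. The function names and the sample frame are hypothetical.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q-tagged frame


def vlan_tag(vlan_id: int, priority: int = 0, dei: int = 0) -> bytes:
    """Build the 4-byte 802.1Q tag for the given VLAN ID."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("usable VLAN IDs are 1..4094")
    if not 0 <= priority <= 7 or dei not in (0, 1):
        raise ValueError("priority must be 0..7 and dei must be 0 or 1")
    tci = (priority << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def tag_frame(frame: bytes, vlan_id: int, priority: int = 0) -> bytes:
    """Insert the tag after the 6-byte destination and 6-byte source MACs."""
    return frame[:12] + vlan_tag(vlan_id, priority) + frame[12:]


# Hypothetical untagged frame: zeroed MACs, IPv4 EtherType, dummy payload.
untagged = bytes(12) + b"\x08\x00" + b"payload"
print(tag_frame(untagged, vlan_id=100).hex())
```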
Rate-limiting End-hosts
In Xen the network bandwidth available to each domU can be rate
limited
Can be used to implement basic QoS
The virtual interface is simply rate-limited
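Rate-limiting an interface is often explained with a token bucket. The sketch below is a generic illustration of that idea and is not Xen's actual mechanism; the class name and parameters are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
class TokenBucket:
    """Allow traffic at rate_bytes_per_s on average, with bursts up to burst_bytes."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = 0.0            # time of the previous call, in seconds

    def allow(self, packet_bytes: int, now: float) -> bool:
        """Return True if the packet may be sent at time now."""
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


bucket = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
for t in (0.0, 0.5, 1.5):
    print(t, bucket.allow(1500, now=t))
```

A packet that is refused would be queued or dropped by the caller, which is what holds the virtual interface to its configured rate.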
4 Wrapping Up
The End
In reverse order:
1 Cloud stacks can be used to turn clusters and datacenters into private and
public clouds
2 Virtualization of computation, storage, and networking can allow many
tenants to co-exist
3 Most data does not fit the relational model and is more suited for
NoSQL stores
4 Data-intensive, task-parallel frameworks abstract away the details of
distribution, work allocation, synchronization, concurrency, and
communication; a perfect match for the cloud
5 The future is Big Data and Cloud Computing!
References
1 Mohammad Al-Fares, Alexander Loukissas, and Amin Vahdat. 2008. A
scalable, commodity data center network architecture. In Proceedings
of the ACM SIGCOMM 2008 conference on Data communication
(SIGCOMM ’08). ACM, New York, NY, USA, 63-74.
2 Chuanxiong Guo, Haitao Wu, Kun Tan, Lei Shi, Yongguang Zhang, and
Songwu Lu. 2008. Dcell: a scalable and fault-tolerant network
structure for data centers. In Proceedings of the ACM SIGCOMM 2008
conference on Data communication (SIGCOMM ’08). ACM, New York,
NY, USA, 75-86.
Datacenter event - green it amsterdam - maikel bouricius - 15-09-14
 
OpenStack in Action 4! Jean-Louis Lezaun - Re-architecturing the datacenter :...
OpenStack in Action 4! Jean-Louis Lezaun - Re-architecturing the datacenter :...OpenStack in Action 4! Jean-Louis Lezaun - Re-architecturing the datacenter :...
OpenStack in Action 4! Jean-Louis Lezaun - Re-architecturing the datacenter :...
 
Datacenter Revolution Dean Nelson, Sun
Datacenter  Revolution    Dean  Nelson,  SunDatacenter  Revolution    Dean  Nelson,  Sun
Datacenter Revolution Dean Nelson, Sun
 
Datacenter Design - DP Air
Datacenter Design - DP AirDatacenter Design - DP Air
Datacenter Design - DP Air
 
TECHNOLOGY ACCELERATING INFRASTRUCTURE DEVELOPMENT FOR ATTAINING THE NIGERIAN...
TECHNOLOGY ACCELERATING INFRASTRUCTURE DEVELOPMENT FOR ATTAINING THE NIGERIAN...TECHNOLOGY ACCELERATING INFRASTRUCTURE DEVELOPMENT FOR ATTAINING THE NIGERIAN...
TECHNOLOGY ACCELERATING INFRASTRUCTURE DEVELOPMENT FOR ATTAINING THE NIGERIAN...
 
Welcome to Hybrid Cloud Innovation Tour 2016
Welcome to Hybrid Cloud Innovation Tour 2016Welcome to Hybrid Cloud Innovation Tour 2016
Welcome to Hybrid Cloud Innovation Tour 2016
 
Network Repairs
Network RepairsNetwork Repairs
Network Repairs
 

Topic 15: Datacenter Design and Networking

  • 1. 15: Datacenter Design and Networking Zubair Nabi zubair.nabi@itu.edu.pk April 21, 2013 Zubair Nabi 15: Datacenter Design and Networking April 21, 2013 1 / 27
  • 3. Outline
      • 1. Datacenter Topologies
      • 2. Transport Protocols
      • 3. Network Sharing
      • 4. Wrapping Up
  • 10. Introduction
      • Datacenters are traditionally designed as a two- or three-level tree.
      • Switching elements become more specialized and faster toward the top of the tree.
      • A three-level tree has a core switch at the root, aggregation switches in the middle, and edge switches at the leaves.
      • Edge switches have a large number of 1 Gbps ports and a small number of 10 Gbps ports.
      • The 1 Gbps ports connect to end hosts, while the 10 Gbps ports connect to aggregation switches.
      • Aggregation and core switches have 10 Gbps ports.
      • The network becomes partitioned if switches higher up the tree go down.
  • 14. Oversubscription
      • The ideal value is 1:1: all hosts can potentially communicate with one another at the full bandwidth of their interface.
      • At 5:1, only 20% of the host bandwidth is available (200 Mbps for a 1 Gbps interface).
      • Typical datacenter designs are oversubscribed by a factor of 2.5:1 (400 Mbps) to 8:1 (125 Mbps).
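
The per-host figures quoted for each oversubscription ratio can be reproduced with a short calculation. The sketch below is illustrative only: the function name `per_host_bandwidth_mbps` is hypothetical, and the 1 Gbps (1000 Mbps) default is an assumption taken from the edge-switch host ports described in the introduction.

```python
def per_host_bandwidth_mbps(oversubscription: float, link_mbps: float = 1000.0) -> float:
    """Worst-case bandwidth available to each host when all hosts send at once.

    oversubscription: the N in an N:1 ratio (1.0 is the ideal 1:1 case).
    link_mbps: host interface speed; 1000 Mbps is assumed here.
    """
    if oversubscription < 1.0:
        raise ValueError("oversubscription ratio must be at least 1 (1:1 is ideal)")
    return link_mbps / oversubscription


if __name__ == "__main__":
    # Ratios mentioned on the slide: ideal, typical lower bound, example, typical upper bound.
    for ratio in (1.0, 2.5, 5.0, 8.0):
        print(f"{ratio}:1 -> {per_host_bandwidth_mbps(ratio):.0f} Mbps per host")
```

With the assumed 1 Gbps host links, the ratios 5:1, 2.5:1, and 8:1 give 200, 400, and 125 Mbps respectively, matching the slide.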
  • 22. Fat-tree Topology
      - A k-ary fat-tree has k pods
      - Each pod contains two layers of k/2 switches
      - Each k-port switch in the lower (edge) layer is directly connected to k/2 hosts
      - Each of its remaining k/2 ports connects to one of the k/2 aggregation switches in the same pod
      - There are (k/2)^2 core switches
      - Each core switch has one port connected to each of the k pods
      - The i-th port of any core switch is connected to pod i
      - A k-ary fat-tree supports k^3/4 hosts
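The counts above follow directly from k. A minimal sketch that derives them (the function and dictionary keys are illustrative names, not from the slides):

```python
def fat_tree_sizes(k: int) -> dict:
    """Element counts for a k-ary fat-tree built from k-port switches."""
    if k < 2 or k % 2:
        raise ValueError("k must be an even integer >= 2")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,         # k/2 per pod
        "aggregation_switches": k * half,  # k/2 per pod
        "core_switches": half ** 2,        # (k/2)^2
        # k pods * k/2 edge switches * k/2 hosts each = k^3/4
        "hosts": k * half * half,
    }


for k in (4, 48):
    print(k, fat_tree_sizes(k))
```

With 48-port switches, for example, this gives 27,648 hosts and 576 core switches.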
  • 27. DCell
      - Uses a recursively defined structure to interconnect servers
      - Each server connects to different levels of DCells through multiple links
      - High-level DCells are built recursively from many low-level ones
      - Fault tolerant, as there is no single point of failure
  • 33. Structure
      - Uses servers with multiple network ports and mini-switches to construct its recursive structure
      - DCell_0 is the building block used to construct larger DCells
      - A DCell_0 consists of n servers and a mini-switch
      - High-level DCells are built recursively from many low-level ones
      - DCell_1 is constructed using n + 1 DCell_0s
      - The same applies to DCell_k
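The recursion can be sketched numerically. The slide states only that a DCell_1 uses n + 1 DCell_0s; the general rule used below, that a DCell_k is built from t_(k-1) + 1 DCell_(k-1)s where t_(k-1) is the number of servers in one DCell_(k-1), is taken from the DCell paper rather than from the slide. The function name is illustrative.

```python
def dcell_servers(n: int, k: int) -> int:
    """Number of servers in a DCell_k whose DCell_0 holds n servers."""
    if n < 1 or k < 0:
        raise ValueError("need n >= 1 and k >= 0")
    servers = n  # t_0 = n
    for _ in range(k):
        # A level-l DCell uses (t_{l-1} + 1) copies of the level below it.
        servers = (servers + 1) * servers
    return servers


for level in range(4):
    print(f"n=4, DCell_{level}: {dcell_servers(4, level)} servers")
```

The count grows doubly exponentially with the level, which is why small n and k already reach very large server counts.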
  • 37. TCP and UDP
      - TCP: connection-oriented, with reliability, ordering, and congestion control
      - UDP: connectionless, with no ordering, reliability, or congestion control
  • 44. TCP and Datacenter Networks
      - Communication between nodes is usually thought of as just opening a TCP connection between them
      - Common sockets API
      - But TCP was designed for wide-area networks
      - Clearly, a datacenter is not a wide-area network
      - Significantly different bandwidth-delay product, round-trip time (RTT), and retransmission timeout (RTO)
      - For example, due to the low RTT, the congestion window of each flow is very small
      - As a result, a lost packet often cannot be recovered through TCP fast retransmit, and the flow must wait for a timeout, leading to poor net throughput
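A back-of-the-envelope sketch of why small windows defeat fast retransmit. The link speeds, RTTs, and flow count below are illustrative assumptions, not measurements from the slides. Fast retransmit is triggered by three duplicate ACKs, so at least three segments must follow the lost one in the same window.

```python
DUP_ACK_THRESHOLD = 3  # standard TCP fast-retransmit trigger


def bdp_segments(bandwidth_bps: float, rtt_s: float, mss_bytes: int = 1460) -> float:
    """Bandwidth-delay product expressed in full-sized segments."""
    return bandwidth_bps * rtt_s / (8 * mss_bytes)


def fast_retransmit_possible(cwnd_segments: float) -> bool:
    """One lost segment plus three later ones are needed to produce three dup ACKs."""
    return cwnd_segments >= DUP_ACK_THRESHOLD + 1


# Assumed wide-area path: 100 Mbps, 100 ms RTT.
wan = bdp_segments(100e6, 100e-3)
# Assumed datacenter path: 1 Gbps, 100 microsecond RTT, shared by 4 flows.
dc_per_flow = bdp_segments(1e9, 100e-6) / 4

print(f"WAN window: {wan:.0f} segments, fast retransmit: {fast_retransmit_possible(wan)}")
print(f"DC per-flow window: {dc_per_flow:.1f} segments, "
      f"fast retransmit: {fast_retransmit_possible(dc_per_flow)}")
```

Under these assumptions the per-flow datacenter window is only a couple of segments, so a single loss has to be repaired by the much slower retransmission timeout.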
  • 51. More problems for TCP
      - In production datacenters, due to the widely varying mix of applications, congestion in the network can last from tens to hundreds of seconds
      - In commodity switches the buffer pool is shared by all interfaces
      - If long flows hog the buffer memory, queues build up for the short flows
      - Many-to-one communication patterns can lead to TCP throughput collapse, also known as incast
      - This can cause overall application throughput to decrease by up to 90%
      - In virtualized environments, the time sharing of resources increases the latency faced by the VMs
      - This latency can be orders of magnitude higher than the RTT between hosts inside a datacenter, leading to slow progress of TCP connections
  • 55. Reaction
      - Some large-scale deployments have abandoned TCP altogether
      - For instance, Facebook now uses a custom UDP transport
      - TCP might be a "kitchen-sink" solution, but it is sub-optimal in a datacenter environment
      - Over the years, a number of alternatives have been proposed
  • 59. Datacenter TCP (DCTCP)
      - Uses Explicit Congestion Notification (ECN) marks from switches to perform active queue management-based congestion control
      - Switches set the Congestion Experienced flag in packets whenever the buffer occupancy exceeds a small threshold
      - DCTCP uses this information to reduce the window in proportion to the fraction of marked packets
      - This enables it to react quickly to queue build-up and avoid buffer pressure
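The window reduction can be sketched as below, following the update rule published for DCTCP: the sender keeps a running estimate alpha of the fraction of marked packets and cuts the window by alpha/2. The function names are illustrative, and the weight g = 1/16 is an assumed value chosen for the example.

```python
def update_alpha(alpha: float, marked: int, total: int, g: float = 1 / 16) -> float:
    """Exponentially weighted estimate of the fraction of ECN-marked packets."""
    fraction = marked / total if total else 0.0
    return (1 - g) * alpha + g * fraction


def reduce_cwnd(cwnd: float, alpha: float) -> float:
    """Scale the window by (1 - alpha/2); never go below one segment."""
    return max(1.0, cwnd * (1 - alpha / 2))


# One window of 10 packets per round, with a varying number of marks.
alpha, cwnd = 0.0, 100.0
for marked in (0, 2, 10, 10):
    alpha = update_alpha(alpha, marked, 10)
    if marked:
        cwnd = reduce_cwnd(cwnd, alpha)
    print(f"marked={marked:2d}/10  alpha={alpha:.3f}  cwnd={cwnd:.1f}")
```

When every packet is marked for long enough, alpha approaches 1 and the reduction approaches the halving performed by standard TCP; light marking produces only a small cut.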
  • 63. Multipath TCP (MPTCP)
      - Establishes multiple subflows over different paths between a pair of end-hosts
      - These subflows operate under a single TCP connection
      - Each subflow's share of the total congestion window is determined by its speed
      - This moves traffic away from the most congested paths
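One way the subflow windows can be coupled is the "Linked Increases" rule specified in RFC 6356. The slide does not say which algorithm it has in mind, so treat this as an assumed example; it is written in packets rather than bytes, and the function names are illustrative.

```python
def lia_alpha(subflows):
    """Aggressiveness factor; subflows is a list of (cwnd_packets, rtt_seconds)."""
    total_cwnd = sum(cwnd for cwnd, _ in subflows)
    best = max(cwnd / rtt ** 2 for cwnd, rtt in subflows)
    denom = sum(cwnd / rtt for cwnd, rtt in subflows) ** 2
    return total_cwnd * best / denom


def per_ack_increase(subflows, i):
    """Window increase (in packets) for subflow i on each ACK it receives."""
    total_cwnd = sum(cwnd for cwnd, _ in subflows)
    cwnd_i = subflows[i][0]
    # Never more aggressive than a regular TCP flow on the same path.
    return min(lia_alpha(subflows) / total_cwnd, 1.0 / cwnd_i)


paths = [(10.0, 0.1), (10.0, 0.1)]  # two identical subflows (assumed values)
print(lia_alpha(paths), per_ack_increase(paths, 0))
```

With a single subflow the rule reduces to ordinary TCP congestion avoidance (an increase of 1/cwnd per ACK), while with several subflows each one grows more slowly so the connection as a whole is not more aggressive than one TCP flow.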
  • 67. tcpcrypt
      - Backwards-compatible enhancement to TCP that aims to efficiently and transparently provide encrypted communication to applications
      - Uses a custom key exchange protocol that leverages the TCP options field
      - Like SSL, it lets cryptographic state from one TCP connection bootstrap subsequent ones, which reduces the cost of connection setup for short-lived flows
      - Applications can also be made aware of the presence of tcpcrypt so that they avoid redundant encryption
  • 70. Deadline-Driven Delivery (D^3)
      - Targets applications with distributed workflows and latency targets
      - Such applications associate a deadline with each network flow, and the flow is only useful if the deadline is met
      - Applications expose flow deadline and size information, which end hosts use to request rates from routers along the data path
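A minimal sketch of the rate-request idea: the rate a flow needs is its remaining size divided by the time left until its deadline. The router-side allocation below is a deliberately simplified assumption (grant requests in arrival order until capacity runs out) and omits details of the real protocol such as sharing spare capacity; all names are illustrative.

```python
def desired_rate_bps(remaining_bytes: int, seconds_to_deadline: float) -> float:
    """Rate needed to finish the remaining bytes exactly at the deadline."""
    if seconds_to_deadline <= 0:
        raise ValueError("deadline has already passed")
    return 8 * remaining_bytes / seconds_to_deadline


def grant_rates(requests_bps, capacity_bps):
    """Simplified router: satisfy requests in arrival order while capacity remains."""
    remaining = capacity_bps
    grants = []
    for request in requests_bps:
        granted = min(request, remaining)
        grants.append(granted)
        remaining -= granted
    return grants


# Assumed flows: 125 KB due in 10 ms and 1 MB due in 20 ms, on a 1 Gbps link.
requests = [desired_rate_bps(125_000, 0.010), desired_rate_bps(1_000_000, 0.020)]
print(requests, grant_rates(requests, 1e9))
```

A flow granted less than its desired rate is expected to miss its deadline, which is the signal an application could use to give up on that flow early.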
  • 75. Introduction
      - Network resources are shared amongst the tenants, which can lead to contention and other undesired behaviour
      - Network performance isolation between tenants can be an important tool for:
          - Minimizing disruption from legitimate tenants that run network-intensive workloads
          - Protecting against malicious tenants that launch DoS attacks
      - The standard methodology to ensure isolation is to use VLANs
  • 79. Virtual LAN
      - Acts like an ordinary LAN, but end-hosts do not necessarily have to be physically connected to the same segment
      - Nodes are grouped together by the VLAN
      - Broadcasts can also be sent within the same VLAN
      - VLAN membership information is inserted into Ethernet frames
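A sketch of how membership information is carried in a frame, assuming the IEEE 802.1Q tag format: a 4-byte tag (TPID 0x8100 followed by priority, drop-eligible bit, and a 12-bit VLAN ID) inserted after the source MAC address. The helper names are illustrative, and recomputing the frame check sequence is left out.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that identifies an 802.1Q tag


def add_vlan_tag(frame: bytes, vlan_id: int, priority: int = 0) -> bytes:
    """Insert an 802.1Q tag after the destination and source MAC addresses."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("usable VLAN IDs are 1..4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority is a 3-bit field")
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    tci = (priority << 13) | vlan_id  # drop-eligible bit left at 0
    tag = struct.pack("!HH", TPID_8021Q, tci)
    return frame[:12] + tag + frame[12:]


def read_vlan_id(frame: bytes):
    """Return the VLAN ID of a tagged frame, or None if the frame is untagged."""
    tpid, tci = struct.unpack("!HH", frame[12:16])
    return tci & 0x0FFF if tpid == TPID_8021Q else None


untagged = bytes(6) + bytes(6) + b"\x08\x00" + b"payload"
tagged = add_vlan_tag(untagged, vlan_id=100, priority=3)
print(read_vlan_id(untagged), read_vlan_id(tagged))
```

Switches use the VLAN ID to decide which ports a frame, including a broadcast, may be forwarded to.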
  • 82. Rate-limiting End-hosts
      - In Xen, the network bandwidth available to each domU can be rate-limited
      - Can be used to implement basic QoS
      - The virtual interface is simply rate-limited
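Rate limiting an interface is commonly explained with a token bucket. The sketch below is a generic illustration of that idea and is not Xen's actual mechanism; the class name and parameters are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
class TokenBucket:
    """Allow traffic at a sustained rate with a bounded burst."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = 0.0

    def allow(self, packet_bytes: int, now: float) -> bool:
        """Return True if the packet may be sent at time `now` (seconds)."""
        elapsed = max(0.0, now - self.last)
        self.last = now
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


bucket = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
for t in (0.0, 0.5, 1.5):
    print(t, bucket.allow(1500, now=t))
```

A packet that is refused can be queued or dropped; either choice caps the sustained rate of the virtual interface at the configured value.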
  • 88. The End
    In reverse order:
      1. Cloud stacks can be used to turn clusters and datacenters into private and public clouds
      2. Virtualization of computation, storage, and networking can allow many tenants to co-exist
      3. Most data does not fit the relational model and is better suited to NoSQL stores
      4. Data-intensive, task-parallel frameworks abstract away the details of distribution, work allocation, synchronization, concurrency, and communication; a perfect match for the cloud
      5. The future is Big Data and Cloud Computing!
  • 89. References
      1. Mohammad Al-Fares, Alexander Loukissas, and Amin Vahdat. 2008. A scalable, commodity data center network architecture. In Proceedings of the ACM SIGCOMM 2008 conference on Data communication (SIGCOMM ’08). ACM, New York, NY, USA, 63-74.
      2. Chuanxiong Guo, Haitao Wu, Kun Tan, Lei Shi, Yongguang Zhang, and Songwu Lu. 2008. DCell: a scalable and fault-tolerant network structure for data centers. In Proceedings of the ACM SIGCOMM 2008 conference on Data communication (SIGCOMM ’08). ACM, New York, NY, USA, 75-86.