9. Telco Cloud Networks
SDN/NFV enable programmability, and the cloud enables virtualization of network resources
High level of flexibility and programmability in individual domains (mobile core, radio access network and transport network)
Cross-domain programmability and orchestration.
10. Telco Network
Change the business model of telecom
Biggest technological revolution
Innovate more quickly
Change the way networks are deployed
Change the way consumers enable services on the fly
13. What is a Telco Cloud?
A Telco Grade (aka Carrier Grade) Cloud is a cloud that can support telco-grade applications
Telco Grade requirements:
High availability
High performance (large number of transactions, scalability)
Serviceability
Long lifetime
Security
Real-time behavior
Standard-compliant HW
28. Traditional Network Functions (NFs)
Proprietary devices/boxes for different NFs
Network services rely on different types of appliances
Introducing new services into today's networks is becoming increasingly difficult due to:
Proprietary nature of appliances
Diverse and purpose-built hardware
Cost (increased CapEx & OpEx)
Short life cycle of appliances
Lack of space
Energy consumption of middleboxes
Lack of skilled professionals to integrate services
Recently, NFV has been proposed to alleviate these problems
29. Introduction – Network Function Virtualization
NFV was proposed by ETSI through an Industry Specification Group (ISG)
Allows NFs (Network Functions) to be implemented in software
Virtualizes the NFs currently carried out by proprietary HW
Decouples NFs from the underlying appliances
NFs can run on commodity hardware (i.e. servers, storage, switches)
Accelerates deployment of new services and NFs
31. Why We Need NFV?
Virtualization: Use network resources without worrying about where they are physically located, how much there is, how they are organized, etc.
Orchestration: Manage thousands of devices
Programmable: Should be able to change behavior on the fly.
Dynamic Scaling: Should be able to change size, quantity
Visibility: Monitor resources, connectivity
Performance: Optimize network device utilization
Multi-tenancy
Service Integration
Openness: Full choice of Modular plug-ins
44. Basic concepts of SDN
Separate control-plane and data-plane entities.
Network intelligence and state are logically centralized.
The underlying network infrastructure is abstracted from the applications.
Execute or run control-plane software on general-purpose hardware.
Decouple from specific networking hardware.
Use commodity servers and switches.
Have programmable data planes.
Maintain, control and program data-plane state from a central entity.
An architecture to control not just a networking device but an entire network.
46. SDN Architecture
Application Layer:
Focuses on network services
SW apps communicating with the control layer
Control-Plane Layer:
Core of SDN
Consists of a centralized controller
Logically maintains a global and dynamic network view
Takes requests from the application layer
Manages network devices via standard protocols
Data-Plane Layer:
Programmable devices
Support standard interfaces
48. Southbound Interface – OpenFlow
Forwarding elements are controlled by an open interface
OpenFlow has strong support from industry, research, and academia
OpenFlow standardizes the information exchange between the two planes
Provides controller-switch interactions
OpenFlow 1.3.0 provides secure communication using TLS
Switches communicate with the controller via the OpenFlow protocol
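To make the controller-switch interaction concrete, here is a minimal sketch using the Ryu framework (an assumption; the slides do not name a specific controller). On switch connect it installs a table-miss flow entry via OpenFlow 1.3 so that unmatched packets are sent to the controller.

```python
# Minimal Ryu app sketch (assumed framework): install a table-miss rule
# over OpenFlow 1.3 when a switch connects, so unmatched packets reach
# the controller.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class TableMissApp(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def switch_features_handler(self, ev):
        datapath = ev.msg.datapath
        ofproto = datapath.ofproto
        parser = datapath.ofproto_parser
        # Match everything at the lowest priority ...
        match = parser.OFPMatch()
        # ... and forward unmatched packets to the controller.
        actions = [parser.OFPActionOutput(ofproto.OFPP_CONTROLLER,
                                          ofproto.OFPCML_NO_BUFFER)]
        inst = [parser.OFPInstructionActions(ofproto.OFPIT_APPLY_ACTIONS,
                                             actions)]
        datapath.send_msg(parser.OFPFlowMod(datapath=datapath, priority=0,
                                            match=match, instructions=inst))
```

Such an app would be started with, for example, ryu-manager against an OpenFlow 1.3 switch.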
49. Control Plane – SDN Controller
The controller provides a programmatic interface to the network
Gives a logically centralized global view of the network
Simplifies policy enforcement and management
Some concerns:
Control scalability
Centralized vs. Distributed
Reactive vs. Proactive Policies
51. Northbound Interfaces & SDN Applications
Northbound Interface:
A communication interface between the control plane and applications
There is currently no accepted standard for northbound interfaces
Implemented on an ad hoc basis for particular applications (see the REST sketch after this list)
SDN Applications:
Traffic Engineering
Security
QoS
Routing
Switching
Virtualization
Monitoring
Load Balancing
New Innovations???
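Because there is no accepted northbound standard, the sketch below only illustrates one common style: an application using a controller's REST API to read the global network view. It assumes ONOS (introduced later in this deck) at an assumed address with assumed default credentials.

```python
# Hedged illustration of a REST-style northbound call; the controller
# address and credentials below are assumptions, not prescribed anywhere
# in the slides.
import requests

ONOS_URL = "http://127.0.0.1:8181/onos/v1"   # assumed ONOS address
AUTH = ("onos", "rocks")                      # assumed default credentials

# Ask the controller for its view of the data plane.
devices = requests.get(f"{ONOS_URL}/devices", auth=AUTH).json()
for dev in devices.get("devices", []):
    print(dev["id"], "available:", dev.get("available"))
```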
53. SDN & NFV Relationship
The concept of NFV originated from SDN
NFV and SDN are complementary.
One does not depend upon the other.
You can do SDN only, NFV only, or SDN and NFV
Both have similar goals but approaches are very different
SDN needs new interfaces, control modules, applications
NFV requires moving network applications from dedicated hardware to virtual containers on commercial-off-the-shelf (COTS) hardware
54. NFV vs. SDN
NFV serves SDN by:
Virtualizing the SDN controller
Implementing NFs in software
Reducing CapEx, OpEx, space, and power consumption
Decoupling network functions from proprietary hardware to achieve agile provisioning and deployment
SDN serves NFV by:
Providing programmable connectivity between VNFs to optimize traffic engineering and steering
Offering central control and a programmable architecture for better connectivity
Providing network abstractions to enable flexible network control, configuration, and innovation
Decoupling the control plane from data-plane forwarding to provide a centralized controller
58. Introduction of ONOS
ONOS (Open Network Operating System) is an open source SDN OS
Developed in concert with leading SPs, vendors, R&E network operators and collaborators
Specifically targeted at service providers and mission-critical networks
ONOS main goals:
Liberate network application developers from knowing the details of proprietary hardware
Free them from the operational complexities of proprietary interfaces and protocols
Re-enable innovation for both network hardware and software
59. Why ONOS?
Several open source controllers already exist (NOX, Beacon, SNAC, POX, etc.)
ONOS will:
Bring carrier-grade networking (scale, availability, and performance) to the SDN control plane
Enable web-style agility
Help SPs to migrate existing networks
Lower SP CapEx & OpEx
60. ONOS Vision for SPs Networks
Enabling SP SDN adoption for carrier-grade service and network innovation
61. Key Elements of ONOS
Modular, Scalable, Resilient with Abstractions
63. Defining Features of ONOS
Distributed Core:
Provides scalability, high availability, and performance
Brings carrier-grade features
Running as a cluster is one way that ONOS brings web-style agility
Northbound abstraction/APIs:
Include the network graph and application intents to ease development of control, management, and configuration services (see the intent sketch after this list)
Southbound abstraction/APIs:
Enable pluggable southbound protocols for controlling both OpenFlow and legacy devices
A key enabler for migration from legacy devices to OpenFlow-based white boxes
Software Modularity:
Easy to develop, debug, maintain, and upgrade ONOS as a software system
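The intent sketch referenced above: instead of installing individual flow rules, an application asks the ONOS core (here via its REST API) to connect two hosts and lets ONOS compile and install the flows. Host IDs, credentials, and the appId are illustrative assumptions, not values from the slides.

```python
# Hedged sketch of submitting a host-to-host intent through the ONOS
# REST API; all identifiers below are placeholders for illustration.
import requests

ONOS_URL = "http://127.0.0.1:8181/onos/v1"
AUTH = ("onos", "rocks")   # assumed default credentials

intent = {
    "type": "HostToHostIntent",
    "appId": "org.onosproject.cli",   # application owning the intent
    "priority": 100,
    "one": "00:00:00:00:00:01/None",  # host ID = MAC/VLAN (placeholder)
    "two": "00:00:00:00:00:02/None",
}

resp = requests.post(f"{ONOS_URL}/intents", json=intent, auth=AUTH)
print(resp.status_code)   # expect 201 Created if the intent was accepted
```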
65. Software Modularity
ONOS software is easy to enhance, change, and maintain
Great care has gone into modularity to make it easy for developers
At the macro level, the Northbound and Southbound APIs provide an initial basis for insulating Applications, Core and Adapters from each other
New applications or new protocol adapters can be added as needed, without each needing to know about the other
ONOS relies heavily on interfaces to serve as contracts for interactions between different parts of the core
67. SONA (Simplified Overlay Network Architecture)
An ONOS application
Provides an OpenStack Neutron ML2 mechanism driver and L3 service plugin
Optimized tenant network virtualization service
Provisions isolated virtual tenant networks using VXLAN-based L2 tunneling or GRE/GENEVE-based L3 tunneling with Open vSwitch (OVS)
Horizontal scalability of the gateway node
72. Multicore
Cores share a path to memory
SIMD instructions + multicore make this an increasing bottleneck!
73. Performance
The performance (time to solution) on a single computer can depend on:
Clock speed – how fast the processor is
Floating point unit – how many operands can be operated on, and what operations can be performed?
Memory latency – what is the delay in accessing data?
Memory bandwidth – how fast can we stream data from memory?
I/O to storage – how quickly can we access files?
For parallel computing you can also be limited by the performance of the interconnect
74. Performance (Cont.)
Application performance is often described as:
Compute bound
Memory bound
IO bound
Communication bound
For computational science:
Most calculations are limited by memory bandwidth
Processors are faster than memory access
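A rough sketch, not a rigorous benchmark, of why memory bandwidth often dominates: the element-wise add below streams data with very little arithmetic per byte, while the trigonometric variant does far more floating-point work on the same arrays. NumPy and the array size are illustrative assumptions.

```python
# Illustrative (not rigorous) comparison of a memory-bound and a more
# compute-heavy array operation; array size is an arbitrary assumption.
import time
import numpy as np

n = 20_000_000
a = np.random.rand(n)
b = np.random.rand(n)

t0 = time.perf_counter()
c = a + b                  # streams ~3 large arrays: limited by memory bandwidth
t1 = time.perf_counter()
d = np.sin(a) * np.cos(b)  # much more floating-point work per element
t2 = time.perf_counter()

print(f"add : {t1 - t0:.3f} s   (memory-bound)")
print(f"trig: {t2 - t1:.3f} s   (more FLOPs per byte)")
```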
75. HPC Architectures
All cores have the same access to memory, e.g. a multicore laptop.
Symmetric multi-processing (shared memory)
77. HPC Architectures (Cont.)
In a real system:
Each node will be a shared-memory system
E.g. a multicore processor
The network will have some specific topology
E.g. a regular grid
Distributed/shared-memory hybrids
80. Why HPC? (Cont.)
Big data and scientific simulations need greater computing power.
Single-core processors cannot be made with enough resources for the simulations needed.
Making processors with faster clock speeds is difficult due to cost
Expensive to put huge memory on a single processor
Solution: parallel computing – divide up the work among numerous linked systems
81. Generic Parallel Machines
A good conceptual model is a collection of multicore laptops connected together by a network.
Each laptop is called a compute node
Each has its own OS and network connection
Suppose each node is quad-core and there are five such nodes
The total system then has 20 processor-cores
82. Parallel Computing?
Parallel computing and HPC are intimately related
Higher performance requires more processor-cores
Understanding different parallel programming models allows you to understand how to use HPC resources efficiently
It also allows you to better understand and critique work that uses HPC in your research area
83. What is HPC?
Leveraging distributed compute resources to solve complex problems with large datasets
Terabytes to petabytes to zettabytes of data
Results in minutes to hours instead of days or weeks
84. Differences from Desktop Computing
Do not log on to compute nodes directly
Submit jobs via batch scheduling systems
Not a GUI-based environment
Share the system with many users
Resources more tightly monitored and controlled
Disk quotas
CPU usage
89. Serial Computing
What many programs look like:
Serial execution, running on one processor (CPU core) at a time
Overall compute time grows significantly as individual tasks get longer or as the number of tasks increases
How can you speed things up? (see the sketch below)
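One way to speed things up, as the sketch below shows, is to run independent tasks in parallel. It uses Python's multiprocessing module; process_item and the task list are placeholders, not anything taken from the slides.

```python
# Sketch: the same independent tasks run serially and then spread over
# all available cores with multiprocessing. process_item is a placeholder.
import multiprocessing as mp
import time

def process_item(x):
    """Stand-in for one independent task (e.g. processing one image)."""
    time.sleep(0.1)            # simulate work
    return x * x

if __name__ == "__main__":
    items = list(range(100))

    start = time.perf_counter()
    serial = [process_item(x) for x in items]          # one at a time
    print(f"serial  : {time.perf_counter() - start:.1f} s")

    start = time.perf_counter()
    with mp.Pool() as pool:                             # all cores
        parallel = pool.map(process_item, items)
    print(f"parallel: {time.perf_counter() - start:.1f} s")
```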
92. High Performance Computing (HPC)?
Benefits greatly from:
CPU speed + homogeneity
Shared filesystems
Fast, expensive networking (e.g. InfiniBand) and co-located servers
Scheduling: must wait until all processors are available, at the same time and for the full duration
Requires special programming (e.g. OpenMP/MPI) – see the MPI sketch after this list
What happens if one core or server fails or runs slower than the others?
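A minimal sketch of the "special programming" referenced in the list above, using mpi4py as one possible MPI binding (an assumption; the slide only mentions MPI generically): each rank computes a partial sum and a reduction combines the results.

```python
# Sketch of an MPI-style HPC program using mpi4py (assumed binding).
# Run with e.g.:  mpirun -n 4 python partial_sum.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()      # this process's ID
size = comm.Get_size()      # total number of processes

# Each rank sums a disjoint slice of 0 .. n-1.
n = 1_000_000
local = sum(range(rank, n, size))

# Combine the partial sums on rank 0.
total = comm.reduce(local, op=MPI.SUM, root=0)
if rank == 0:
    print(f"total = {total} across {size} ranks")
```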
93. High Throughput Computing (HTC)?
Scheduling: only need 1 CPU core for each job (shorter wait)
Easier recovery from failure
No special programming required
Number of concurrently running jobs is more important
CPU speed and homogeneity are less important
94. High Throughput vs High Performance
HTC
Focus: large workflows of numerous, relatively small, and independent compute tasks
More important: maximizing the number of running tasks
Less important: CPU speed, homogeneity
HPC
Focus: large workflows of highly interdependent sub-tasks
More important: persistent access to the fastest cores, CPU homogeneity, special coding, shared filesystems, fast networks
95. Example
You need to process 48 brain images for each of 168 patients. Each image takes ~1 hour of compute time.
168 patients x 48 images = ~8000 tasks = ~8000 hrs
96. Distributed Computing
Use many computers, each running one instance of our program
Example:
1 laptop (1 core) => ~8,000 hours = ~1 year
1 server (~20 cores) => ~400 hours = ~2-3 weeks
1 large job (400 cores) => ~20 hours = ~1 day
A whole cluster (8,000 cores) => ~8 hours (with real-world scheduling overhead)
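A back-of-the-envelope check of the scaling above, assuming perfect parallel efficiency (real systems add scheduling and I/O overhead, which is why the whole-cluster figure on the slide is higher):

```python
# Ideal-scaling estimate for ~8,000 one-hour tasks on the core counts
# used in the slide; assumes perfect parallel efficiency.
tasks = 168 * 48   # 8,064 one-hour tasks
for label, cores in [("1 laptop", 1), ("1 server", 20),
                     ("1 large job", 400), ("whole cluster", 8000)]:
    print(f"{label:>13} ({cores:>5} cores): ~{tasks / cores:,.0f} hours")
```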
97. Break Up to Scale Up
Computing tasks that are easy to break up are easy
to scale up.
To truly grow your computing capabilities, you also
need a system appropriate for your computing task!
98. What computing resources are available?
A single computer?
A local cluster?
Consider: what kind of cluster is it? Clusters tuned for HPC (large MPI) jobs may not be best for HTC workflows!
Do you need even more than that?
Open Science Grid (OSG)
Other
European Grid Infrastructure
Other national and regional grids
Commercial cloud systems (e.g. HTCondor on Amazon)
99. Example Local Cluster
UW-Madison’s Center for High Throughput Computing (CHTC)
Recent CPU hours:
~130 million hrs/year (~15k cores)
~10,000 hrs per user, per day (~400 cores in use)
100. Open Science Grid (OSG)
HTC for Everyone
~100 contributors
Past year:
>420 million jobs
>1.5 billion CPU hours
>200 petabytes transferred
Can submit jobs locally; they backfill across the country and may be interrupted at any time (but not too frequently)
http://www.opensciencegrid.org/