Multicore I/O Processors in Virtualized Data Centers

1. Application of Multicore I/O Processors in Virtualized Data Centers
Nabil Damouny
Rolf Neugebauer
ESC – Multicore Expo
San Jose, CA
April 27, 2010
2. Outline
Networking Market Dynamics
Cloud Computing & the Virtualized Data Center
The Need for an Intelligent I/O Coprocessor
I/O Processing in Virtualized Data Centers
1. SW-based (Bridge & vSwitch)
2. I/O Gateway
3. Virtual Ethernet Port Aggregation (VEPA)
4. Server-based
I/O Coprocessor Requirements
Meeting the I/O Coprocessor Challenge in Virtualized Data Centers
Heterogeneous Multicore Architecture
Netronome's Network Flow Processors and Acceleration Cards
Summary and Conclusion

Data center virtualization is not complete until the I/O subsystem is also virtualized.
3. About Netronome
• Fabless semiconductor company, developing Network Flow Processing solutions for high-performance, programmable, L2-L7 applications
• Network coprocessors for x86 designs
• More complex processing per packet than any other architecture
• Best-in-class performance per watt
• Unmatched integration with x86 CPUs
• Family of products including processors, acceleration cards, development tools, software libraries and professional services
• Founded in 2003
• Solid background in networking, communications, security, voice and video applications, and high-performance computing
• Comprised of networking and silicon veterans, across engineering, sales and marketing
• Global presence: Boston, Massachusetts; Santa Clara, California; Pittsburgh, Pennsylvania; Cambridge, United Kingdom; Shenzhen, China; Penang, Malaysia

Intel Agreements Summary
• IXP28XX Technology License
• SDK Software License
• HDK Hardware License
• QPI Technology License
4. Networking Market Dynamics
"Eventually, every packet from every flow of communications services will be intelligently processed." (Source: Morgan Stanley)
Market drivers:
• Application Awareness: integrated email, web, multimedia; voice, video, data, executables
• Content Security Inspection: VPN, SSL, spam, anti-virus, IDS/IPS, firewall
• Increasing Bandwidth: millions of packets and flows at 10GigE and beyond
• Intelligent Networking Devices: switching, routing, security blades & appliances, data center servers
• Virtualization: WiMax, 3GPP LTE; multicore, multi-OS, multi-app, multi-I/O
Increasing bandwidth, greater security requirements and the need for application- and content-aware networking are driving the evolution to intelligent networking (L2-L7) from today's simpler (L2-L3 only) networks.
5. Unified Computing in Virtualized Data Centers … Requires Intelligent Networking
Unified Computing: the convergence of computing, networking, and storage in a virtualized environment
Applies to the enterprise (private or internal) and to service providers
Environment: uncorrelated high I/O data rates
Networking:
Web servers, especially virtualized servers
Unified Computing: a combination of servers and networking
Requirements for high-performance intelligent networking:
I/O coprocessing for multicore IA/x86 to scale applications
Intelligent flow-based switching for inter-VM communications
Managing complex, high-performance networking interfaces
The advent of many VMs and the need for IOV creates a new set of requirements that mandates a more intelligent approach to managing I/O.
6. Cloud Computing … Definition & Services
Cloud computing defined: IT-related capabilities are provided "as a service" using Internet technologies to multiple external customers.
Public clouds
Private clouds
Types of services available in cloud computing:
Software-as-a-Service: software applications delivered over the Web
Infrastructure-as-a-Service: remotely accessible server and storage capacity
Platform-as-a-Service: a compute-and-software platform that lets developers build and deploy Web applications on a hosted infrastructure
Cloud computing technologies play a crucial role in allowing companies to scale their data center infrastructure to meet performance and TCO requirements.
7. The Need for an I/O Coprocessor … in the Virtualized Data Center
Efficient delivery of data to VMs at high rates (20+ Gb/s) requires an intelligent IOV solution.
Just L2+ processing is not enough:
VLANs, ACLs, etc. only cover the basics
Stateful load balancing requires flow awareness (see the sketch below)
Clouds are hostile environments:
Stateful firewalls, IPS/IDS, and deep packet inspection capabilities are needed
Multicore x86 CPUs:
Show poor packet processing performance
Are unsuitable for handling millions of stateful flows
Have high power consumption
Introduce an intelligent I/O coprocessor to assist multicore x86 CPUs.
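To make the flow-awareness point concrete, here is a minimal, hypothetical Python sketch (illustrative only, not Netronome code): a stateless L2+ rule cannot keep the packets of one connection on one backend, whereas a table keyed by the flow's 5-tuple can.

```python
# Illustrative sketch: stateful load balancing needs per-flow state so
# that every packet of a flow reaches the same backend VM. All names
# (FiveTuple, StatefulLoadBalancer, "vm1"...) are invented for the example.
from dataclasses import dataclass

@dataclass(frozen=True)
class FiveTuple:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str  # "tcp" or "udp"

class StatefulLoadBalancer:
    def __init__(self, backends):
        self.backends = backends   # list of VM identifiers
        self.flow_table = {}       # 5-tuple -> chosen backend
        self.next_backend = 0

    def select(self, key: FiveTuple) -> str:
        """Round-robin on the first packet of a flow, then stay sticky."""
        if key not in self.flow_table:
            self.flow_table[key] = self.backends[self.next_backend]
            self.next_backend = (self.next_backend + 1) % len(self.backends)
        return self.flow_table[key]

lb = StatefulLoadBalancer(["vm1", "vm2", "vm3"])
pkt = FiveTuple("10.0.0.5", "10.0.1.9", 43211, 80, "tcp")
assert lb.select(pkt) == lb.select(pkt)  # same flow, same VM
```

A hardware flow processor keeps millions of such entries in dedicated memory; the point of the sketch is only the lookup-before-decide pattern.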
8. IDC … on I/O Virtualization
"If I/O is not sufficient, then it could limit all the gains brought about by the virtualization process."
The I/O subsystem needs to deliver peak throughput and low latency to the VMs and to the applications they host.
As VM density increases, most customers are scaling I/O capacity by installing more adapters.
IOV is simply the abstraction of the logical details of I/O from the physical, essentially to separate the upper-layer protocols from the physical connection or transport.
9. I/O Coprocessor in a Virtualized Heterogeneous Multicore Architecture
[Block diagram: two multicore x86 CPUs, each hosting VM1…VMn (each VM with its own OS and VNIC), connect through the chipset over PCIe Gen2 (control plane) to an IOV-capable I/O coprocessor (data plane). The coprocessor terminates two 10GE ports and offers a high-speed serial interface; Interlaken is marked as future.]
10. I/O Coprocessor Requirements in a Heterogeneous Multicore Architecture
Addressing the inter-VM switching and I/O challenge (a multicore x86 host coupled via IOV to a flow processor):
• Inter-chip access
• Demultiplexing and classification
• TCP offload
• Host offload for burdensome I/O, security and DPI functions
• Zero-copy, big-block transfers to multiple cores, VMs or endpoints
• Full I/O virtualization with Intel VT-d
• Programmable egress traffic management
Heterogeneous multicore processing solutions deliver >4x the performance of multicore x86 with a standard NIC.
11. Challenges in Virtualized Data Centers
[Figure: a rack of single-core servers and switches (2004) next to many virtual machines and cores in one server (2009).]
What was a rack of servers five years ago is now a single server, including the networking (switch, IPS, FW, …).
Many cores result in tens of VMs and a network I/O challenge.
12. IEEE 802.1 Addressing Ethernet Virtualization in the Data Center
Current IEEE 802.1Q bridges:
Do not allow a packet to be sent back to the same port within the same VLAN (see the sketch below)
Do not have visibility into the identity of virtual VMs within physical stations
Extensions to bridge and end-station behaviors are needed to support virtualization. IEEE 802.1Qbg EVB (Edge Virtual Bridging), VEB/VEPA (Virtual Ethernet Bridge / Virtual Ethernet Port Aggregation) and 802.1Qbh Bridge Port Extension (PE):
Address management issues created by the explosion in VMs in data centers sharing access to the network through an embedded bridge
Discuss methods to offload policy, security, and management processing from virtual switches on NICs and blade servers to physical Ethernet switches
Managing network I/O and inter-VM switching will require various implementation alternatives.
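As a hypothetical illustration of the same-port rule mentioned above (simplified from the IEEE 802.1Qbg discussion, not normative text from the standard), the forwarding decision differs in exactly one case:

```python
# A classic 802.1Q bridge never forwards a frame back out its ingress
# port, so two VMs behind the same uplink cannot be switched externally.
# A VEPA-style "reflective relay" relaxes that one rule on the port
# facing the server. Port numbers here are invented for the example.
def may_forward(egress_port: int, ingress_port: int, reflective_relay: bool) -> bool:
    """Return True if the bridge may emit the frame on egress_port."""
    if egress_port != ingress_port:
        return True               # normal forwarding is unchanged
    return reflective_relay       # hairpin only if the relay is enabled

# Two VMs share uplink port 3 of the adjacent ToR switch:
print(may_forward(3, 3, reflective_relay=False))  # False: standard bridge drops it
print(may_forward(3, 3, reflective_relay=True))   # True: VEPA hairpins inter-VM traffic
```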
13. OpenFlow Switching / vSwitch
OpenFlow switching includes:
Flow tables, used to implement packet processing
The OpenFlow protocol, used to manipulate the flow entries
It enables acceleration of stateful security functions:
An application VM is paired with an associated security VM (e.g. FW, IPS, anti-virus)
Network traffic is classified and transits the security VM prior to being allowed to reach the application VM
Once a new flow has been "blessed", packets pass straight to the application VM (sketched below)
Flow-based policies support white/black lists (not just L2)
Software-based virtual switches will have difficulty coping with:
Large numbers of flows per second;
Many packets per second, i.e. high throughput at small packet sizes;
Assuring low latency.
The Network Flow Processor architecture fits well with OpenFlow.
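The match/action pattern described above can be sketched in a few lines of Python (a hypothetical toy, not the OpenFlow wire protocol; the port names and helper functions are invented for the example):

```python
# Packets from unknown flows are steered through the security VM; once a
# flow is "blessed", an exact-match entry forwards subsequent packets
# straight to the application VM.
SECURITY_VM, APP_VM = "security_vm_port", "app_vm_port"

flow_table = {}  # exact-match table: 5-tuple -> output port

def lookup(five_tuple):
    # Table miss -> default action: send via the security VM for inspection.
    return flow_table.get(five_tuple, SECURITY_VM)

def bless(five_tuple):
    # The controller/security VM installs a fast-path entry after inspection.
    flow_table[five_tuple] = APP_VM

ft = ("10.0.0.5", "10.0.1.9", 43211, 80, "tcp")
assert lookup(ft) == SECURITY_VM  # first packets transit the security VM
bless(ft)
assert lookup(ft) == APP_VM       # blessed flow bypasses inspection
```

In OpenFlow proper, the miss and the install step travel over the OpenFlow protocol between switch and controller; the table itself is what a flow processor can hold in hardware.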
14. 1A. Software-Based Switching (Bridge) in a Virtual Server
Software virtual switch: VMware, Xen & the Linux bridge (which initially had no ACL or VLAN support).
VMware and Xen put switches as software modules in their VMMs, but they lacked key features and were slow!
15. 1B. Enhanced Software-Based Switching (vSwitch) in a Virtual Server
Cisco Nexus 1000V (ACLs, VLANs, IOS) for VMware; Open vSwitch (flow-based) for XenServer.
Example: Cisco Nexus 1000V.
But with the added functionality, performance degrades hugely. What happens if FW and IPS are added?
A good solution for low-performance systems, but with high latency.
16. 2. I/O Gateway
Delivers three key functions:
• In-rack server communications switch
• Replaces the top-of-rack Ethernet switch
• 10/20 Gbps PCIe fabric
• Centralized enclosure for I/O adapters used by servers in the rack
• Shared (network, storage) I/O
• Assigned (specialty accelerators)
• Virtualized I/O configuration
(Source: Aprius, Virtensys. Note: Xsigo and NextIO use similar concepts.)
A new approach using PCIe or InfiniBand interconnects, with security functions within the gateway.
17. 3. Virtual Ethernet Port Aggregation (VEPA)
Offloads policy, security and management processing from virtual switches on NICs and blade servers into physical Ethernet switches (e.g. the ToR switch).
IEEE VEPA is an extension to physical and virtual switching.
VEPA allows VMs to use external switches to access features like ACLs, policies and VLAN assignments.
All inter-VM traffic has to traverse the physical network infrastructure; additional security features, load balancers, etc. are implemented in external appliances.
18. 4. Moving Switching into the Server
The switch moves from the IA/x86 CPU into the Netronome NFP-32xx.
Moving the switching to a Netronome-based coprocessor frees cycles on the IA CPU and increases application performance. Adding IPS or FW is no problem!
Server-based NIC or LoM: uses existing wiring, with security processing in the server.
19. Intelligent I/O Sharing Alternatives: Summary
Addressing inter-VM switching and the network I/O challenge, compared across a software-based switch, an I/O Gateway, VEPA, and a server-based switch:
Performance: software-based switch – poor; I/O Gateway – very good; VEPA – very good, except for inter-VM switching; server-based switch – very good.
Power: software-based switch – poor, wastes IA cycles; I/O Gateway – good; VEPA – good; server-based switch – good.
Management: software-based switch – network or server admin; I/O Gateway – depends who owns the I/O Gateway, if it implements a switch; VEPA – network admin owns the switch; server-based switch – unclear, no standard.
Security: software-based switch – software-based, adds latency; I/O Gateway – centralized, adding security increases cost and latency; VEPA – centralized, adding security increases cost and latency; server-based switch – centralized + distributed.
Flexibility: software-based switch – high; I/O Gateway – depends on architecture; VEPA – medium (standard switch); server-based switch – high.
Reliability: software-based switch – low; I/O Gateway – good; VEPA – good; server-based switch – good, distributed.
Cost: software-based switch – less costly, but wastes IA cycles; I/O Gateway – less than VEPA (card in server, less than CNA & ToR; switch is part of the gateway); VEPA – low, but higher with intelligent ToR switches; server-based switch – less than VEPA (card is the same as a CNA in a ToR, but much simpler and cheaper than VEPA).
20. Performance of SR-IOV NIC, Linux Bridge and a vSwitch
vSwitches require more packet processing and hence drop packets much earlier.
21. Performance of SR-IOV NIC, an Old-Style Bridge and a vSwitch
vSwitches provide more flexibility and functionality, but they drop packets earlier and consume more CPU cycles.
22. Performance & CPU Load of SR-IOV NIC, Linux Bridge and a vSwitch
Combining the flexibility of vSwitches with the performance of SR-IOV NICs requires an intelligent I/O coprocessor.
23. Requirements for an I/O Coprocessor
Intelligent, stateful, flow-based switching
Integrated IOV
Load balancing
Integrated security
Glue-less interface to the CPU subsystem
Netronome's Network Flow Processor is an intelligent I/O coprocessor.
25. Summary and Conclusion
Inter-VM switching and intelligent I/O device sharing are an integral part of data center virtualization
There are many implementation alternatives
A heterogeneous architecture addresses this challenge
The I/O coprocessor complements multicore x86 with packet processing performance, handles millions of stateful flows, and lowers power consumption
Netronome's NFP-32xx processor family integrates inter-VM switching and I/O virtualization capabilities
Netronome's PCIe card family integrates intelligent, programmable, flow-based network card functionality with IOV for the data center
A heterogeneous architecture (Network Flow Processing + multicore x86) addresses the need for inter-VM switching and intelligent I/O sharing.
27. Session Info & Abstract
https://www.cmpevents.com/ESCw10/a.asp?option=C&V=11&SessID=10701
Application of Multicore I/O Processors in Virtualized Data Centers
Speakers: Nabil Damouny (Senior Director, Marketing, Netronome Systems), Rolf Neugebauer (Staff Software Engineer, Netronome Systems)
Date/Time: April 27, 2010, 8:30am – 9:15am
Audience level: Intermediate
Presentation Abstract
This presentation will discuss the applications of integrated multicore processors, optimized for networking I/O, in virtualized data centers. Data centers are increasingly being built with multicore virtualized servers. As the number of cores in the server increases, the number of VMs goes up at an even faster pace. These servers need access to high-performance network I/O, resulting in the requirement to implement I/O sharing in a virtualized, intelligent way. In addition, a mechanism for high-performance inter-VM switching will also be needed. Flow-based solutions, such as flow classification, routing and load balancing supporting in excess of 8M flows, are effective ways to address the above challenges.
Track: Multicore Expo – Networking & Telecom
28. NFP-32xx Integrates Flow-Based L2 Functions for Inter-VM Switching
• Flow classification
• Switching between physical networking ports
• Switching between virtual NICs, without host intervention (sketched below)
• Switching between any physical port and any virtual port
• Stateful flow-based switching
[Diagram: a host CPU runs VM1…VMn on cores C1…Cn, each VM with its own VNIC; an NFE with Rx/Tx paths sits between the host interconnection link and the Ethernet switch. The NFP-32xx supports more than 8 million flows.]
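The switching behaviors listed above amount to a single learning switch whose ports are both physical interfaces and per-VM virtual NICs, so VM-to-VM frames never touch the host CPU. A minimal, hypothetical Python sketch follows (port names and MAC values are invented; real NFP code is firmware, not Python):

```python
# Learning switch over mixed physical ports ("10ge0") and vNICs ("vnic1").
class FlowSwitch:
    def __init__(self, ports):
        self.ports = set(ports)   # physical ports and virtual NICs alike
        self.mac_table = {}       # learned: MAC address -> port

    def handle(self, src_mac, dst_mac, ingress_port):
        """Return the set of ports the frame should be emitted on."""
        self.mac_table[src_mac] = ingress_port            # learn source
        out = self.mac_table.get(dst_mac)
        if out is None:
            return self.ports - {ingress_port}            # flood on miss
        return {out} if out != ingress_port else set()    # known unicast

sw = FlowSwitch(["10ge0", "vnic1", "vnic2"])
sw.handle("aa:00:00:00:00:01", "ff:ff:ff:ff:ff:ff", "vnic1")  # VM1 floods an ARP
print(sw.handle("aa:00:00:00:00:02", "aa:00:00:00:00:01", "vnic2"))
# -> {'vnic1'}: VM2-to-VM1 frames are switched directly, without the host
```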
29. I/O Virtualization (IOV) Requirements
Support multiple virtual functions (VFs) over PCIe
Lower cost, lower power
Dynamically assign VFs to different VMs
Support multiple NIC functions: crypto, PCAP, etc.
Capability to pin an I/O device to a specific CPU core/VM
Enables consolidation and isolation
Flow-based load balancing to x86 multicore CPUs (see the sketch below)
Higher performance at lower power
Intelligent I/O virtualization is required in multicore CPU designs; PCI-SIG introduced the SR-IOV standard for this purpose.
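One way to picture the flow-based load balancing requirement (an illustrative sketch, not SR-IOV driver code; the VF count and hash choice are assumptions, with CRC32 standing in for the Toeplitz hash NICs typically use for RSS):

```python
import zlib

NUM_VFS = 8  # assumed number of virtual functions exposed over PCIe

def vf_for_flow(src_ip, dst_ip, src_port, dst_port, proto):
    """Deterministically map a flow's 5-tuple to one VF/queue."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return zlib.crc32(key) % NUM_VFS

# Every packet of a flow lands on the same VF, and therefore on the same
# pinned CPU core/VM, preserving cache locality and packet ordering:
assert vf_for_flow("10.0.0.5", "10.0.1.9", 43211, 80, "tcp") == \
       vf_for_flow("10.0.0.5", "10.0.1.9", 43211, 80, "tcp")
```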
30. The Need for Intelligent I/O Virtualization
• Use commodity multicore hardware
• Virtualization for:
• Consolidation: move "legacy" applications & OSs to multicore
• Isolation
• I/O devices need to be shared
• Load balance/direct traffic to VMs
• Pin VMs to cores
• Direct traffic to cores/VMs
• Isolate device access from VMs
A good IOV solution provides all of the above!