Xen Summit 2010: Extending Xen into Embedded
1. Xen Summit 2010
Extending Xen into Embedded
and Communications Workloads
2. Agenda
• Embedded Usage Models
• Virtual Machine Monitor Requirements
• Benchmarking
• Cisco Product Range
• Embedded Development Requirements
• High Availability
3. Embedded Usage Models
Robotics: Using Core Micro Architecture for GUI interface with real time industrial control.
IP Media Phones: Atom based platforms delivering Internet connectivity and media content to continuously connected devices.
Routing: Xeon Micro Architecture based platforms implement control and data-plane services on high end routers.
Unique VMM requirements across all segments
4. Virtual Machine Monitor Implementation
Industrial: Industrial Control requires determinism. Performance is measured in interrupt latency (10 usec or lower).
Media Phone: Critical partition required to host Cell phone application, hypervisor requires Quality of Service.
Comm’s Appliance: Scalability, Flexibility, RAS and Fail Over are a few of the vmm requirements in Comm’s appliance environment.
[Diagram: guest stacks for the Industrial, Media Phone and Comm’s Appliance platforms. Labels: RTOS, Linux (Service), Microsoft GUI partition, Critical App Partition, Shared Memory, vmm, Thin vmm.]
5. Embedded Virtualization - Advantages
Consolidation and Preservation: Legacy - Proprietary Single Threaded Operating Systems
Rapid Deployment of new services
Integrate Development Environment separate from Critical Services
[Diagram: Dataplane and Control partitions (Legacy RTOS, Legacy RTOS, Linux) on a vmm with VT-d / SRIOV; Core 0 and Core 1 of a Multi-Core Architecture; rx/tx queue pairs on a 10 Gb/s PF.]
6. Embedded Deployment Requirements
Single Core scheduling
Scheduling control for Guest Quality of Service
Traffic prioritization to avoid packet loss requires (soft) Real Time scheduling
Credit based scheduler research in progress
[Diagram labels: Phone App, Dom0 (Application Development), Xen, Atom, I/O.]
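As a rough illustration of how a weight-based credit scheme can favour a soft real-time guest such as the phone application, the sketch below hands out credits in proportion to per-domain weights and runs the domain with the most credits left. The structure and function names (`dom_sched`, `refill_credits`, `pick_next`) are hypothetical and this is not Xen's credit scheduler code, only a simplified model of the idea.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-domain scheduling state (not Xen's real structures). */
struct dom_sched {
    uint32_t weight;   /* relative share of CPU time */
    int32_t  credits;  /* credits left in the current accounting period */
};

/* Hand out period_credits in proportion to each domain's weight. */
static void refill_credits(struct dom_sched *doms, size_t n,
                           uint32_t period_credits)
{
    uint64_t total_weight = 0;

    for (size_t i = 0; i < n; i++)
        total_weight += doms[i].weight;
    if (total_weight == 0)
        return;
    for (size_t i = 0; i < n; i++)
        doms[i].credits += (int32_t)((uint64_t)period_credits *
                                     doms[i].weight / total_weight);
}

/* Pick the domain with the most credits left; -1 if none has credits. */
static int pick_next(const struct dom_sched *doms, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++)
        if (doms[i].credits > 0 &&
            (best < 0 || doms[i].credits > doms[best].credits))
            best = (int)i;
    return best;
}
```

With weights of 768 for the phone guest and 256 for Dom0, a 300-credit period gives them 225 and 75 credits, so the phone guest is picked first until it has burned most of its share.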
Consolidated Grant Tables
Consolidate Fast Path with Security Intrusion Detection application
Requires efficient mechanism to share packet data with Linux application
Grant tables (io rings) may be an efficient mechanism to meet performance requirements (needs to be Lock Free)
[Diagram labels: Fast Path (ip packet Forwarding), Linux Intrusion Detection, io rings, Dom0, Xen, Xeon, I/O.]
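The lock-free requirement for sharing packet data can be pictured with a single-producer, single-consumer ring. The sketch below uses C11 atomics. The names (`pkt_ring`, `ring_push`, `ring_pop`) and the use of a 32-bit packet handle are assumptions for illustration; this is not the Xen io ring implementation, and it assumes exactly one writer (the fast path) and one reader (the Linux application).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u   /* must be a power of two */

/* Hypothetical ring that would live in a page shared between two domains. */
struct pkt_ring {
    _Atomic uint32_t prod;            /* written only by the producer */
    _Atomic uint32_t cons;            /* written only by the consumer */
    uint32_t         slots[RING_SIZE];
};

static void ring_init(struct pkt_ring *r)
{
    atomic_init(&r->prod, 0);
    atomic_init(&r->cons, 0);
}

/* Producer side: returns false when the ring is full. */
static bool ring_push(struct pkt_ring *r, uint32_t pkt_handle)
{
    uint32_t prod = atomic_load_explicit(&r->prod, memory_order_relaxed);
    uint32_t cons = atomic_load_explicit(&r->cons, memory_order_acquire);

    if (prod - cons == RING_SIZE)
        return false;
    r->slots[prod & (RING_SIZE - 1)] = pkt_handle;
    /* Release: the slot contents become visible before the new index. */
    atomic_store_explicit(&r->prod, prod + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_pop(struct pkt_ring *r, uint32_t *pkt_handle)
{
    uint32_t cons = atomic_load_explicit(&r->cons, memory_order_relaxed);
    uint32_t prod = atomic_load_explicit(&r->prod, memory_order_acquire);

    if (prod == cons)
        return false;
    *pkt_handle = r->slots[cons & (RING_SIZE - 1)];
    atomic_store_explicit(&r->cons, cons + 1, memory_order_release);
    return true;
}
```

Neither side takes a lock or waits on the other: each index has a single writer, and a full or empty ring is reported to the caller, which can drop or retry according to its own policy.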
7. Embedded Xen Deployment
Power Profile of some edge based appliances is cyclical, potential power savings can be substantial (Example Base Station Controller)
ACPI generally not supported in Real Time / Proprietary Operating Systems
Hypervisor Power Management could be very useful to control overall power budget
“Shelf Manager” Power management research in progress
Intelligent Power Management, balances I/O latency & throughput
[Charts: Data and Voice load from 6am to 6pm, vertical axis 0 to 120.]
[Diagram: three Multi Core systems, each with Dom0, Shf mgr and Fast Path guests on Xen.]
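To make the potential saving from a cyclical load concrete, the sketch below models a very simple hypervisor-side policy: keep only as many cores active as the current load needs and count the rest as parked. The policy, the function names and any load figures passed in are hypothetical; the slide does not give the underlying chart data.

```c
#include <stddef.h>

/* Cores required for a load given as a percentage of full capacity.
 * Always keeps at least one core active. */
static unsigned cores_needed(unsigned load_pct, unsigned total_cores)
{
    unsigned n;

    if (load_pct > 100)
        load_pct = 100;
    n = (load_pct * total_cores + 99) / 100;   /* round up */
    return n == 0 ? 1 : n;
}

/* Fraction of core-time that could be parked over a sampled load profile. */
static double parked_fraction(const unsigned *load_pct, size_t samples,
                              unsigned total_cores)
{
    unsigned long parked = 0;

    if (samples == 0 || total_cores == 0)
        return 0.0;
    for (size_t i = 0; i < samples; i++)
        parked += total_cores - cores_needed(load_pct[i], total_cores);
    return (double)parked / ((double)samples * total_cores);
}
```

A real policy would also have to respect the latency constraints mentioned above, for example by keeping the fast path cores out of deep sleep states.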
8. Embedded Xen – Direct Cache Access
DCA - Direct Cache Access delivers data into cache to reduce average memory latency, and attempts to reduce memory bandwidth
DCA Driver uses get_cpu() to gather APIC_ID, uses this to configure the DCA enabled NIC device

    static void igb_update_dca(struct igb_q_vector *q_vector)
    {
        struct igb_adapter *adapter = q_vector->adapter;
        struct e1000_hw *hw = &adapter->hw;
        int cpu = get_cpu(); /* Get the current CPU Id */

        if (q_vector->cpu == cpu)
            goto out_no_update;

get_cpu() is required to return the valid APIC ID of the CPU core where the guest is executing.
[Diagram: I/O device, IOH with DCA, memory ctrl, and per-CPU caches; Dom0 and two Guests on Xen.]
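One way to read this requirement: inside a guest, the CPU number the driver sees is a virtual CPU, so the DCA tag has to be derived from the physical core currently backing it. The user-space model below illustrates that reading. The `vcpu_map` table, the `q_vector_model` structure and `update_dca_model()` are all hypothetical and stand in for whatever mechanism the hypervisor would expose; none of this is igb or Xen code.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_VCPUS 4

/* Hypothetical vcpu -> physical APIC ID table exposed by the hypervisor. */
struct vcpu_map {
    uint32_t apic_id[MAX_VCPUS];
};

/* Hypothetical stand-in for a NIC queue vector. */
struct q_vector_model {
    int      cpu;      /* vcpu last used to program the tag, -1 if never */
    uint32_t dca_tag;  /* value that would be written to the NIC */
};

/* Reprogram the tag when either the vcpu or its backing core changed.
 * Returns true if the tag was (re)programmed. */
static bool update_dca_model(struct q_vector_model *q, int vcpu,
                             const struct vcpu_map *map)
{
    uint32_t tag;

    if (vcpu < 0 || vcpu >= MAX_VCPUS)
        return false;
    tag = map->apic_id[vcpu];
    if (q->cpu == vcpu && q->dca_tag == tag)
        return false;              /* nothing changed, skip the update */
    q->cpu = vcpu;
    q->dca_tag = tag;
    return true;
}
```

Comparing only the vcpu number, as the native driver does with the CPU id, would miss the case where the same vcpu has been moved to a different physical core.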
9. Benchmarking, 10 GbE perspective
A 64B packet can arrive every 67.2ns
In terms of processor cycles: 67.2 ns is ~170 cycles at 2.53 GHz (~201 cycles at 3.0 GHz)
Can generate up to 14.88 million Rx and 14.88 million Tx transactions (packets) every second
Each packet has a 16B descriptor associated with it, that must be written for every packet that needs to be processed
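The line-rate figures above can be reproduced with a few lines of arithmetic. The sketch assumes the standard 20 bytes of per-frame Ethernet overhead on the wire (8-byte preamble/SFD plus 12-byte inter-frame gap), which the slide does not state but which matches the 67.2 ns figure. The function names are illustrative.

```c
/* Per-frame wire overhead: 8B preamble/SFD + 12B inter-frame gap (assumed). */
#define ETH_OVERHEAD_BYTES 20u

/* Time one frame occupies the wire, in nanoseconds.
 * Bits divided by Gb/s gives nanoseconds directly. */
static double frame_time_ns(unsigned frame_bytes, double link_gbps)
{
    return (frame_bytes + ETH_OVERHEAD_BYTES) * 8.0 / link_gbps;
}

/* Maximum packets per second in one direction at line rate. */
static double packets_per_second(unsigned frame_bytes, double link_gbps)
{
    return 1e9 / frame_time_ns(frame_bytes, link_gbps);
}

/* CPU cycles available between back-to-back frames. */
static double cycles_per_frame(unsigned frame_bytes, double link_gbps,
                               double cpu_ghz)
{
    return frame_time_ns(frame_bytes, link_gbps) * cpu_ghz;
}
```

For 64-byte frames on 10 GbE this gives 67.2 ns per frame and about 14.88 million packets per second; the cycle budget works out to about 170 cycles at 2.53 GHz and about 201 cycles at 3.0 GHz.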
The Linux forwarding code takes ~3000 cycles to process a packet.
With enhancement we can reduce the number of cycles per (64 Byte) packet to ~1350 cycles.
[Chart: packet rate (Mpp/s, 0 to 16,000,000) versus Packet Size (64 to 1468).]
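The per-packet cycle counts translate directly into a per-core forwarding rate. The sketch below works that out and estimates how many cores would be needed to keep up with a target packet rate. It assumes throughput scales linearly with core count, which is an idealisation, and the function names are illustrative.

```c
/* Packets per second one core can process at a given per-packet cost. */
static double core_pps(double cpu_ghz, double cycles_per_pkt)
{
    return cpu_ghz * 1e9 / cycles_per_pkt;
}

/* Cores needed to sustain target_pps, assuming perfect linear scaling. */
static unsigned cores_for_rate(double target_pps, double cpu_ghz,
                               double cycles_per_pkt)
{
    double per_core = core_pps(cpu_ghz, cycles_per_pkt);
    unsigned n = (unsigned)(target_pps / per_core);

    if ((double)n * per_core < target_pps)
        n++;                       /* round up to a whole core */
    return n;
}
```

At 2.53 GHz, ~3000 cycles per packet is roughly 0.84 million packets per second per core and ~1350 cycles is roughly 1.87 million. Under the linear-scaling assumption, sustaining 14.88 million 64-byte packets per second would take 18 cores at the first figure and 8 at the second.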
10. Guest Forwarding Performance
[Chart: Packets per Second (PPS) versus Packet Size (bytes), 64 to 1518, comparing Native Layer 3 Forwarding with Virtualized 2-Port (1 Core, 1 Thread).]
[Diagram: two Linux forwarding guests on a vmm with VT-d; Core 0 and Core 1 of a Multi-Core Architecture; I/O.]
Single threaded virtualized environments show promising performance:
- Near native performance for small packet sizes
- Native performance for large packet sizes ( >256B ).
Limited performance penalty for consolidation, additional scaling tests in progress
11. Cisco Embedded Product Space
Wide range of products in a number of market segments.
Segments: Service Provider, Data Center, Voice & Video, Enterprise, Security, Branch, Home
Products shown: ASR 9000, CRS, UCS, Nexus 7000, TelePresence, Unified Communications, MDS 9222i (SAN), ASR 1000, Ironport, ASA 5500, 3900 ISR, 2800 ISR, Flip Video, Valet
12. Embedded Product Environment
Hardware Environment
General Purpose CPUs, SoCs, ASICs, FPGAs, custom processors, ixp, DSPs, …
From large multi-core, multi-blade, multi-chassis systems to small single/dual core devices
Terabit to Gigabit I/O
Software Environment
Multi-OS: IOS, IOS-XE, IOS-XR, NX-OS
Proprietary (legacy), Linux, other …
Single threaded, multi-threaded, pipelined, flow-based, …
Multiple vm models
integrated services platform, distributed/load balancing, HA, control & data separation, …
Control plane, data plane, management plane, appliance and service engines, …
e.g., routing, data, voice, video, deep packet inspection, firewall, security, etc.
Memory, processor, and I/O bandwidth requirements vary by application and network device location
13. Embedded Development Requirements
We believe that xen is the right choice for an embedded hypervisor
Early support for prototype hardware required: In hypervisor and dom0
Open source xen and linux critical to this effort
It’s the right architecture and feature set for embedded development
RAS
High Availability (HA) for guests
non-disruptive stateful failover, non-disruptive in service software upgrade (ISSU)
Devices
hot pluggable/removable (non-disruptive): shared & dedicated (including sr-iov)
dom0
Separate device driver domains good, but not enough
All domains need to be restartable
Deterministic Performance
QoS control through configuration and scheduling
I/O linearly scalable across cores and vms
Low latency interrupts
14. Embedded Development Requirements
Core allocation/Scheduling: vcpu to pcpu mapping
(pinned, non-shared): deterministic performance
(pinned, shared), (non-pinned, shared): scheduled
For pv IOS, I/O workload, 64-byte packets, 2 ports, bidirectional, 64-bit xen, NUMA on:
(pinned, non-shared), HT off: 100% line rate (1Gb) per core; <0.1% time spent in hypervisor
(non-pinned, shared), HT off: ~10% decreased throughput
(pinned, non-shared), NUMA-remote, HT off: ~8% decreased throughput
(pinned, non-shared), HT on, one on each thread on the core: 1.5x/1.7x (I/O/cpu) increase in throughput (aggregate); .75x/.85x (I/O/cpu) throughput per transaction single thread
(pinned, non-shared), HT on, only one thread on the core in use: Same as (pinned, non-shared), HT off
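The two HT-on figures are consistent with each other: dividing the aggregate gain by the two hardware threads sharing the core gives the per-thread factor. A minimal sketch of that arithmetic follows; the structure and function names are illustrative.

```c
/* Throughput factors relative to the (pinned, non-shared), HT off baseline. */
struct ht_scaling {
    double io;   /* I/O workload factor */
    double cpu;  /* cpu workload factor */
};

/* Split an aggregate per-core factor evenly across its hardware threads. */
static struct ht_scaling per_thread(struct ht_scaling aggregate,
                                    unsigned threads)
{
    struct ht_scaling r = { 0.0, 0.0 };

    if (threads > 0) {
        r.io  = aggregate.io / threads;
        r.cpu = aggregate.cpu / threads;
    }
    return r;
}
```

With an aggregate of 1.5x/1.7x and two threads per core this yields 0.75x/0.85x per thread, so each guest runs slower than it would alone on the core even though the core as a whole does more work.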
Guest Support
Both pv and hvm (hybrid!)
32-bit & 64-bit
Virtual memory paged and non-paged (single, flat address space)
15. Embedded Development Requirements
Debug and Performance Monitoring
multi-guest, simultaneous
32-bit & 64-bit guests (minimum is gdbsx for both pv & hvm)
Performance monitoring tools (access to PMU data - xenoprof & others)
Required in the field as well as during development
Trusted Systems: Secure Products
Trusted boot, TPM, Intel TXT/AMD-V
Trusted guests, sandboxed 3rd party guests, anti-counterfeiting, …
Manageable
Power Management
Especially at the edge, branch, and consumer devices
Policy based, managed by hypervisor
Cases where guest should not be automatically power managed
“carrier class” xen Development Environment
Support for rapid prototyping
Support for production product environment
16. HA Requirements
Rationale
HA & ISSU features available on many platforms across our product space today
Cannot go to market without support in certain product spaces
Software fails much more often than hardware
Software-only HA/ISSU at much lower cost very attractive
Natural fit on multi-core devices
High Availability (HA)
Active-Standby: stateful, “hot” Standby
Failure of Active causes non-disruptive failover to Standby
Reconciliation required on switchover
Standby progresses through state machine to Active state
I/O devices always belong to Active and switch to [new] Active without loss of state
Packet loss ok on switchover – higher level protocols recover
Downstream end of device connection must not see a “failure”
Switchover must take place in < 1 sec.
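The Active-Standby behaviour described above can be sketched as a small state machine in which the Standby passes through a reconciliation step before becoming Active. The state and event names below are hypothetical, chosen only to mirror the bullet points; the slides do not define the actual states.

```c
/* Hypothetical HA roles and events, mirroring the description above. */
enum ha_state { HA_STANDBY, HA_RECONCILING, HA_ACTIVE, HA_FAILED };
enum ha_event { EV_PEER_FAILED, EV_RECONCILED, EV_LOCAL_FAULT };

/* Return the next state; events that do not apply leave the state as is. */
static enum ha_state ha_next(enum ha_state s, enum ha_event e)
{
    switch (s) {
    case HA_STANDBY:
        if (e == EV_PEER_FAILED)
            return HA_RECONCILING;   /* take over from the failed Active */
        if (e == EV_LOCAL_FAULT)
            return HA_FAILED;
        break;
    case HA_RECONCILING:
        if (e == EV_RECONCILED)
            return HA_ACTIVE;        /* state reconciled, start serving */
        if (e == EV_LOCAL_FAULT)
            return HA_FAILED;
        break;
    case HA_ACTIVE:
        if (e == EV_LOCAL_FAULT)
            return HA_FAILED;
        break;
    case HA_FAILED:
        break;
    }
    return s;
}
```

The under-one-second switchover budget would have to cover failure detection plus the whole STANDBY to ACTIVE path, which is why reconciliation work done while still in Standby matters.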
In Service Software Upgrade (ISSU)
Built on HA infrastructure
Automated software upgrade (or downgrade)
Non disruptive: Fallback if required or requested
17. HA Requirements
What is needed:
Reliable fast failure detection mechanism
Current: hardware uses interrupt pin; backup is heart-beat mechanism (slow)
Need to emulate/implement fast, reliable failure detection mechanism in xen
Failover device transparently from Active to Standby
no loss of [device] state
Packet traffic dropped until Standby transitions to Active
Interrupts
redirected to new Active (old Standby) on failover
interrupts dropped until Standby transitions to Active
[new] Active must be able to address outstanding interrupts without complete reset
Need to be able to run in redundant hardware configuration or on multi-core device
drivers responsible for appropriate reconciliation protocols
Minimize the changes to xen kernel and dom0 code
recovery decisions need to be in the domain of the guest driver
Support for direct assign devices (including sr-iov) and shared devices
Non shared memory solution for DMA target memory preferred
requires ability to either pre-program and switch or reprogram and switch on failover
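The two detection paths mentioned above, a fast explicit fault signal and a slower heartbeat backup, can be combined in one detector. The sketch below is a minimal model of that idea. The structure, the function names and the millisecond time base are assumptions, and it does not represent an existing xen interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical failure detector kept by the Standby for its Active peer. */
struct failure_detector {
    uint64_t last_beat_ms;    /* time of the last heartbeat seen */
    uint64_t timeout_ms;      /* heartbeat silence treated as failure */
    bool     fault_signalled; /* fast path, e.g. an emulated interrupt pin */
};

/* Record a heartbeat from the peer. */
static void fd_heartbeat(struct failure_detector *fd, uint64_t now_ms)
{
    fd->last_beat_ms = now_ms;
}

/* Fast path: the peer (or the hypervisor) reports the failure directly. */
static void fd_signal_fault(struct failure_detector *fd)
{
    fd->fault_signalled = true;
}

/* True if the peer should be considered failed at time now_ms. */
static bool fd_peer_failed(const struct failure_detector *fd, uint64_t now_ms)
{
    uint64_t silent_ms;

    if (fd->fault_signalled)
        return true;
    silent_ms = now_ms > fd->last_beat_ms ? now_ms - fd->last_beat_ms : 0;
    return silent_ms > fd->timeout_ms;
}
```

The heartbeat timeout bounds the worst-case detection time, so it has to be well under the switchover budget; the explicit signal removes that wait in the cases where the failure can be observed directly.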
18. “carrier class” xen Development Environment
Needs to support 2 different Environments:
Rapid prototyping and development of new services
Work often requires unstable branch, pre-release/prototype hardware
Straightforward, and accessible to the non-xen expert
Interest is in getting the prototype/product up and running quickly rather than in xen infrastructure
Developer threads, blogs, etc. not a substitute for up-to-date documentation
Product decisions (go/no go) based on prototype results
Failure/missed deadlines will eliminate a prototype as a possible solution
Corporate networks/labs behind firewalls, use proxies
Doesn’t work well with current git-based source control
Requires exceptions to corporate IT policy
Production product
Uses stable release
Controlled access to performance & debug tools in customer environment
Documentation required in field as well
Auditing requires ability to reproduce image bit-for-bit from local build
19. Summary
• Embedded market provides a great growth opportunity
• Deployment requires some unique features
• Xen is well positioned but requires support for RAS features, debug and “Carrier Class” Release