This document discusses vNUMA support in the Xen hypervisor. It motivates vNUMA by explaining how cross-NUMA node memory access is expensive and how vNUMA can avoid this. It describes the history and current status of vNUMA in Xen, the design which involves providing guest VMs with virtual NUMA topology information, and challenges with features like ballooning. Preliminary benchmarks show performance improvements with vNUMA enabled in PV guests. Future work includes improving vNUMA support and addressing compatibility issues.
3. Agenda Motives History and status Design Problems Benchmark Future Work
Agenda
- Motives
- History and status
- Design
- Problems
- Preliminary benchmark results
- Future work
Chicago { August 18, 2014 vNUMA in Xen 2 / 19
Motives
- Cross NUMA node memory access is expensive
- Need to avoid cross node memory access
- Xen is NUMA aware
  - NUMA aware scheduling
  - NUMA aware guest memory placement
- Operating systems such as Linux are NUMA aware
  - NUMA aware scheduling
  - NUMA aware memory allocation / migration
- The missing bits
  - Memory layout information
  - CPU topology
History and status
- PV vNUMA presented at Xen Summit 2010 by Dulloor Rao
  http://slidesha.re/1AXsFbu
- HVM vNUMA patches posted by Andre Przywara circa 2010
- Elena Ufimtseva has been working on upstreamable PV vNUMA since 2013
Design: PV and PVH
- Toolstack puts enlightenment information in hypervisor
- Guest memory allocation in accordance with enlightenment information
- Guest retrieves enlightenment information via hypercall during boot up
Enlightenment information structure
struct vnuma_info
{
    nr_vnodes;
    vdistance[nr_vnodes * nr_vnodes];
    vcpu_to_vnode[nr_vnodes];
    vnode_to_pnode[nr_vnodes];
    vmemrange[nr_vnodes];
}
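The layout above leaves types implicit. As a rough sketch of how a guest might consume such a structure, the C fragment below assigns hypothetical types, adds an assumed `nr_vcpus` count so that `vcpu_to_vnode` can be indexed by VCPU, and shows two lookups: the distance between the nodes of two VCPUs, and the physical node backing a guest address. The type names, field widths, fixed array bounds, and both helper functions are illustrative assumptions, not the Xen interface.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_VNODES 8   /* assumed upper bound, for this sketch only */
#define MAX_VCPUS  64  /* assumed upper bound, for this sketch only */

/* One contiguous range of guest physical memory, [start, end). */
struct vmemrange {
    uint64_t start;
    uint64_t end;
};

/* Typed variant of the slide's structure; all widths are assumptions. */
struct vnuma_info {
    unsigned int nr_vnodes;
    unsigned int nr_vcpus;                            /* assumed extra field */
    unsigned int vdistance[MAX_VNODES * MAX_VNODES];  /* row-major matrix */
    unsigned int vcpu_to_vnode[MAX_VCPUS];
    unsigned int vnode_to_pnode[MAX_VNODES];
    struct vmemrange vmemrange[MAX_VNODES];           /* one range per vnode */
};

/* Distance between the virtual nodes that VCPUs a and b belong to. */
static unsigned int vcpu_distance(const struct vnuma_info *v,
                                  unsigned int a, unsigned int b)
{
    assert(a < v->nr_vcpus && b < v->nr_vcpus);
    unsigned int na = v->vcpu_to_vnode[a];
    unsigned int nb = v->vcpu_to_vnode[b];
    return v->vdistance[na * v->nr_vnodes + nb];
}

/* Physical node backing a guest physical address, or -1 if unmapped. */
static int addr_to_pnode(const struct vnuma_info *v, uint64_t addr)
{
    for (unsigned int n = 0; n < v->nr_vnodes; n++) {
        if (addr >= v->vmemrange[n].start && addr < v->vmemrange[n].end)
            return (int)v->vnode_to_pnode[n];
    }
    return -1;
}
```

A NUMA aware guest kernel would use lookups of this kind to keep a task's memory on the node where its VCPU runs, which is the information the "missing bits" slide refers to.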
Design: HVM
- Toolstack puts enlightenment information in hypervisor
- Toolstack arranges ACPI tables
- Guest memory allocation in accordance with enlightenment information
- Guest retrieves layout information via ACPI tables during boot up
Problems: vNUMA and other features
            PV    PVH   HVM
Ballooning  Y     Y*    N
PoD         N/A   ?     N
Problems: CPU topology
Benchmark
- Host
  - 2 sockets, 12 PCPUs, HT disabled
  - 36GB RAM, 2 NUMA nodes
  - NUMA balancing enabled
- Guest
  - 12 VCPUs
  - 16GB RAM, 2 virtual NUMA nodes
  - vnodes mapped to different pnodes, vcpu pinned to pnode
  - NUMA balancing enabled
- Benchmarks to run
  - Autonuma
  - SPECJBB
  - STREAM
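To make the guest configuration concrete, here is a hedged sketch that fills in a two-node layout matching the figures above: 12 VCPUs and 16GB of RAM across 2 virtual NUMA nodes. The even split of VCPUs and memory, the identity vnode-to-pnode mapping, the contiguous address ranges, and every name in the code are assumptions; the slides do not spell these details out.

```c
#include <stdint.h>

#define NR_VNODES 2
#define NR_VCPUS  12
#define GUEST_MEM_BYTES (16ULL << 30)  /* 16GB guest, from the slide */

/* Hypothetical container for the benchmark guest's virtual topology. */
struct guest_layout {
    unsigned int vcpu_to_vnode[NR_VCPUS];
    unsigned int vnode_to_pnode[NR_VNODES];
    uint64_t vnode_start[NR_VNODES];
    uint64_t vnode_end[NR_VNODES];     /* exclusive */
};

/* Fill the layout with an even split (an assumption of this sketch). */
static void build_layout(struct guest_layout *g)
{
    const uint64_t per_node = GUEST_MEM_BYTES / NR_VNODES;
    const unsigned int vcpus_per_node = NR_VCPUS / NR_VNODES;

    for (unsigned int n = 0; n < NR_VNODES; n++) {
        g->vnode_to_pnode[n] = n;      /* each vnode on a different pnode */
        g->vnode_start[n] = n * per_node;
        g->vnode_end[n] = (n + 1) * per_node;
    }
    /* VCPUs 0-5 land on vnode 0, VCPUs 6-11 on vnode 1. */
    for (unsigned int c = 0; c < NR_VCPUS; c++)
        g->vcpu_to_vnode[c] = c / vcpus_per_node;
}
```

Under these assumptions each virtual node gets 6 VCPUs and 8GB, and pinning each group of VCPUs to the physical node that holds its memory keeps guest accesses local.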
Benchmark: Autonuma
Benchmark: SPECJBB
Benchmark: STREAM
Benchmark: Conclusion
- vNUMA improves performance for PV guests
- vNUMA gives an anomalous SPECJBB result for PVH guests, but the other two benchmarks show good results
Future Work
- Basic vNUMA support for all guest types
- Dom0 vNUMA
- Address vNUMA compatibility issues with PoD and ballooning for HVM guests
- Address the performance issue for PVH