SlideShare une entreprise Scribd logo
1  sur  27
Télécharger pour lire hors ligne
VEE’16
04/02/2016
Shoot4U: Using VMM Assists to Optimize
TLB Operations on Preempted vCPUs
Jiannan Ouyang, John Lange
University of Pittsburgh
Haoqiang Zheng
VMware Inc.
CPU Consolidation in the Cloud
2
CPU Consolidation: multiple virtual CPUs (vCPUs) share the
same physical CPU (pCPU).
Motivation: Improve datacenter utilization.
Figure 1. Average activity distribution of a typical shared Google clusters
including Online Services, each containing over 20,000 servers, over a
period of 3 months [Barroso 13].
Problems with Preempted vCPUs
3
PP P P
A
B
A
B
B
A
A
B
A vCPU of VM-A B vCPU ofVM-B
Preempted
Running
pCPUP
Performance problems:
Busy-waiting based kernel synchronization operations
—  Lock Holder Preemption problem
—  Lock Waiter Preemption problem
—  TLB Shootdown Preemption problem
Lock Holder Preemption
4
Lock holder preemption [Uhlig 04, Friebel 08]
—  A preempted vCPU is holding a spinlock
—  Causes dramatically longer lock waiting time
—  context switch latency + CPU shares allocated to other vCPUs
SchedulingTechniques
—  co-scheduling, relaxed co-scheduling [VMware 10]
—  Adaptive co-scheduling [Weng HPDC11]
—  Balanced scheduling [Sukwong EuroSys11]
—  Demand-based coordinated scheduling [Kim ASPLOS13]
HardwareAssistedTechniques
—  Intel Pause-Loop Exiting (PLE) [Riel 11]
Lock Waiter Preemption [Ouyang VEE13]
5
Linux uses a FIFO order fair spinlock, named ticket spinlock
i i+1 i+2 i+3
Lock waiter preemption
—  A lock waiter is preempted, and blocks the queue
—  P(waiter preemption) > P(holder preemption)
0 T 2TTimeout: 3T
PreemptableTicket Spinlock
—  Key idea: proportional timeout
TLB Shootdown Preemption
6
KVM Paravirt Remote FlushTLB [kvmtlb 12]
—  VMM maintains vCPU preemption states and shares with the
guest.
—  Use conventional approach if the remote vCPU is running.
—  DeferTLB flush if the remote vCPU is preempted.
—  Cons: preemption state may change after checking.
TLB shootdown IPIs as scheduling heuristics [Kim ASPLOS13]
Shoot4U
—  Goal: eliminate the problem
—  Key idea: invalidate guestTLB entries from theVMM
Contributions
7
—  An analysis of the impact that various low level
synchronization operations have on system benchmark
performance.
—  Shoot4U:A novel virtualizedTLB architecture that ensures
consistently low latencies for synchronizedTLB operations.
—  An evaluation of the performance benefits achieved by
Shoot4U over current state-of-art software and hardware
assisted approaches.
Performance Analysis
8
Overhead of CPU Consolidation
9
0
2
4
6
8
10
12
14
16
18
20
blackscholes
bodytrack
canneal
dedup
ferret
freqmine
raytrace
streamcluster
swaptions
vips
x264
Slowdown
70.6
max ideal slowdown
PARSEC Runtime with co-locatedVM over running alone
(12-coreVMs, measured on Linux/KVM, with PLE disabled)
CPU Usage Profiling (perf)
10
0
10
20
30
40
50
60
70
80
90
100
blackscholes
bodytrack
canneal
dedup
ferret
freqmine
raytrace
streamcluster
swaptions
vips
x264
blackscholes
bodytrack
canneal
dedup
ferret
freqmine
raytrace
streamcluster
swaptions
vips
x264
Percentage(%)
1VM 2VM
k:lock k:tlb k:other u:*
CDF of TLB Shootdown Latency (ktap)
11
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
100
101
102
103
104
105
106
CumulativePercent
Latency (us)
1VM
2VM
How TLB Shootdown Works in VMs
12
—  TLB (Translation Lookaside Buffer)
—  a per-core hardware cache for page table translation results
—  TLB coherence is managed by the OS
—  TLB shootdown operations: IPI + invlpg
—  LinuxTLB shootdown is busy-waiting based
VMM
PP P P
A A B A
1. vIPIs
3. pIPIs
2. trap 4. inject virtual interrupts
5. vCPU is scheduled
(TLB Shootdown Preemption)
B B A B
6. invalidation & ACK
Shoot4U
13
Shoot4U
14
Observation: modern hardware allows theVMM to invalidate guest
TLB entries (e.g. Intel invpid)
Key idea: invalidate guestTLB entries from theVMM
—  Tell theVMM whatTLB entries and vCPUs to invalidate (hypervall)
—  TheVMM invalidates and returns, no interrupt injection and waiting
2. pIPIs
1.  hypercall
<vcpu set, addr range> 3. invalidation andACKVMM
PP P P
A A B A
B B A B
Implementation
15
KVM/Linux 3.16, ~200 LOC (~50 LOC guest side)
—  https://github.com/ouyangjn/shoot4u	
Guest
—  use hypercall forTLB shootdowns
VMM
—  hypercall handler: vCPU set => pCPU set, and send IPIs
—  IPI handler: invalidate guestTLB entries with invpid
kvm_hypercall3(unsigned	long	KVM_HC_SHOOT4U,	unsigned	long	vcpu_bitmap,		
				unsigned	long	start_addr,	unsigned	long	end_addr);	
Shoot4U API
Evaluation
16
Dual-socket Dell R450 server
—  6-core Intel “Ivy-Bridge” Xeon processors with hyperthreading
—  24 GB RAM split across two NUMA domains.
—  CentOS 7 (Linux 3.16)
Virtual Machines
—  12 vCPUs, 4G RAM on the same socket
—  Fedora 19 (Linux 4.0)
—  VM1: PARSEC Benchmark Suite,VM2 sysbench CPU test
Schemes
—  baseline: unmodified Linux kernel
—  kvmtlb [kvmtlb 12]
—  Shoot4U
—  Pause-Loop Exiting (PLE) [Riel 11]
—  PreemptableTicket Spinlock (PMT) [OuyangVEE ’13]
TLB Shootdown Latency (Cycles)
17
Order of magnitude lower latency
TLB Shootdown Latency (CDF)
18
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
100
101
102
103
104
105
106
CumulativePercent
Latency (us)
shoot4u-1VM
shoot4u-2VM
kvmtlb-1VM
kvmtlb-2VM
baseline-1VM
baseline-2VM
Parsec Performance (2-VMs)
19
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
1.1
blackscholes
bodytrackcannealdedup
ferret
freqmineraytracestreamcluster
swaptionsvips
x264
NormalizedExecutionTime
baseline
ple
pmt
pmt+kvmtlb
pmt+shoot4u
ple+pmt+shoot4u
Revisiting Performance Slowdown
20
0
2
4
6
8
10
12
14
16
18
20
blackscholes
bodytrack
canneal
dedup
ferret
freqmine
raytrace
streamcluster
swaptions
vips
x264
Slowdown
70.6
baseline ple+pmt+shoot4u
Revisiting CPU Usage Profiling
21
0
10
20
30
40
50
60
70
80
90
100
blackscholes
bodytrack
canneal
dedup
ferret
freqmine
raytrace
streamcluster
swaptions
vips
x264
blackscholes
bodytrack
canneal
dedup
ferret
freqmine
raytrace
streamcluster
swaptions
vips
x264
Percentage(%) baseline 2VM ple+pmt+shoot4u 2VM
k:lock k:tlb k:other u:*
Conclusions
22
—  We conducted a set of experiments in order to provide a
breakdown of overheads caused by preempted virtual CPU cores,
showing thatTLB operations can have a significant impact on
performance with certain workloads.
—  We Shoot4U, an optimization forTLB shootdown operations that
internalizesTLB shootdowns in theVMM and so no longer
requires the involvement of a guest’s vCPUs.
—  Our evaluation demonstrates the effectiveness of our approach,
and illustrates how under certain workloads our approach is
dramatically better than state-of-the-art techniques.
23
https://github.com/ouyangjn/shoot4u
Q & A
24
Jiannan Ouyang
Ph.D. Candidate
University of Pittsburgh
ouyang@cs.pitt.edu	
http://www.cs.pitt.edu/~ouyang/	
	
The Prognostic Lab
University of Pittsburgh
http://www.prognosticlab.org	
	
Pisces Co-Kernel
Kitten Lightweight Kernel
PalaciosVMM
References
25
—  [Ouyang 13] Jiannan Ouyang and John R. Lange. PreemptableTicket
Spinlocks: Improving Consolidated Performance in the Cloud. In Proc.
9thACM SIGPLAN/SIGOPS International Conference onVirtual
Execution Environments (VEE), 2013.
—  [Uhlig 04]Volkmar Uhlig, Joshua LeVasseur, Espen Skoglund, and
Uwe Dannowski.Towards scalable multiprocessor virtual
machines. In Proceedings of the 3rd conference onVirtual
Machine ResearchAndTechnology Symposium -Volume 3,
VM’04, 2004.
—  [Friebel 08]Thomas Friebel. How to deal with lock-holder
preemption. Presented at the Xen Summit NorthAmerica, 2008.
—  [Kim ASPLOS’13] H. Kim, S. Kim, J. Jeong, J. Lee, and S. Maeng.
Demand- based Coordinated Scheduling for SMPVMs. In Proc.
Inter- national Conference on Architectural Support for Programming
Languages and Operating Systems (ASPLOS), 2013.
26
—  [VMware 10]VMware(r) vSphere(tm):The cpu scheduler in
vmware esx(r) 4.1.Technical report,VMware, Inc, 2010.
—  [Barroso 13] L.A. Barroso, J. Clidaras, and U. Holzle.The
Datacenter as a Computer:An Introduction to the Design of
Warehouse- Scale Machines. Synthesis Lectures on Computer Architec-
ture, 2013.
—  [Weng HPDC’11] C.Weng, Q. Liu, L.Yu, and M. Li. Dynamic
Adaptive Scheduling forVirtual Machines. In Proc.20th
International Symposium on High Performance Parallel and Distributed
Computing (HPDC), 2011.
—  [Sukwong EuroSys’11] O. Sukwong and H. S. Kim. Is Co-
schedulingToo Expensive for SMPVMs? In Proc.6th European
Conference on Com- puter Systems (EuroSys), 2011.
27
—  [Riel 11] R. v. Riel. Directed yield for pause loop exiting, 2011.
URL http://lwn.net/Articles/424960/.
—  [kvmtlb 12] KVM Paravirt Remote FlushTLB. https://lwn.net/
Articles/500188/.

Contenu connexe

Tendances

USENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPFUSENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPFBrendan Gregg
 
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)Hajime Tazaki
 
Network stack personality in Android phone - netdev 2.2
Network stack personality in Android phone - netdev 2.2Network stack personality in Android phone - netdev 2.2
Network stack personality in Android phone - netdev 2.2Hajime Tazaki
 
NUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioNUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioHajime Tazaki
 
Linux Kernel Library - Reusing Monolithic Kernel
Linux Kernel Library - Reusing Monolithic KernelLinux Kernel Library - Reusing Monolithic Kernel
Linux Kernel Library - Reusing Monolithic KernelHajime Tazaki
 
Trip down the GPU lane with Machine Learning
Trip down the GPU lane with Machine LearningTrip down the GPU lane with Machine Learning
Trip down the GPU lane with Machine LearningRenaldas Zioma
 
Make Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance ToolsMake Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance ToolsKernel TLV
 
YOW2021 Computing Performance
YOW2021 Computing PerformanceYOW2021 Computing Performance
YOW2021 Computing PerformanceBrendan Gregg
 
Accelerating Neutron with Intel DPDK
Accelerating Neutron with Intel DPDKAccelerating Neutron with Intel DPDK
Accelerating Neutron with Intel DPDKAlexander Shalimov
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)Kirill Tsym
 
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPDockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPThomas Graf
 
Library Operating System for Linux #netdev01
Library Operating System for Linux #netdev01Library Operating System for Linux #netdev01
Library Operating System for Linux #netdev01Hajime Tazaki
 
Direct Code Execution @ CoNEXT 2013
Direct Code Execution @ CoNEXT 2013Direct Code Execution @ CoNEXT 2013
Direct Code Execution @ CoNEXT 2013Hajime Tazaki
 
Ovs perf
Ovs perfOvs perf
Ovs perfMadhu c
 
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...Jim St. Leger
 
Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsHisaki Ohara
 
How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.Naoto MATSUMOTO
 

Tendances (20)

USENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPFUSENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPF
 
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)
Linux rumpkernel - ABC2018 (AsiaBSDCon 2018)
 
mTCP使ってみた
mTCP使ってみたmTCP使ってみた
mTCP使ってみた
 
Network stack personality in Android phone - netdev 2.2
Network stack personality in Android phone - netdev 2.2Network stack personality in Android phone - netdev 2.2
Network stack personality in Android phone - netdev 2.2
 
NUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioNUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osio
 
Linux Kernel Library - Reusing Monolithic Kernel
Linux Kernel Library - Reusing Monolithic KernelLinux Kernel Library - Reusing Monolithic Kernel
Linux Kernel Library - Reusing Monolithic Kernel
 
Trip down the GPU lane with Machine Learning
Trip down the GPU lane with Machine LearningTrip down the GPU lane with Machine Learning
Trip down the GPU lane with Machine Learning
 
Make Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance ToolsMake Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance Tools
 
YOW2021 Computing Performance
YOW2021 Computing PerformanceYOW2021 Computing Performance
YOW2021 Computing Performance
 
Accelerating Neutron with Intel DPDK
Accelerating Neutron with Intel DPDKAccelerating Neutron with Intel DPDK
Accelerating Neutron with Intel DPDK
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)
 
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPDockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDP
 
Library Operating System for Linux #netdev01
Library Operating System for Linux #netdev01Library Operating System for Linux #netdev01
Library Operating System for Linux #netdev01
 
Direct Code Execution @ CoNEXT 2013
Direct Code Execution @ CoNEXT 2013Direct Code Execution @ CoNEXT 2013
Direct Code Execution @ CoNEXT 2013
 
Ovs perf
Ovs perfOvs perf
Ovs perf
 
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...
 
Xen in Linux 3.x (or PVOPS)
Xen in Linux 3.x (or PVOPS)Xen in Linux 3.x (or PVOPS)
Xen in Linux 3.x (or PVOPS)
 
Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructions
 
DPDK In Depth
DPDK In DepthDPDK In Depth
DPDK In Depth
 
How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.How to Speak Intel DPDK KNI for Web Services.
How to Speak Intel DPDK KNI for Web Services.
 

En vedette

Preemptable ticket spinlocks: improving consolidated performance in the cloud
Preemptable ticket spinlocks: improving consolidated performance in the cloudPreemptable ticket spinlocks: improving consolidated performance in the cloud
Preemptable ticket spinlocks: improving consolidated performance in the cloudJiannan Ouyang, PhD
 
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONSENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONSStephan Cadene
 
Denser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microserversDenser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microserversJez Halford
 
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running LinuxLinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linuxbrouer
 
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...Eric Van Hensbergen
 
SDN - OpenFlow + OpenVSwitch + Quantum
SDN - OpenFlow + OpenVSwitch + QuantumSDN - OpenFlow + OpenVSwitch + Quantum
SDN - OpenFlow + OpenVSwitch + QuantumThe Linux Foundation
 
Q2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP SchedulingQ2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP SchedulingLinaro
 
Effect of Virtualization on OS Interference
Effect of Virtualization on OS InterferenceEffect of Virtualization on OS Interference
Effect of Virtualization on OS InterferenceEric Van Hensbergen
 
DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2Outlyer
 
reference_guide_Kernel_Crash_Dump_Analysis
reference_guide_Kernel_Crash_Dump_Analysisreference_guide_Kernel_Crash_Dump_Analysis
reference_guide_Kernel_Crash_Dump_AnalysisBuland Singh
 
Linux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emptionLinux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emptionHemanth Venkatesh
 
Memory Barriers in the Linux Kernel
Memory Barriers in the Linux KernelMemory Barriers in the Linux Kernel
Memory Barriers in the Linux KernelDavidlohr Bueso
 
How Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver ClusterHow Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver ClusterAaron Joue
 
Linux cgroups and namespaces
Linux cgroups and namespacesLinux cgroups and namespaces
Linux cgroups and namespacesLocaweb
 
SFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM VirtualizationSFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM VirtualizationLinaro
 
Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...
Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...
Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...David Evans
 
SFO15-BFO2: Reducing the arm linux kernel size without losing your mind
SFO15-BFO2: Reducing the arm linux kernel size without losing your mindSFO15-BFO2: Reducing the arm linux kernel size without losing your mind
SFO15-BFO2: Reducing the arm linux kernel size without losing your mindLinaro
 
Debugging linux kernel tools and techniques
Debugging linux kernel tools and  techniquesDebugging linux kernel tools and  techniques
Debugging linux kernel tools and techniquesSatpal Parmar
 

En vedette (20)

Preemptable ticket spinlocks: improving consolidated performance in the cloud
Preemptable ticket spinlocks: improving consolidated performance in the cloudPreemptable ticket spinlocks: improving consolidated performance in the cloud
Preemptable ticket spinlocks: improving consolidated performance in the cloud
 
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONSENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS
 
Cache profiling on ARM Linux
Cache profiling on ARM LinuxCache profiling on ARM Linux
Cache profiling on ARM Linux
 
Docker by demo
Docker by demoDocker by demo
Docker by demo
 
Denser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microserversDenser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microservers
 
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running LinuxLinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
 
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
 
SDN - OpenFlow + OpenVSwitch + Quantum
SDN - OpenFlow + OpenVSwitch + QuantumSDN - OpenFlow + OpenVSwitch + Quantum
SDN - OpenFlow + OpenVSwitch + Quantum
 
Q2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP SchedulingQ2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP Scheduling
 
Effect of Virtualization on OS Interference
Effect of Virtualization on OS InterferenceEffect of Virtualization on OS Interference
Effect of Virtualization on OS Interference
 
DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2
 
reference_guide_Kernel_Crash_Dump_Analysis
reference_guide_Kernel_Crash_Dump_Analysisreference_guide_Kernel_Crash_Dump_Analysis
reference_guide_Kernel_Crash_Dump_Analysis
 
Linux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emptionLinux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emption
 
Memory Barriers in the Linux Kernel
Memory Barriers in the Linux KernelMemory Barriers in the Linux Kernel
Memory Barriers in the Linux Kernel
 
How Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver ClusterHow Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver Cluster
 
Linux cgroups and namespaces
Linux cgroups and namespacesLinux cgroups and namespaces
Linux cgroups and namespaces
 
SFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM VirtualizationSFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM Virtualization
 
Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...
Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...
Smarter Scheduling (Priorities, Preemptive Priority Scheduling, Lottery and S...
 
SFO15-BFO2: Reducing the arm linux kernel size without losing your mind
SFO15-BFO2: Reducing the arm linux kernel size without losing your mindSFO15-BFO2: Reducing the arm linux kernel size without losing your mind
SFO15-BFO2: Reducing the arm linux kernel size without losing your mind
 
Debugging linux kernel tools and techniques
Debugging linux kernel tools and  techniquesDebugging linux kernel tools and  techniques
Debugging linux kernel tools and techniques
 

Similaire à Shoot4U: Using VMM Assists to Optimize TLB Operations on Preempted vCPUs

Demand-Based Coordinated Scheduling for SMP VMs
Demand-Based Coordinated Scheduling for SMP VMsDemand-Based Coordinated Scheduling for SMP VMs
Demand-Based Coordinated Scheduling for SMP VMsHwanju Kim
 
Comparison of Open Source Virtualization Technology
Comparison of Open Source Virtualization TechnologyComparison of Open Source Virtualization Technology
Comparison of Open Source Virtualization TechnologyBenoit des Ligneris
 
Virtual Asymmetric Multiprocessor for Interactive Performance of Consolidated...
Virtual Asymmetric Multiprocessor for Interactive Performance of Consolidated...Virtual Asymmetric Multiprocessor for Interactive Performance of Consolidated...
Virtual Asymmetric Multiprocessor for Interactive Performance of Consolidated...Sangwook Kim
 
AMD_11th_Intl_SoC_Conf_UCI_Irvine
AMD_11th_Intl_SoC_Conf_UCI_IrvineAMD_11th_Intl_SoC_Conf_UCI_Irvine
AMD_11th_Intl_SoC_Conf_UCI_IrvinePankaj Singh
 
Performance Tuning Oracle Weblogic Server 12c
Performance Tuning Oracle Weblogic Server 12cPerformance Tuning Oracle Weblogic Server 12c
Performance Tuning Oracle Weblogic Server 12cAjith Narayanan
 
OpenNebulaConf 2013 - Monitoring Large-scale Cloud Infrastructures with OpenN...
OpenNebulaConf 2013 - Monitoring Large-scale Cloud Infrastructures with OpenN...OpenNebulaConf 2013 - Monitoring Large-scale Cloud Infrastructures with OpenN...
OpenNebulaConf 2013 - Monitoring Large-scale Cloud Infrastructures with OpenN...OpenNebula Project
 
Monitoring Large-scale Cloud Infrastructures with OpenNebula
Monitoring Large-scale Cloud Infrastructures with OpenNebulaMonitoring Large-scale Cloud Infrastructures with OpenNebula
Monitoring Large-scale Cloud Infrastructures with OpenNebulaNETWAYS
 
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*Michelle Holley
 
OS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLOS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLNordic APIs
 
KVM and docker LXC Benchmarking with OpenStack
KVM and docker LXC Benchmarking with OpenStackKVM and docker LXC Benchmarking with OpenStack
KVM and docker LXC Benchmarking with OpenStackBoden Russell
 
Using Embedded Linux for Infrastructure Systems
Using Embedded Linux for Infrastructure SystemsUsing Embedded Linux for Infrastructure Systems
Using Embedded Linux for Infrastructure SystemsYoshitake Kobayashi
 
KVM Enhancements for OPNFV
KVM Enhancements for OPNFVKVM Enhancements for OPNFV
KVM Enhancements for OPNFVOPNFV
 
Network performance test plan_v0.3
Network performance test plan_v0.3Network performance test plan_v0.3
Network performance test plan_v0.3David Pasek
 
AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...Ryousei Takano
 
CPU Scheduling for Virtual Desktop Infrastructure
CPU Scheduling for Virtual Desktop InfrastructureCPU Scheduling for Virtual Desktop Infrastructure
CPU Scheduling for Virtual Desktop InfrastructureHwanju Kim
 
Container & kubernetes
Container & kubernetesContainer & kubernetes
Container & kubernetesTed Jung
 
20141111_SOS3_Gallo
20141111_SOS3_Gallo20141111_SOS3_Gallo
20141111_SOS3_GalloAndrea Gallo
 
Inside Microsoft's FPGA-Based Configurable Cloud
Inside Microsoft's FPGA-Based Configurable CloudInside Microsoft's FPGA-Based Configurable Cloud
Inside Microsoft's FPGA-Based Configurable Cloudinside-BigData.com
 
Real-Time Load Balancing of an Interactive Mutliplayer Game Server
Real-Time Load Balancing of an Interactive Mutliplayer Game ServerReal-Time Load Balancing of an Interactive Mutliplayer Game Server
Real-Time Load Balancing of an Interactive Mutliplayer Game ServerJames Munro
 

Similaire à Shoot4U: Using VMM Assists to Optimize TLB Operations on Preempted vCPUs (20)

Demand-Based Coordinated Scheduling for SMP VMs
Demand-Based Coordinated Scheduling for SMP VMsDemand-Based Coordinated Scheduling for SMP VMs
Demand-Based Coordinated Scheduling for SMP VMs
 
Comparison of Open Source Virtualization Technology
Comparison of Open Source Virtualization TechnologyComparison of Open Source Virtualization Technology
Comparison of Open Source Virtualization Technology
 
Virtual Asymmetric Multiprocessor for Interactive Performance of Consolidated...
Virtual Asymmetric Multiprocessor for Interactive Performance of Consolidated...Virtual Asymmetric Multiprocessor for Interactive Performance of Consolidated...
Virtual Asymmetric Multiprocessor for Interactive Performance of Consolidated...
 
AMD_11th_Intl_SoC_Conf_UCI_Irvine
AMD_11th_Intl_SoC_Conf_UCI_IrvineAMD_11th_Intl_SoC_Conf_UCI_Irvine
AMD_11th_Intl_SoC_Conf_UCI_Irvine
 
Performance Tuning Oracle Weblogic Server 12c
Performance Tuning Oracle Weblogic Server 12cPerformance Tuning Oracle Weblogic Server 12c
Performance Tuning Oracle Weblogic Server 12c
 
OpenNebulaConf 2013 - Monitoring Large-scale Cloud Infrastructures with OpenN...
OpenNebulaConf 2013 - Monitoring Large-scale Cloud Infrastructures with OpenN...OpenNebulaConf 2013 - Monitoring Large-scale Cloud Infrastructures with OpenN...
OpenNebulaConf 2013 - Monitoring Large-scale Cloud Infrastructures with OpenN...
 
Monitoring Large-scale Cloud Infrastructures with OpenNebula
Monitoring Large-scale Cloud Infrastructures with OpenNebulaMonitoring Large-scale Cloud Infrastructures with OpenNebula
Monitoring Large-scale Cloud Infrastructures with OpenNebula
 
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
 
OS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLOS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of ML
 
KVM and docker LXC Benchmarking with OpenStack
KVM and docker LXC Benchmarking with OpenStackKVM and docker LXC Benchmarking with OpenStack
KVM and docker LXC Benchmarking with OpenStack
 
Using Embedded Linux for Infrastructure Systems
Using Embedded Linux for Infrastructure SystemsUsing Embedded Linux for Infrastructure Systems
Using Embedded Linux for Infrastructure Systems
 
KVM Enhancements for OPNFV
KVM Enhancements for OPNFVKVM Enhancements for OPNFV
KVM Enhancements for OPNFV
 
Network performance test plan_v0.3
Network performance test plan_v0.3Network performance test plan_v0.3
Network performance test plan_v0.3
 
AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...AIST Super Green Cloud: lessons learned from the operation and the performanc...
AIST Super Green Cloud: lessons learned from the operation and the performanc...
 
CPU Scheduling for Virtual Desktop Infrastructure
CPU Scheduling for Virtual Desktop InfrastructureCPU Scheduling for Virtual Desktop Infrastructure
CPU Scheduling for Virtual Desktop Infrastructure
 
Container & kubernetes
Container & kubernetesContainer & kubernetes
Container & kubernetes
 
SRE NL MeetUp - eBPF.pdf
SRE NL MeetUp - eBPF.pdfSRE NL MeetUp - eBPF.pdf
SRE NL MeetUp - eBPF.pdf
 
20141111_SOS3_Gallo
20141111_SOS3_Gallo20141111_SOS3_Gallo
20141111_SOS3_Gallo
 
Inside Microsoft's FPGA-Based Configurable Cloud
Inside Microsoft's FPGA-Based Configurable CloudInside Microsoft's FPGA-Based Configurable Cloud
Inside Microsoft's FPGA-Based Configurable Cloud
 
Real-Time Load Balancing of an Interactive Mutliplayer Game Server
Real-Time Load Balancing of an Interactive Mutliplayer Game ServerReal-Time Load Balancing of an Interactive Mutliplayer Game Server
Real-Time Load Balancing of an Interactive Mutliplayer Game Server
 

Dernier

Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceanilsa9823
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 

Dernier (20)

Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 

Shoot4U: Using VMM Assists to Optimize TLB Operations on Preempted vCPUs

  • 1. VEE’16 04/02/2016 Shoot4U: Using VMM Assists to Optimize TLB Operations on Preempted vCPUs Jiannan Ouyang, John Lange University of Pittsburgh Haoqiang Zheng VMware Inc.
  • 2. CPU Consolidation in the Cloud 2 CPU Consolidation: multiple virtual CPUs (vCPUs) share the same physical CPU (pCPU). Motivation: Improve datacenter utilization. Figure 1. Average activity distribution of a typical shared Google clusters including Online Services, each containing over 20,000 servers, over a period of 3 months [Barroso 13].
  • 3. Problems with Preempted vCPUs 3 PP P P A B A B B A A B A vCPU of VM-A B vCPU ofVM-B Preempted Running pCPUP Performance problems: Busy-waiting based kernel synchronization operations —  Lock Holder Preemption problem —  Lock Waiter Preemption problem —  TLB Shootdown Preemption problem
  • 4. Lock Holder Preemption 4 Lock holder preemption [Uhlig 04, Friebel 08] —  A preempted vCPU is holding a spinlock —  Causes dramatically longer lock waiting time —  context switch latency + CPU shares allocated to other vCPUs SchedulingTechniques —  co-scheduling, relaxed co-scheduling [VMware 10] —  Adaptive co-scheduling [Weng HPDC11] —  Balanced scheduling [Sukwong EuroSys11] —  Demand-based coordinated scheduling [Kim ASPLOS13] HardwareAssistedTechniques —  Intel Pause-Loop Exiting (PLE) [Riel 11]
  • 5. Lock Waiter Preemption [Ouyang VEE13] 5 Linux uses a FIFO order fair spinlock, named ticket spinlock i i+1 i+2 i+3 Lock waiter preemption —  A lock waiter is preempted, and blocks the queue —  P(waiter preemption) > P(holder preemption) 0 T 2TTimeout: 3T PreemptableTicket Spinlock —  Key idea: proportional timeout
  • 6. TLB Shootdown Preemption 6 KVM Paravirt Remote FlushTLB [kvmtlb 12] —  VMM maintains vCPU preemption states and shares with the guest. —  Use conventional approach if the remote vCPU is running. —  DeferTLB flush if the remote vCPU is preempted. —  Cons: preemption state may change after checking. TLB shootdown IPIs as scheduling heuristics [Kim ASPLOS13] Shoot4U —  Goal: eliminate the problem —  Key idea: invalidate guestTLB entries from theVMM
  • 7. Contributions 7 —  An analysis of the impact that various low level synchronization operations have on system benchmark performance. —  Shoot4U:A novel virtualizedTLB architecture that ensures consistently low latencies for synchronizedTLB operations. —  An evaluation of the performance benefits achieved by Shoot4U over current state-of-art software and hardware assisted approaches.
  • 9. Overhead of CPU Consolidation 9 0 2 4 6 8 10 12 14 16 18 20 blackscholes bodytrack canneal dedup ferret freqmine raytrace streamcluster swaptions vips x264 Slowdown 70.6 max ideal slowdown PARSEC Runtime with co-locatedVM over running alone (12-coreVMs, measured on Linux/KVM, with PLE disabled)
  • 10. CPU Usage Profiling (perf) 10 0 10 20 30 40 50 60 70 80 90 100 blackscholes bodytrack canneal dedup ferret freqmine raytrace streamcluster swaptions vips x264 blackscholes bodytrack canneal dedup ferret freqmine raytrace streamcluster swaptions vips x264 Percentage(%) 1VM 2VM k:lock k:tlb k:other u:*
  • 11. CDF of TLB Shootdown Latency (ktap) 11 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 100 101 102 103 104 105 106 CumulativePercent Latency (us) 1VM 2VM
  • 12. How TLB Shootdown Works in VMs 12 —  TLB (Translation Lookaside Buffer) —  a per-core hardware cache for page table translation results —  TLB coherence is managed by the OS —  TLB shootdown operations: IPI + invlpg —  LinuxTLB shootdown is busy-waiting based VMM PP P P A A B A 1. vIPIs 3. pIPIs 2. trap 4. inject virtual interrupts 5. vCPU is scheduled (TLB Shootdown Preemption) B B A B 6. invalidation & ACK
  • 14. Shoot4U 14 Observation: modern hardware allows theVMM to invalidate guest TLB entries (e.g. Intel invpid) Key idea: invalidate guestTLB entries from theVMM —  Tell theVMM whatTLB entries and vCPUs to invalidate (hypervall) —  TheVMM invalidates and returns, no interrupt injection and waiting 2. pIPIs 1.  hypercall <vcpu set, addr range> 3. invalidation andACKVMM PP P P A A B A B B A B
  • 15. Implementation 15 KVM/Linux 3.16, ~200 LOC (~50 LOC guest side) —  https://github.com/ouyangjn/shoot4u Guest —  use hypercall forTLB shootdowns VMM —  hypercall handler: vCPU set => pCPU set, and send IPIs —  IPI handler: invalidate guestTLB entries with invpid kvm_hypercall3(unsigned long KVM_HC_SHOOT4U, unsigned long vcpu_bitmap, unsigned long start_addr, unsigned long end_addr); Shoot4U API
  • 16. Evaluation 16 Dual-socket Dell R450 server —  6-core Intel “Ivy-Bridge” Xeon processors with hyperthreading —  24 GB RAM split across two NUMA domains. —  CentOS 7 (Linux 3.16) Virtual Machines —  12 vCPUs, 4G RAM on the same socket —  Fedora 19 (Linux 4.0) —  VM1: PARSEC Benchmark Suite,VM2 sysbench CPU test Schemes —  baseline: unmodified Linux kernel —  kvmtlb [kvmtlb 12] —  Shoot4U —  Pause-Loop Exiting (PLE) [Riel 11] —  PreemptableTicket Spinlock (PMT) [OuyangVEE ’13]
  • 17. TLB Shootdown Latency (Cycles) 17 Order of magnitude lower latency
  • 18. TLB Shootdown Latency (CDF) 18 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 100 101 102 103 104 105 106 CumulativePercent Latency (us) shoot4u-1VM shoot4u-2VM kvmtlb-1VM kvmtlb-2VM baseline-1VM baseline-2VM
  • 21. Revisiting CPU Usage Profiling 21 0 10 20 30 40 50 60 70 80 90 100 blackscholes bodytrack canneal dedup ferret freqmine raytrace streamcluster swaptions vips x264 blackscholes bodytrack canneal dedup ferret freqmine raytrace streamcluster swaptions vips x264 Percentage(%) baseline 2VM ple+pmt+shoot4u 2VM k:lock k:tlb k:other u:*
  • 22. Conclusions 22 —  We conducted a set of experiments in order to provide a breakdown of overheads caused by preempted virtual CPU cores, showing thatTLB operations can have a significant impact on performance with certain workloads. —  We Shoot4U, an optimization forTLB shootdown operations that internalizesTLB shootdowns in theVMM and so no longer requires the involvement of a guest’s vCPUs. —  Our evaluation demonstrates the effectiveness of our approach, and illustrates how under certain workloads our approach is dramatically better than state-of-the-art techniques.
  • 24. Q & A 24 Jiannan Ouyang Ph.D. Candidate University of Pittsburgh ouyang@cs.pitt.edu http://www.cs.pitt.edu/~ouyang/ The Prognostic Lab University of Pittsburgh http://www.prognosticlab.org Pisces Co-Kernel Kitten Lightweight Kernel PalaciosVMM
  • 25. References 25 —  [Ouyang 13] Jiannan Ouyang and John R. Lange. PreemptableTicket Spinlocks: Improving Consolidated Performance in the Cloud. In Proc. 9thACM SIGPLAN/SIGOPS International Conference onVirtual Execution Environments (VEE), 2013. —  [Uhlig 04]Volkmar Uhlig, Joshua LeVasseur, Espen Skoglund, and Uwe Dannowski.Towards scalable multiprocessor virtual machines. In Proceedings of the 3rd conference onVirtual Machine ResearchAndTechnology Symposium -Volume 3, VM’04, 2004. —  [Friebel 08]Thomas Friebel. How to deal with lock-holder preemption. Presented at the Xen Summit NorthAmerica, 2008. —  [Kim ASPLOS’13] H. Kim, S. Kim, J. Jeong, J. Lee, and S. Maeng. Demand- based Coordinated Scheduling for SMPVMs. In Proc. Inter- national Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2013.
  • 26. 26 —  [VMware 10]VMware(r) vSphere(tm):The cpu scheduler in vmware esx(r) 4.1.Technical report,VMware, Inc, 2010. —  [Barroso 13] L.A. Barroso, J. Clidaras, and U. Holzle.The Datacenter as a Computer:An Introduction to the Design of Warehouse- Scale Machines. Synthesis Lectures on Computer Architec- ture, 2013. —  [Weng HPDC’11] C.Weng, Q. Liu, L.Yu, and M. Li. Dynamic Adaptive Scheduling forVirtual Machines. In Proc.20th International Symposium on High Performance Parallel and Distributed Computing (HPDC), 2011. —  [Sukwong EuroSys’11] O. Sukwong and H. S. Kim. Is Co- schedulingToo Expensive for SMPVMs? In Proc.6th European Conference on Com- puter Systems (EuroSys), 2011.
  • 27. 27 —  [Riel 11] R. v. Riel. Directed yield for pause loop exiting, 2011. URL http://lwn.net/Articles/424960/. —  [kvmtlb 12] KVM Paravirt Remote FlushTLB. https://lwn.net/ Articles/500188/.