SlideShare une entreprise Scribd logo
1  sur  24
Télécharger pour lire hors ligne
gScale : Improve vGPU Scalability
Using Dynamic Resource Sharing
Aug 2016
Pei Zhang Pei.zhang@intel.com
Kevin Tian Kevin.tian@intel.com
Xiao Zheng Xiao.zheng@intel.com
2
Agenda
• vGPU Scalability of GVT-g
• gScale Design for Doubled vGPU Density
• Evaluation
• Summary
vGPU Scalability in GVT-g
4
GVT-g Introduction
• Full GPU virtualization solution with mediated pass-through approach
• Open source implementation for Xen/KVM (aka XenGT/KVMGT)
• Support a rich span of Intel® Processor Graphics
• Available in https://01.org/igvt-g
Near to native
GPU performance
Full GPU capabilities
in VM
Flexible VMs sharing
(Up to 7VMs)+ +
>= 2X higher density!
gScale
5
Processor Graphics: Components
Display
Engine
GPU
Render
Engine
GGTT
System Memory
Global Graphics
Virtual Memory
PPGTTPPGTTPPGTT
Graphics Virtual
Memory
Graphics Virtual
Memory
Per-Process Graphics
Virtual Memory
Registers/MMIO
CPU
Aperture
6
• Parallel access from multiple engines due to split vCPU/vGPU scheduling
Shared Global Graphics Memory in GVT-g
Display
Engine
Render
Engine
Global Graphics
Virtual Memory
CPU
Aperture
By guest gfx drivers
thru
vCPU scheduling
By guest GPU commands
thru
vGPU scheduling
Upon
Foreground/background
Display switch
7
Static Partitioning of Global Graphics Memory
0 4GB1GB
Global
Graphics Memory
CPU Visible Memory CPU Invisible Memory
vGPU1
vGPU2
PCI MMIO Aperture
- Minimal 128MB visible/384MB invisible per vGPU
- Means up to 7vGPUs possible (besides reserved for Dom0)
- Some BIOS may support a smaller aperture which means more limitation
gScale Design for Doubled vGPU Density
9
gScale: per-VM Global Graphics Memory
VM1
View
VM2
View Host View
Ballooned graphic memory
VM1
View
VM2
View Host View
Switch between per-VM GGTT table
Shared GGTT space Per-VM GGTT space
Key challenge is to remove parallel accesses!
10
3 Components in Parallel Access to GGTT
• Render engine access
• Display engine access
• CPU tiled/non-tiled memory access
11
Render Engine Accesses
• At anytime, only one vGPU’s graphics memory is accessible by render engine
Controlled by vGPU scheduler
• Dynamically switch per-VM GGTT table at vGPU context switch
Render Engine
Global Graphics Memory
per-VM Global GTT
VM2 System MemoryVM1 System Memory
Part of vGPU context switch
12
Display Engine Accesses
• Display engine accesses is restricted to the Dom0 reserved region
Alias mapping of guest framebuffer into reserved graphics memory for Dom0
Display Engine
Global Graphics Memory
per-VM Global GTT
VM2 System MemoryVM1 System Memory
Dom0 reserved region
13
Tiled/non-Tiled and Aperture Memory
• What is tiled memory
• Tiled memory – aperture access
Linear surface Tiled surface
14
CPU Accesses to Non-Tiled Memory
VM2 System MemoryVM1 System Memory
Global Graphics Memory
per-VM Global GTT
EPT1 EPT2
vAperture1 vAperture2
Remove
use of aperture
Bypass aperture and
directly access system
memory.
Mem access is coherent
between CPU cores/GPU.
Dom0 reserved region
Aperture
15
CPU Access for Tiled Memory
VM2 System MemoryVM1 System Memory
Global Graphics Memory
per-VM Global GTT
EPT1 EPT2
vAperture1 vAperture2
Tiled Memory
Fence
Registers
Dynamic allocation of
fence register and
aperture for tiled memory!
Dom0 reserved region
Aperture
16
Summary: Global Graphic memory access
Display
Engine
Render
Engine
CPU
Aperture
Access Tiled mem:
remap to DOM-0
Reserved region
Only render owner
can access memory
Remap to only accesses
DOM-0 reserved region
Per-VM Global
memory
System Memory
Access None-Tiled mem:
pass through to
system memory directly
GPU
GGTT
Dom0 reserved region
17
Optimization: Split per-VM GGTT into slots
• per-VM GTT switching is based on slots
• Switching for those only slots that shared
between other VMs
* Notes:
1. vGPU0 (for DOM0) has dedicated mem space
slots which won’t be shared by other vGPUs.
Slot0
(DOM0)
Slot1
(DOMu)
Slot2
(DOMu)
High Mem space
vGPU0*1
vGPU1
vGPU2
vGPU3
vGPU4
DOM0 Reserved
region
Slot0
(DOM0)
Slot1
(DOMu)
Low Mem space
…
vGPUn
…
…
Available low/high space slot
Split sample
Host
GGTT space
Per-VMGGTTSpace
Whole per-VM GTT table size is about 2MB
Switching entire GTT table is time consumed.
18
Linux Windows
1. gScale-Basic is the data without memory slot split.
2. Performance data is sampled with 12 VMs.
Context switch performance
- Private GTT table copy
Source: USENIX ATC (2015), gScale: Scaling up GPU Virtualization with Dynamic Sharing of Graphics Memory Space
Evaluation
20
Windows Guest: Scalability 3D
performance with 12 VMs could achieves
up to 80% of 1 VM.
Linux Guest: Scalability of 3D performance with 15VMs could achieves
up to 80% of 1 VM.
2D performance is better by burn out GPU power.
Linux VM
Windows VM
Scalability Evaluation
Source: USENIX ATC (2015), gScale: Scaling up GPU Virtualization with Dynamic Sharing of Graphics Memory Space
Summary
22
Summary
• gScale breaks graphics memory resource limitation by introducing per-VM GGTT design
• gScale doubles vGPU density in Intel®@ GVT-g with good scalability
• Converged performance of 15 vGPUs reaches 96% of native performance
23
Resource Links
• Project webpage and release: https://01.org/igvt-g
• Project public papers and document: https://01.org/group/2230/documentation-list
• Intel® IDF: GVT-g in Media Cloud: https://01.org/sites/default/files/documentation/sz15_sfts002_100_engf.pdf
• XenGT introduction in summit in 2015: http://events.linuxfoundation.org/sites/events/files/slides/XenGT-
Xen%20Summit-REWRITE%203RD%20v4.pdf
• XenGT introduction in summit in 2014: http://events.linuxfoundation.org/sites/events/files/slides/XenGT-
LinuxCollaborationSummit-final_1.pdf
24
Notices and Disclaimers
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE,
EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS
GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR
SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR
IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR
WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT
OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT
INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.
Intel may make changes to specifications and product descriptions at any time, without notice.
All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
Intel, processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product
to deviate from published specifications. Current characterized errata are available on request.
No computer system can provide absolute security under all conditions. Intel® Trusted Execution Technology (Intel® TXT) requires a
computer with Intel® Virtualization Technology, an Intel TXT-enabled processor, chipset, BIOS, Authenticated Code Modules and an
Intel TXT-compatible measured launched environment (MLE). Intel TXT also requires the system to contain a TPM v1.s. For more
information, visit http://www.intel.com/technology/security
Intel, Intel logo, Xeon, and Xeon Inside are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United
States and other countries.
*Other names and brands may be claimed as the property of others.
Copyright © 2016 Intel Corporation. All rights reserved.

Contenu connexe

Tendances

Kvm performance optimization for ubuntu
Kvm performance optimization for ubuntuKvm performance optimization for ubuntu
Kvm performance optimization for ubuntu
Sim Janghoon
 
Virtualization - Kernel Virtual Machine (KVM)
Virtualization - Kernel Virtual Machine (KVM)Virtualization - Kernel Virtual Machine (KVM)
Virtualization - Kernel Virtual Machine (KVM)
Wan Leung Wong
 

Tendances (20)

XPDS14 - Xen on ARM: Status and Performance - Stefano Stabellini, Citrix
XPDS14 - Xen on ARM: Status and Performance - Stefano Stabellini, CitrixXPDS14 - Xen on ARM: Status and Performance - Stefano Stabellini, Citrix
XPDS14 - Xen on ARM: Status and Performance - Stefano Stabellini, Citrix
 
XPDDS18: Unleashing the Power of Unikernels with Unikraft - Florian Schmidt, ...
XPDDS18: Unleashing the Power of Unikernels with Unikraft - Florian Schmidt, ...XPDDS18: Unleashing the Power of Unikernels with Unikraft - Florian Schmidt, ...
XPDDS18: Unleashing the Power of Unikernels with Unikraft - Florian Schmidt, ...
 
XPDDS18: The Evolution of Virtualization in the Arm Architecture - Julien Gra...
XPDDS18: The Evolution of Virtualization in the Arm Architecture - Julien Gra...XPDDS18: The Evolution of Virtualization in the Arm Architecture - Julien Gra...
XPDDS18: The Evolution of Virtualization in the Arm Architecture - Julien Gra...
 
XPDDS18: Xenwatch Multithreading - Dongli Zhang, Oracle
XPDDS18: Xenwatch Multithreading - Dongli Zhang, OracleXPDDS18: Xenwatch Multithreading - Dongli Zhang, Oracle
XPDDS18: Xenwatch Multithreading - Dongli Zhang, Oracle
 
Kernel Recipes 2017 - Build farm again - Willy Tarreau
Kernel Recipes 2017 - Build farm again - Willy TarreauKernel Recipes 2017 - Build farm again - Willy Tarreau
Kernel Recipes 2017 - Build farm again - Willy Tarreau
 
XPDDS18: Design and Implementation of Automotive: Virtualization Based on Xen...
XPDDS18: Design and Implementation of Automotive: Virtualization Based on Xen...XPDDS18: Design and Implementation of Automotive: Virtualization Based on Xen...
XPDDS18: Design and Implementation of Automotive: Virtualization Based on Xen...
 
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM Systems
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM SystemsXPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM Systems
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM Systems
 
XPDS14: Removing the Xen Linux Upstream Delta of Various Linux Distros - Luis...
XPDS14: Removing the Xen Linux Upstream Delta of Various Linux Distros - Luis...XPDS14: Removing the Xen Linux Upstream Delta of Various Linux Distros - Luis...
XPDS14: Removing the Xen Linux Upstream Delta of Various Linux Distros - Luis...
 
Kvm performance optimization for ubuntu
Kvm performance optimization for ubuntuKvm performance optimization for ubuntu
Kvm performance optimization for ubuntu
 
XPDS16: Hypervisor-based Security: Vicarious Learning via Introspektioneerin...
XPDS16:  Hypervisor-based Security: Vicarious Learning via Introspektioneerin...XPDS16:  Hypervisor-based Security: Vicarious Learning via Introspektioneerin...
XPDS16: Hypervisor-based Security: Vicarious Learning via Introspektioneerin...
 
QEMU Disk IO Which performs Better: Native or threads?
QEMU Disk IO Which performs Better: Native or threads?QEMU Disk IO Which performs Better: Native or threads?
QEMU Disk IO Which performs Better: Native or threads?
 
Project ACRN GVT-d introduction and tutorial
Project ACRN GVT-d introduction and tutorialProject ACRN GVT-d introduction and tutorial
Project ACRN GVT-d introduction and tutorial
 
Virtualization - Kernel Virtual Machine (KVM)
Virtualization - Kernel Virtual Machine (KVM)Virtualization - Kernel Virtual Machine (KVM)
Virtualization - Kernel Virtual Machine (KVM)
 
Project ACRN hypervisor introduction
Project ACRN hypervisor introduction Project ACRN hypervisor introduction
Project ACRN hypervisor introduction
 
Project ACRN Yocto Project meta-acrn layer introduction
Project ACRN Yocto Project meta-acrn layer introductionProject ACRN Yocto Project meta-acrn layer introduction
Project ACRN Yocto Project meta-acrn layer introduction
 
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
 
Improving Xen idle power efficiency
Improving Xen idle power efficiencyImproving Xen idle power efficiency
Improving Xen idle power efficiency
 
ACRN vMeet-Up EU 2021 - Boot Process and Secure Boot
ACRN vMeet-Up EU 2021 - Boot Process and Secure BootACRN vMeet-Up EU 2021 - Boot Process and Secure Boot
ACRN vMeet-Up EU 2021 - Boot Process and Secure Boot
 
Virtunoid: Breaking out of KVM
Virtunoid: Breaking out of KVMVirtunoid: Breaking out of KVM
Virtunoid: Breaking out of KVM
 
ACRN vMeet-Up EU 2021 - Bridging Orchestrator and Hard Realtime Workload Cons...
ACRN vMeet-Up EU 2021 - Bridging Orchestrator and Hard Realtime Workload Cons...ACRN vMeet-Up EU 2021 - Bridging Orchestrator and Hard Realtime Workload Cons...
ACRN vMeet-Up EU 2021 - Bridging Orchestrator and Hard Realtime Workload Cons...
 

En vedette

OSCON16: Analysis of the Xen code review process: An example of software deve...
OSCON16: Analysis of the Xen code review process: An example of software deve...OSCON16: Analysis of the Xen code review process: An example of software deve...
OSCON16: Analysis of the Xen code review process: An example of software deve...
The Linux Foundation
 

En vedette (20)

XPDS16: Making Migration More Secure - John Shackleton, Adventium Labs
XPDS16: Making Migration More Secure - John Shackleton, Adventium LabsXPDS16: Making Migration More Secure - John Shackleton, Adventium Labs
XPDS16: Making Migration More Secure - John Shackleton, Adventium Labs
 
XPDS16: Xen Orchestra: building a Cloud on top of Xen - Olivier Lambert & Jul...
XPDS16: Xen Orchestra: building a Cloud on top of Xen - Olivier Lambert & Jul...XPDS16: Xen Orchestra: building a Cloud on top of Xen - Olivier Lambert & Jul...
XPDS16: Xen Orchestra: building a Cloud on top of Xen - Olivier Lambert & Jul...
 
XPDS16: libvirt and Tools: What's New and What's Next - James Fehlig, SUSE
XPDS16: libvirt and Tools: What's New and What's Next - James Fehlig, SUSEXPDS16: libvirt and Tools: What's New and What's Next - James Fehlig, SUSE
XPDS16: libvirt and Tools: What's New and What's Next - James Fehlig, SUSE
 
Fosdem17 - Mixed License FOSS Projects
Fosdem17 - Mixed License FOSS ProjectsFosdem17 - Mixed License FOSS Projects
Fosdem17 - Mixed License FOSS Projects
 
Fosdem 17 - Towards a HVM-like Dom0 for Xen
Fosdem 17 - Towards a HVM-like Dom0 for XenFosdem 17 - Towards a HVM-like Dom0 for Xen
Fosdem 17 - Towards a HVM-like Dom0 for Xen
 
XPDS16: Patch review for non-maintainers - George Dunlap, Citrix Systems R&D...
 XPDS16: Patch review for non-maintainers - George Dunlap, Citrix Systems R&D... XPDS16: Patch review for non-maintainers - George Dunlap, Citrix Systems R&D...
XPDS16: Patch review for non-maintainers - George Dunlap, Citrix Systems R&D...
 
XPDS16: Scope and Performance of Credit-2 Scheduler. - Anshul Makkar, Ctirix...
XPDS16:  Scope and Performance of Credit-2 Scheduler. - Anshul Makkar, Ctirix...XPDS16:  Scope and Performance of Credit-2 Scheduler. - Anshul Makkar, Ctirix...
XPDS16: Scope and Performance of Credit-2 Scheduler. - Anshul Makkar, Ctirix...
 
XPDS16: CPUID handling for guests - Andrew Cooper, Citrix
XPDS16:  CPUID handling for guests - Andrew Cooper, CitrixXPDS16:  CPUID handling for guests - Andrew Cooper, Citrix
XPDS16: CPUID handling for guests - Andrew Cooper, Citrix
 
XPDS16: Keeping coherency on ARM - Julien Grall, ARM
XPDS16: Keeping coherency on ARM - Julien Grall, ARMXPDS16: Keeping coherency on ARM - Julien Grall, ARM
XPDS16: Keeping coherency on ARM - Julien Grall, ARM
 
XPDS16: High-Performance Virtualization for HPC Cloud on Xen - Jun Nakajima &...
XPDS16: High-Performance Virtualization for HPC Cloud on Xen - Jun Nakajima &...XPDS16: High-Performance Virtualization for HPC Cloud on Xen - Jun Nakajima &...
XPDS16: High-Performance Virtualization for HPC Cloud on Xen - Jun Nakajima &...
 
XPDS16: Porting Xen on ARM to a new SOC - Julien Grall, ARM
XPDS16: Porting Xen on ARM to a new SOC - Julien Grall, ARMXPDS16: Porting Xen on ARM to a new SOC - Julien Grall, ARM
XPDS16: Porting Xen on ARM to a new SOC - Julien Grall, ARM
 
XPDS16: Xen Development Update
XPDS16: Xen Development UpdateXPDS16: Xen Development Update
XPDS16: Xen Development Update
 
OSCON16: Analysis of the Xen code review process: An example of software deve...
OSCON16: Analysis of the Xen code review process: An example of software deve...OSCON16: Analysis of the Xen code review process: An example of software deve...
OSCON16: Analysis of the Xen code review process: An example of software deve...
 
The French Revolution of 1789
The French Revolution of 1789The French Revolution of 1789
The French Revolution of 1789
 
BSDCan 2015: How to Port BSD as a Xen on ARM Guest
BSDCan 2015: How to Port BSD as a Xen on ARM GuestBSDCan 2015: How to Port BSD as a Xen on ARM Guest
BSDCan 2015: How to Port BSD as a Xen on ARM Guest
 
XPDS16: XSM-Flask, current limitations and Ongoing work. - Anshul Makkar, Ct...
XPDS16:  XSM-Flask, current limitations and Ongoing work. - Anshul Makkar, Ct...XPDS16:  XSM-Flask, current limitations and Ongoing work. - Anshul Makkar, Ct...
XPDS16: XSM-Flask, current limitations and Ongoing work. - Anshul Makkar, Ct...
 
Scale14x: Are today's foss security practices robust enough in the cloud era ...
Scale14x: Are today's foss security practices robust enough in the cloud era ...Scale14x: Are today's foss security practices robust enough in the cloud era ...
Scale14x: Are today's foss security practices robust enough in the cloud era ...
 
XPDS16: Xenbedded: Xen-based client virtualization for phones and tablets - ...
XPDS16:  Xenbedded: Xen-based client virtualization for phones and tablets - ...XPDS16:  Xenbedded: Xen-based client virtualization for phones and tablets - ...
XPDS16: Xenbedded: Xen-based client virtualization for phones and tablets - ...
 
Xen summit amd_2010v3
Xen summit amd_2010v3Xen summit amd_2010v3
Xen summit amd_2010v3
 
XPDS13: On Paravirualizing TCP - Congestion Control on Xen VMs - Luwei Cheng,...
XPDS13: On Paravirualizing TCP - Congestion Control on Xen VMs - Luwei Cheng,...XPDS13: On Paravirualizing TCP - Congestion Control on Xen VMs - Luwei Cheng,...
XPDS13: On Paravirualizing TCP - Congestion Control on Xen VMs - Luwei Cheng,...
 

Similaire à XPDS16: Live scalability for vGPU using gScale - Xiao Zheng, Intel

Windows 7 and Windows Server 2008 R2 SP1 Overview
Windows 7 and Windows Server 2008 R2 SP1 OverviewWindows 7 and Windows Server 2008 R2 SP1 Overview
Windows 7 and Windows Server 2008 R2 SP1 Overview
Amit Gatenyo
 
Accelerating & Optimizing Machine Learning on VMware vSphere leveraging NVIDI...
Accelerating & Optimizing Machine Learning on VMware vSphere leveraging NVIDI...Accelerating & Optimizing Machine Learning on VMware vSphere leveraging NVIDI...
Accelerating & Optimizing Machine Learning on VMware vSphere leveraging NVIDI...
inside-BigData.com
 
“Parallelizing Machine Learning Applications in the Cloud with Kubernetes: A ...
“Parallelizing Machine Learning Applications in the Cloud with Kubernetes: A ...“Parallelizing Machine Learning Applications in the Cloud with Kubernetes: A ...
“Parallelizing Machine Learning Applications in the Cloud with Kubernetes: A ...
Edge AI and Vision Alliance
 

Similaire à XPDS16: Live scalability for vGPU using gScale - Xiao Zheng, Intel (20)

XPDS13: XenGT - A software based Intel Graphics Virtualization Solution - Hai...
XPDS13: XenGT - A software based Intel Graphics Virtualization Solution - Hai...XPDS13: XenGT - A software based Intel Graphics Virtualization Solution - Hai...
XPDS13: XenGT - A software based Intel Graphics Virtualization Solution - Hai...
 
Windows 7 and Windows Server 2008 R2 SP1 Overview
Windows 7 and Windows Server 2008 R2 SP1 OverviewWindows 7 and Windows Server 2008 R2 SP1 Overview
Windows 7 and Windows Server 2008 R2 SP1 Overview
 
Optimize Content Delivery with Multi-Access Edge Computing
Optimize Content Delivery with Multi-Access Edge ComputingOptimize Content Delivery with Multi-Access Edge Computing
Optimize Content Delivery with Multi-Access Edge Computing
 
VMworld 2013: A Technical Deep Dive on VMware Horizon View 5.2 Performance an...
VMworld 2013: A Technical Deep Dive on VMware Horizon View 5.2 Performance an...VMworld 2013: A Technical Deep Dive on VMware Horizon View 5.2 Performance an...
VMworld 2013: A Technical Deep Dive on VMware Horizon View 5.2 Performance an...
 
Innovative Solutions for Cloud Gaming, Media, Transcoding, & AI Inferencing
Innovative Solutions for Cloud Gaming, Media, Transcoding, & AI InferencingInnovative Solutions for Cloud Gaming, Media, Transcoding, & AI Inferencing
Innovative Solutions for Cloud Gaming, Media, Transcoding, & AI Inferencing
 
Accelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to CloudAccelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to Cloud
 
Optimizing Direct X On Multi Core Architectures
Optimizing Direct X On Multi Core ArchitecturesOptimizing Direct X On Multi Core Architectures
Optimizing Direct X On Multi Core Architectures
 
Accelerating & Optimizing Machine Learning on VMware vSphere leveraging NVIDI...
Accelerating & Optimizing Machine Learning on VMware vSphere leveraging NVIDI...Accelerating & Optimizing Machine Learning on VMware vSphere leveraging NVIDI...
Accelerating & Optimizing Machine Learning on VMware vSphere leveraging NVIDI...
 
Dell NVIDIA AI Powered Transformation Webinar
Dell NVIDIA AI Powered Transformation WebinarDell NVIDIA AI Powered Transformation Webinar
Dell NVIDIA AI Powered Transformation Webinar
 
The Power of One: Supermicro’s High-Performance Single-Processor Blade Systems
The Power of One: Supermicro’s High-Performance Single-Processor Blade SystemsThe Power of One: Supermicro’s High-Performance Single-Processor Blade Systems
The Power of One: Supermicro’s High-Performance Single-Processor Blade Systems
 
VMworld 2015: Deliver High Performance Desktops with VMware Horizon and NVIDI...
VMworld 2015: Deliver High Performance Desktops with VMware Horizon and NVIDI...VMworld 2015: Deliver High Performance Desktops with VMware Horizon and NVIDI...
VMworld 2015: Deliver High Performance Desktops with VMware Horizon and NVIDI...
 
Cuda meetup presentation 5
Cuda meetup presentation 5Cuda meetup presentation 5
Cuda meetup presentation 5
 
Part 3 Maximizing the utilization of GPU resources on-premise and in the cloud
Part 3 Maximizing the utilization of GPU resources on-premise and in the cloudPart 3 Maximizing the utilization of GPU resources on-premise and in the cloud
Part 3 Maximizing the utilization of GPU resources on-premise and in the cloud
 
Building the World's Largest GPU
Building the World's Largest GPUBuilding the World's Largest GPU
Building the World's Largest GPU
 
Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)
 
X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors
X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors
X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors
 
Flowframes
FlowframesFlowframes
Flowframes
 
Vivante GC Nano User Interface (UI) White Paper
Vivante GC Nano User Interface (UI) White PaperVivante GC Nano User Interface (UI) White Paper
Vivante GC Nano User Interface (UI) White Paper
 
“Parallelizing Machine Learning Applications in the Cloud with Kubernetes: A ...
“Parallelizing Machine Learning Applications in the Cloud with Kubernetes: A ...“Parallelizing Machine Learning Applications in the Cloud with Kubernetes: A ...
“Parallelizing Machine Learning Applications in the Cloud with Kubernetes: A ...
 
XPDDS17: Keynote: Shared Coprocessor Framework on ARM - Oleksandr Andrushchen...
XPDDS17: Keynote: Shared Coprocessor Framework on ARM - Oleksandr Andrushchen...XPDDS17: Keynote: Shared Coprocessor Framework on ARM - Oleksandr Andrushchen...
XPDDS17: Keynote: Shared Coprocessor Framework on ARM - Oleksandr Andrushchen...
 

Plus de The Linux Foundation

Plus de The Linux Foundation (20)

ELC2019: Static Partitioning Made Simple
ELC2019: Static Partitioning Made SimpleELC2019: Static Partitioning Made Simple
ELC2019: Static Partitioning Made Simple
 
XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...
XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...
XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...
 
XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...
XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...
XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...
 
XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...
XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...
XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...
 
XPDDS19 Keynote: Unikraft Weather Report
XPDDS19 Keynote:  Unikraft Weather ReportXPDDS19 Keynote:  Unikraft Weather Report
XPDDS19 Keynote: Unikraft Weather Report
 
XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...
XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...
XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...
 
XPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, Xilinx
XPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, XilinxXPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, Xilinx
XPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, Xilinx
 
XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...
XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...
XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...
 
XPDDS19: Memories of a VM Funk - Mihai Donțu, Bitdefender
XPDDS19: Memories of a VM Funk - Mihai Donțu, BitdefenderXPDDS19: Memories of a VM Funk - Mihai Donțu, Bitdefender
XPDDS19: Memories of a VM Funk - Mihai Donțu, Bitdefender
 
OSSJP/ALS19: The Road to Safety Certification: Overcoming Community Challeng...
OSSJP/ALS19:  The Road to Safety Certification: Overcoming Community Challeng...OSSJP/ALS19:  The Road to Safety Certification: Overcoming Community Challeng...
OSSJP/ALS19: The Road to Safety Certification: Overcoming Community Challeng...
 
OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making...
 OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making... OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making...
OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making...
 
XPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, Citrix
XPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, CitrixXPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, Citrix
XPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, Citrix
 
XPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltd
XPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltdXPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltd
XPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltd
 
XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...
XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...
XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...
 
XPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&D
XPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&DXPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&D
XPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&D
 
XPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM Systems
XPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM SystemsXPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM Systems
XPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM Systems
 
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
 
XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...
XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...
XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...
 
XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...
XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...
XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...
 
XPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSE
XPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSEXPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSE
XPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSE
 

Dernier

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 

Dernier (20)

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 

XPDS16: Live scalability for vGPU using gScale - Xiao Zheng, Intel

  • 1. gScale : Improve vGPU Scalability Using Dynamic Resource Sharing Aug 2016 Pei Zhang Pei.zhang@intel.com Kevin Tian Kevin.tian@intel.com Xiao Zheng Xiao.zheng@intel.com
  • 2. 2 Agenda • vGPU Scalability of GVT-g • gScale Design for Doubled vGPU Density • Evaluation • Summary
  • 4. 4 GVT-g Introduction • Full GPU virtualization solution with mediated pass-through approach • Open source implementation for Xen/KVM (aka XenGT/KVMGT) • Support a rich span of Intel® Processor Graphics • Available in https://01.org/igvt-g Near to native GPU performance Full GPU capabilities in VM Flexible VMs sharing (Up to 7VMs)+ + >= 2X higher density! gScale
  • 5. 5 Processor Graphics: Components Display Engine GPU Render Engine GGTT System Memory Global Graphics Virtual Memory PPGTTPPGTTPPGTT Graphics Virtual Memory Graphics Virtual Memory Per-Process Graphics Virtual Memory Registers/MMIO CPU Aperture
  • 6. 6 • Parallel access from multiple engines due to split vCPU/vGPU scheduling Shared Global Graphics Memory in GVT-g Display Engine Render Engine Global Graphics Virtual Memory CPU Aperture By guest gfx drivers thru vCPU scheduling By guest GPU commands thru vGPU scheduling Upon Foreground/background Display switch
  • 7. 7 Static Partitioning of Global Graphics Memory 0 4GB1GB Global Graphics Memory CPU Visible Memory CPU Invisible Memory vGPU1 vGPU2 PCI MMIO Aperture - Minimal 128MB visible/384MB invisible per vGPU - Means up to 7vGPUs possible (besides reserved for Dom0) - Some BIOS may support a smaller aperture which means more limitation
  • 8. gScale Design for Doubled vGPU Density
  • 9. 9 gScale: per-VM Global Graphics Memory VM1 View VM2 View Host View Ballooned graphic memory VM1 View VM2 View Host View Switch between per-VM GGTT table Shared GGTT space Per-VM GGTT space Key challenge is to remove parallel accesses!
  • 10. 10 3 Components in Parallel Access to GGTT • Render engine access • Display engine access • CPU tiled/non-tiled memory access
  • 11. 11 Render Engine Accesses • At anytime, only one vGPU’s graphics memory is accessible by render engine Controlled by vGPU scheduler • Dynamically switch per-VM GGTT table at vGPU context switch Render Engine Global Graphics Memory per-VM Global GTT VM2 System MemoryVM1 System Memory Part of vGPU context switch
  • 12. 12 Display Engine Accesses • Display engine accesses is restricted to the Dom0 reserved region Alias mapping of guest framebuffer into reserved graphics memory for Dom0 Display Engine Global Graphics Memory per-VM Global GTT VM2 System MemoryVM1 System Memory Dom0 reserved region
  • 13. 13 Tiled/non-Tiled and Aperture Memory • What is tiled memory • Tiled memory – aperture access Linear surface Tiled surface
  • 14. 14 CPU Accesses to Non-Tiled Memory VM2 System MemoryVM1 System Memory Global Graphics Memory per-VM Global GTT EPT1 EPT2 vAperture1 vAperture2 Remove use of aperture Bypass aperture and directly access system memory. Mem access is coherent between CPU cores/GPU. Dom0 reserved region Aperture
  • 15. 15 CPU Access for Tiled Memory VM2 System MemoryVM1 System Memory Global Graphics Memory per-VM Global GTT EPT1 EPT2 vAperture1 vAperture2 Tiled Memory Fence Registers Dynamic allocation of fence register and aperture for tiled memory! Dom0 reserved region Aperture
  • 16. 16 Summary: Global Graphic memory access Display Engine Render Engine CPU Aperture Access Tiled mem: remap to DOM-0 Reserved region Only render owner can access memory Remap to only accesses DOM-0 reserved region Per-VM Global memory System Memory Access None-Tiled mem: pass through to system memory directly GPU GGTT Dom0 reserved region
  • 17. 17 Optimization: Split per-VM GGTT into slots • per-VM GTT switching is based on slots • Switching for those only slots that shared between other VMs * Notes: 1. vGPU0 (for DOM0) has dedicated mem space slots which won’t be shared by other vGPUs. Slot0 (DOM0) Slot1 (DOMu) Slot2 (DOMu) High Mem space vGPU0*1 vGPU1 vGPU2 vGPU3 vGPU4 DOM0 Reserved region Slot0 (DOM0) Slot1 (DOMu) Low Mem space … vGPUn … … Available low/high space slot Split sample Host GGTT space Per-VMGGTTSpace Whole per-VM GTT table size is about 2MB Switching entire GTT table is time consumed.
  • 18. 18 Linux Windows 1. gScale-Basic is the data without memory slot split. 2. Performance data is sampled with 12 VMs. Context switch performance - Private GTT table copy Source: USENIX ATC (2015), gScale: Scaling up GPU Virtualization with Dynamic Sharing of Graphics Memory Space
  • 20. 20 Windows Guest: Scalability 3D performance with 12 VMs could achieves up to 80% of 1 VM. Linux Guest: Scalability of 3D performance with 15VMs could achieves up to 80% of 1 VM. 2D performance is better by burn out GPU power. Linux VM Windows VM Scalability Evaluation Source: USENIX ATC (2015), gScale: Scaling up GPU Virtualization with Dynamic Sharing of Graphics Memory Space
  • 22. 22 Summary • gScale breaks graphics memory resource limitation by introducing per-VM GGTT design • gScale doubles vGPU density in Intel®@ GVT-g with good scalability • Converged performance of 15 vGPUs reaches 96% of native performance
  • 23. 23 Resource Links • Project webpage and release: https://01.org/igvt-g • Project public papers and document: https://01.org/group/2230/documentation-list • Intel® IDF: GVT-g in Media Cloud: https://01.org/sites/default/files/documentation/sz15_sfts002_100_engf.pdf • XenGT introduction in summit in 2015: http://events.linuxfoundation.org/sites/events/files/slides/XenGT- Xen%20Summit-REWRITE%203RD%20v4.pdf • XenGT introduction in summit in 2014: http://events.linuxfoundation.org/sites/events/files/slides/XenGT- LinuxCollaborationSummit-final_1.pdf
  • 24. 24 Notices and Disclaimers INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS. Intel may make changes to specifications and product descriptions at any time, without notice. All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice. Intel, processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request. No computer system can provide absolute security under all conditions. Intel® Trusted Execution Technology (Intel® TXT) requires a computer with Intel® Virtualization Technology, an Intel TXT-enabled processor, chipset, BIOS, Authenticated Code Modules and an Intel TXT-compatible measured launched environment (MLE). Intel TXT also requires the system to contain a TPM v1.s. For more information, visit http://www.intel.com/technology/security Intel, Intel logo, Xeon, and Xeon Inside are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. *Other names and brands may be claimed as the property of others. Copyright © 2016 Intel Corporation. All rights reserved.