SlideShare une entreprise Scribd logo
1  sur  32
Télécharger pour lire hors ligne
HKG18-TR08: Upstreaming SVE in
QEMU
Alex Bennée and Richard Henderson
Contents
● Introductions
○ Who we are
○ What QEMU is
● The QEMU Project
○ Development Process
○ Upstreaming Criteria
● SVE Work
○ Native Vectors for TCG
○ ARMv8.2 FP16
○ Remaining Prerequisites
○ SVE instructions
● Demo!
Who are we?
Alex Bennée
Linaro Virtualization Engineer
alex.bennee@linaro.org
IRC: ajb-linaro/stsquad
Richard Henderson
Linaro Virtualization Engineer
richard.henderson@linaro.org
IRC: rth
What is QEMU?
“QEMU is a generic and open source machine emulator and virtualizer”
● Device Emulation for HW virtualization
○ KVM
○ HAX
○ HVF
○ Xen
● Full System Emulation
○ 20 21 guest architectures
○ TCG JIT engine
○ 7 host architectures
QEMU’s Growth
Steady Development
State of QEMU, Paolo Bonzini - KVM Forum 2017
QEMU Development Basics
● Source Code
○ GIT: https://git.qemu.org
○ Master repository with submodules
○ PULL Requests merged by Head Maintainer
■ Sub-maintainers send pull-requests via list
● Discussion
○ Main List: qemu-devel@nongnu.org
○ Sub-lists:
■ ARM specific: qemu-arm@nongnu.org
■ Security: qemu-security@nongnu.org
■ Trivial Patches: qemu-trivial@nongnu.org
○ IRC: #qemu on OFTC network
● Documentation
○ Source Tree: ./docs
○ Wiki: https://wiki.qemu.org/Main_Page
Maintainer/Contributor Retention since 2.0
State of QEMU, Paolo Bonzini - KVM Forum 2017
Upstreaming Criteria
● No Regressions
○ compiles!
○ make check
○ tested
● Acceptable Code
○ Follows CODING_STYLE
○ Signed-off-by: J Hacker <j.hacker@example.com>
○ Reviewed
○ Bisectable
ARM’s Scalable Vector Extensions
● New SIMD extension for Mobile to HPC
● IMPDEF Vector Size (128-2048* bit)
● Multiple Floating Point Precisions
○ N x 64 bit (double precision)
○ 2N x 32 bit (single precision)
○ 4N x 16 bit (half precision)
● For more details see:
○ ARM’s own blog post
○ The Challenge of SVE in QEMU (SFO17)
○ Vectors Meet Virtualization (FOSDEM18)
Tiny Code Generator (TCG)
Vectors in Tiny Code Generator (TCG)
● Vector Sizes
○ Larger registers (128+ bits)
○ Potentially Smaller Lanes (16 bit, 8 bit)
● TCG Op Sizes
○ TCGv_i32
○ TCGv_i64
128 bit Vector Register Layout
Marshalling Costs
● For generated code
○ More TCGops per guest instruction
○ Backend restricted to scalar instructions
○ Optimiser has to be simple and fast
● Helper Functions
○ As above but
○ Creating C ABI frame and procedure call
○ Helpers don’t vectorise
● See also:
Vectoring in on QEMUs TCG Engine, KVM Forum 2017
RFC Patches
● Request For Comment
○ “I’ve had this idea.. what do you think?”
○ Doesn’t have to be complete
● TCG_vec RFC
○ AJB : August 17th 2017
○ 9 files changed, 222 insertions(+), 10 deletions(-)
○ RTH: August 17th 2017
○ 11 files changed, 1817 insertions(+), 94 deletions(-)
Ready for Upstream
● Version 11
○ On List: Jan 26th 2018
○ 25 files changed, 6973 insertions(+), 496 deletions(-)
○ Existing AArch64 guest translator converted to API
○ Existing x86 and AArch64 host generators extended to API
● Pull Request via TCG
○ V1 On List: Feb 7th 2018
○ Failed pre-merge tests
○ V2 On List: Feb 8th 2018
● Merged in master
○ since: Feb 8th 2018
○ in upcoming v2.12
Half Precision Floating Point
● ARMv8.2 FP16
○ Optional feature for v8.2
○ Mandatory for SVE
● 16 bit width
○ 5 bit exponent
○ 10 bit fraction
○ 1 sign bit
● Trade Bandwidth for Accuracy
○ Twice as many calculations than single-precision
○ Often enough accuracy
■ Some graphics calculations
■ Machine learning
● Requires SoftFloat
Softfloat
● Why?
○ Differing number formats (e.g. ARM alternative format)
○ NaN propagation
○ Default NaN
○ Exception Mappings
● Berkeley SoftFloat
○ John R. Hauser, BSD Licensed
○ Tarball release
○ Current release 3c
● QEMU uses
○ Version 2a
○ Numerous tweaks
Softfloat Options
● Initial Discussions
○ Our SoftFloat heavily patched
○ On List: May 2017
○ Patch Existing, Use 3c (replace/incremental), Use glibc?
● RFC: Import SoftFloat3?
○ On List: July 2017
○ A lot of duplication and patching
○ Would upstream take patches?
● RFC: Update 2a ourselves
○ On List: Oct 2017
○ Included WIP FP16 work
○ Also a lot of error prone duplication
Cambridge Sprint
● Nov 2017
○ Pair programming (Alex & Rich)
○ New re-factored SoftFloat
● V1 re-factor softfloat and add fp16 functions
○ On List: Dec 2017
○ 4 files changed, 3066 insertions(+), 3850 deletions(-)
○ No ARM FP16 in series
● V3 re-factor softfloat and add fp16 functions
○ On List: Jan 2018
○ 49 files changed, 2188 insertions(+), 2940 deletions(-)
○ but….
Clearer, slower code
Packed Parts
/*
* Structure holding all of the decomposed parts of a float. The
* exponent is unbiased and the fraction is normalized. All
* calculations are done with a 64 bit fraction and then rounded as
* appropriate for the final format.
*
* Thanks to the packed FloatClass a decent compiler should be able to
* fit the whole structure into registers and avoid using the stack
* for parameter passing.
*/
typedef struct {
uint64_t frac;
int32_t exp;
FloatClass cls;
bool sign;
} FloatParts;
addsub_floats
/*
* Returns the result of adding or subtracting the floating-point
* values `a' and `b'. The operation is performed according to the
* IEC/IEEE Standard for Binary Floating-Point Arithmetic.
*/
float16 __attribute__((flatten)) float16_add(float16 a, float16 b, float_status *status)
{
FloatParts pa = float16_unpack_canonical(a, status);
FloatParts pb = float16_unpack_canonical(b, status);
FloatParts pr = addsub_floats(pa, pb, false, status);
return float16_round_pack_canonical(pr, status);
}
Performance
Other Instruction Patches
● ARMv8.1-SIMD
○ Additional Advanced SIMD instructions
○ Rounding Double Multiply Add/Subtract
○ Forms the base decode path for ARMv8.3-CompNum
● ARMv8.3-CompNum
○ Additional Advanced SIMD instructions
○ Complex add, multiply accumulate
○ Mandatory for SVE
● ARM v8.1 simd + v8.3 complex insns
○ Gated by SoftFloat FP16 work
○ V3 on list: Feb 2018
○ 9 files changed, 1133 insertions(+), 126 deletions(-)
○ Merged: Mar 2018
Other plumbing
● Decoder Script
○ Help generate regular decoder
○ Based on instruction bit patterns
● target/arm: Preparatory work for SVE
○ Expand SIMD registers for SVE
○ Access helpers
○ Migration Support
● linux-user support
○ Add extended signal frame
○ Important for testing with RISU
SVE Patches
● target/arm: Scalable Vector Extension
○ V2 on list: Feb 2018
○ 13 files changed, 11408 insertions(+), 91 deletions(-)
○ 67 patches
○ 99% complete linux-user support
○ No FFR for now
● Call for review
○ Now is the time to review and test
○ On course for 2.13 (est Aug 2018)
○ https://github.com/rth7680/qemu/tree/tgt-arm-sve-8
Demo
● Himeno Benchmark
○ Advanced Centre for Computing and Communication @ Riken
○ Incompressible Fluid Analysis
● LULESH Benchmark
○ Lawrence Livermore National Lab
○ Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH)
Potential Future Work
● First Fault/Non-fault loads
○ Simple handling of page boundaries
● SVE System Mode
○ Mostly system registers
● Performance Work
○ Better host register use?
○ Use host FPU instead of SoftFloat?
■ I posted an experimental hack with v4
■ Emilio Cota @UC posted RFC yesterday
Thank You
#HKG18
HKG18 keynotes and videos on: connect.linaro.org
For further information: www.linaro.org

Contenu connexe

Tendances

[UDS] Cloud Computing "pour les nuls" (Exemple avec LinShare)
[UDS] Cloud Computing "pour les nuls" (Exemple avec LinShare)[UDS] Cloud Computing "pour les nuls" (Exemple avec LinShare)
[UDS] Cloud Computing "pour les nuls" (Exemple avec LinShare)LINAGORA
 
Not so brief history of Linux Containers
Not so brief history of Linux ContainersNot so brief history of Linux Containers
Not so brief history of Linux ContainersKirill Kolyshkin
 
PyCon Poland 2016: Maintaining a high load Python project: typical mistakes
PyCon Poland 2016: Maintaining a high load Python project: typical mistakesPyCon Poland 2016: Maintaining a high load Python project: typical mistakes
PyCon Poland 2016: Maintaining a high load Python project: typical mistakesViach Kakovskyi
 
BKK16-203 Irq prediction or how to better estimate idle time
BKK16-203 Irq prediction or how to better estimate idle timeBKK16-203 Irq prediction or how to better estimate idle time
BKK16-203 Irq prediction or how to better estimate idle timeLinaro
 
.NET Memory Primer (Martin Kulov)
.NET Memory Primer (Martin Kulov).NET Memory Primer (Martin Kulov)
.NET Memory Primer (Martin Kulov)ITCamp
 
KDE Plasma Develop Intro
KDE Plasma Develop IntroKDE Plasma Develop Intro
KDE Plasma Develop Introcsslayer
 
What's missing from upstream kernel containers?
What's missing from upstream kernel containers?What's missing from upstream kernel containers?
What's missing from upstream kernel containers?Kirill Kolyshkin
 
Reliable Asynchronous Web Services on Java EE JOnAS server and Apache CXF
Reliable Asynchronous Web Services on Java EE JOnAS server and Apache CXFReliable Asynchronous Web Services on Java EE JOnAS server and Apache CXF
Reliable Asynchronous Web Services on Java EE JOnAS server and Apache CXFFlorent BENOIT
 
Datafying Bitcoins
Datafying BitcoinsDatafying Bitcoins
Datafying BitcoinsTariq Ahmad
 
td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020
td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020
td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020Taro L. Saito
 
Smartcom's control plane software, a customized version of FreeBSD by Boris A...
Smartcom's control plane software, a customized version of FreeBSD by Boris A...Smartcom's control plane software, a customized version of FreeBSD by Boris A...
Smartcom's control plane software, a customized version of FreeBSD by Boris A...eurobsdcon
 

Tendances (17)

[UDS] Cloud Computing "pour les nuls" (Exemple avec LinShare)
[UDS] Cloud Computing "pour les nuls" (Exemple avec LinShare)[UDS] Cloud Computing "pour les nuls" (Exemple avec LinShare)
[UDS] Cloud Computing "pour les nuls" (Exemple avec LinShare)
 
Not so brief history of Linux Containers
Not so brief history of Linux ContainersNot so brief history of Linux Containers
Not so brief history of Linux Containers
 
PyCon Poland 2016: Maintaining a high load Python project: typical mistakes
PyCon Poland 2016: Maintaining a high load Python project: typical mistakesPyCon Poland 2016: Maintaining a high load Python project: typical mistakes
PyCon Poland 2016: Maintaining a high load Python project: typical mistakes
 
BKK16-203 Irq prediction or how to better estimate idle time
BKK16-203 Irq prediction or how to better estimate idle timeBKK16-203 Irq prediction or how to better estimate idle time
BKK16-203 Irq prediction or how to better estimate idle time
 
.NET Memory Primer (Martin Kulov)
.NET Memory Primer (Martin Kulov).NET Memory Primer (Martin Kulov)
.NET Memory Primer (Martin Kulov)
 
KDE Plasma Develop Intro
KDE Plasma Develop IntroKDE Plasma Develop Intro
KDE Plasma Develop Intro
 
Meetup #28
Meetup #28Meetup #28
Meetup #28
 
gRPC: Beyond REST
gRPC: Beyond RESTgRPC: Beyond REST
gRPC: Beyond REST
 
What's missing from upstream kernel containers?
What's missing from upstream kernel containers?What's missing from upstream kernel containers?
What's missing from upstream kernel containers?
 
Reliable Asynchronous Web Services on Java EE JOnAS server and Apache CXF
Reliable Asynchronous Web Services on Java EE JOnAS server and Apache CXFReliable Asynchronous Web Services on Java EE JOnAS server and Apache CXF
Reliable Asynchronous Web Services on Java EE JOnAS server and Apache CXF
 
AnyEvent and Plack
AnyEvent and PlackAnyEvent and Plack
AnyEvent and Plack
 
Tokyocabinet
TokyocabinetTokyocabinet
Tokyocabinet
 
Datafying Bitcoins
Datafying BitcoinsDatafying Bitcoins
Datafying Bitcoins
 
td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020
td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020
td-spark internals: Extending Spark with Airframe - Spark Meetup Tokyo #3 2020
 
Docker
DockerDocker
Docker
 
Smartcom's control plane software, a customized version of FreeBSD by Boris A...
Smartcom's control plane software, a customized version of FreeBSD by Boris A...Smartcom's control plane software, a customized version of FreeBSD by Boris A...
Smartcom's control plane software, a customized version of FreeBSD by Boris A...
 
Sprint 19
Sprint 19Sprint 19
Sprint 19
 

Similaire à HKG18-TR08 - Upstreaming SVE in QEMU

Dev.bg DevOps March 2024 Monitoring & Logging
Dev.bg DevOps March 2024 Monitoring & LoggingDev.bg DevOps March 2024 Monitoring & Logging
Dev.bg DevOps March 2024 Monitoring & LoggingMarian Marinov
 
BUD17-405: Building a reference IoT product with Zephyr
BUD17-405: Building a reference IoT product with Zephyr BUD17-405: Building a reference IoT product with Zephyr
BUD17-405: Building a reference IoT product with Zephyr Linaro
 
Kubernetes from scratch at veepee sysadmins days 2019
Kubernetes from scratch at veepee   sysadmins days 2019Kubernetes from scratch at veepee   sysadmins days 2019
Kubernetes from scratch at veepee sysadmins days 2019🔧 Loïc BLOT
 
LMG Lightning Talks - SFO17-205
LMG Lightning Talks - SFO17-205LMG Lightning Talks - SFO17-205
LMG Lightning Talks - SFO17-205Linaro
 
MOVED: The challenge of SVE in QEMU - SFO17-103
MOVED: The challenge of SVE in QEMU - SFO17-103MOVED: The challenge of SVE in QEMU - SFO17-103
MOVED: The challenge of SVE in QEMU - SFO17-103Linaro
 
LAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to EmbeddedLAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to EmbeddedLinaro
 
What's New in NGINX Plus R12?
What's New in NGINX Plus R12? What's New in NGINX Plus R12?
What's New in NGINX Plus R12? NGINX, Inc.
 
How to build TiDB
How to build TiDBHow to build TiDB
How to build TiDBPingCAP
 
A Kong retrospective: from 0.10 to 0.13
A Kong retrospective: from 0.10 to 0.13A Kong retrospective: from 0.10 to 0.13
A Kong retrospective: from 0.10 to 0.13Thibault Charbonnier
 
Improving Kafka at-least-once performance at Uber
Improving Kafka at-least-once performance at UberImproving Kafka at-least-once performance at Uber
Improving Kafka at-least-once performance at UberYing Zheng
 
Logs @ OVHcloud
Logs @ OVHcloudLogs @ OVHcloud
Logs @ OVHcloudOVHcloud
 
Mirko Damiani - An Embedded soft real time distributed system in Go
Mirko Damiani - An Embedded soft real time distributed system in GoMirko Damiani - An Embedded soft real time distributed system in Go
Mirko Damiani - An Embedded soft real time distributed system in Golinuxlab_conf
 
OpenTelemetry For Operators
OpenTelemetry For OperatorsOpenTelemetry For Operators
OpenTelemetry For OperatorsKevin Brockhoff
 
Adding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux Device
Adding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux DeviceAdding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux Device
Adding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux DeviceSamsung Open Source Group
 
uReplicator: Uber Engineering’s Scalable, Robust Kafka Replicator
uReplicator: Uber Engineering’s Scalable,  Robust Kafka ReplicatoruReplicator: Uber Engineering’s Scalable,  Robust Kafka Replicator
uReplicator: Uber Engineering’s Scalable, Robust Kafka ReplicatorMichael Hongliang Xu
 
Mobicents Summit 2012 - Alexandre Mendonca - Mobicents jDiameter
Mobicents Summit 2012 - Alexandre Mendonca - Mobicents jDiameterMobicents Summit 2012 - Alexandre Mendonca - Mobicents jDiameter
Mobicents Summit 2012 - Alexandre Mendonca - Mobicents jDiametertelestax
 
Bsdtw17: lightning talks/wip sessions
Bsdtw17: lightning talks/wip sessionsBsdtw17: lightning talks/wip sessions
Bsdtw17: lightning talks/wip sessionsScott Tsai
 
Introducing TiDB @ SF DevOps Meetup
Introducing TiDB @ SF DevOps MeetupIntroducing TiDB @ SF DevOps Meetup
Introducing TiDB @ SF DevOps MeetupKevin Xu
 
Unifying Frontend and Backend Development with Scala - ScalaCon 2021
Unifying Frontend and Backend Development with Scala - ScalaCon 2021Unifying Frontend and Backend Development with Scala - ScalaCon 2021
Unifying Frontend and Backend Development with Scala - ScalaCon 2021Taro L. Saito
 
Kubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKevin Lynch
 

Similaire à HKG18-TR08 - Upstreaming SVE in QEMU (20)

Dev.bg DevOps March 2024 Monitoring & Logging
Dev.bg DevOps March 2024 Monitoring & LoggingDev.bg DevOps March 2024 Monitoring & Logging
Dev.bg DevOps March 2024 Monitoring & Logging
 
BUD17-405: Building a reference IoT product with Zephyr
BUD17-405: Building a reference IoT product with Zephyr BUD17-405: Building a reference IoT product with Zephyr
BUD17-405: Building a reference IoT product with Zephyr
 
Kubernetes from scratch at veepee sysadmins days 2019
Kubernetes from scratch at veepee   sysadmins days 2019Kubernetes from scratch at veepee   sysadmins days 2019
Kubernetes from scratch at veepee sysadmins days 2019
 
LMG Lightning Talks - SFO17-205
LMG Lightning Talks - SFO17-205LMG Lightning Talks - SFO17-205
LMG Lightning Talks - SFO17-205
 
MOVED: The challenge of SVE in QEMU - SFO17-103
MOVED: The challenge of SVE in QEMU - SFO17-103MOVED: The challenge of SVE in QEMU - SFO17-103
MOVED: The challenge of SVE in QEMU - SFO17-103
 
LAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to EmbeddedLAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
 
What's New in NGINX Plus R12?
What's New in NGINX Plus R12? What's New in NGINX Plus R12?
What's New in NGINX Plus R12?
 
How to build TiDB
How to build TiDBHow to build TiDB
How to build TiDB
 
A Kong retrospective: from 0.10 to 0.13
A Kong retrospective: from 0.10 to 0.13A Kong retrospective: from 0.10 to 0.13
A Kong retrospective: from 0.10 to 0.13
 
Improving Kafka at-least-once performance at Uber
Improving Kafka at-least-once performance at UberImproving Kafka at-least-once performance at Uber
Improving Kafka at-least-once performance at Uber
 
Logs @ OVHcloud
Logs @ OVHcloudLogs @ OVHcloud
Logs @ OVHcloud
 
Mirko Damiani - An Embedded soft real time distributed system in Go
Mirko Damiani - An Embedded soft real time distributed system in GoMirko Damiani - An Embedded soft real time distributed system in Go
Mirko Damiani - An Embedded soft real time distributed system in Go
 
OpenTelemetry For Operators
OpenTelemetry For OperatorsOpenTelemetry For Operators
OpenTelemetry For Operators
 
Adding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux Device
Adding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux DeviceAdding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux Device
Adding IEEE 802.15.4 and 6LoWPAN to an Embedded Linux Device
 
uReplicator: Uber Engineering’s Scalable, Robust Kafka Replicator
uReplicator: Uber Engineering’s Scalable,  Robust Kafka ReplicatoruReplicator: Uber Engineering’s Scalable,  Robust Kafka Replicator
uReplicator: Uber Engineering’s Scalable, Robust Kafka Replicator
 
Mobicents Summit 2012 - Alexandre Mendonca - Mobicents jDiameter
Mobicents Summit 2012 - Alexandre Mendonca - Mobicents jDiameterMobicents Summit 2012 - Alexandre Mendonca - Mobicents jDiameter
Mobicents Summit 2012 - Alexandre Mendonca - Mobicents jDiameter
 
Bsdtw17: lightning talks/wip sessions
Bsdtw17: lightning talks/wip sessionsBsdtw17: lightning talks/wip sessions
Bsdtw17: lightning talks/wip sessions
 
Introducing TiDB @ SF DevOps Meetup
Introducing TiDB @ SF DevOps MeetupIntroducing TiDB @ SF DevOps Meetup
Introducing TiDB @ SF DevOps Meetup
 
Unifying Frontend and Backend Development with Scala - ScalaCon 2021
Unifying Frontend and Backend Development with Scala - ScalaCon 2021Unifying Frontend and Backend Development with Scala - ScalaCon 2021
Unifying Frontend and Backend Development with Scala - ScalaCon 2021
 
Kubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the DatacenterKubernetes @ Squarespace: Kubernetes in the Datacenter
Kubernetes @ Squarespace: Kubernetes in the Datacenter
 

Plus de Linaro

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloLinaro
 
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaArm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaLinaro
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraLinaro
 
Bud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaBud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaLinaro
 
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018Linaro
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018Linaro
 
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...Linaro
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Linaro
 
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Linaro
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Linaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineLinaro
 
HKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteHKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteLinaro
 
HKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopHKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopLinaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineLinaro
 
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allHKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allLinaro
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorLinaro
 
HKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MHKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MLinaro
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation Linaro
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootLinaro
 
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...Linaro
 

Plus de Linaro (20)

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
 
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaArm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
 
Bud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaBud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qa
 
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018
 
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
 
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
HKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteHKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening Keynote
 
HKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopHKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP Workshop
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
 
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allHKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
 
HKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MHKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8M
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted boot
 
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
HKG18-500K1 - Keynote: Dileep Bhandarkar - Emerging Computing Trends in the D...
 

Dernier

ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 

Dernier (20)

ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 

HKG18-TR08 - Upstreaming SVE in QEMU

  • 1. HKG18-TR08: Upstreaming SVE in QEMU Alex Bennée and Richard Henderson
  • 2. Contents ● Introductions ○ Who we are ○ What QEMU is ● The QEMU Project ○ Development Process ○ Upstreaming Criteria ● SVE Work ○ Native Vectors for TCG ○ ARMv8.2 FP16 ○ Remaining Prerequisites ○ SVE instructions ● Demo!
  • 3. Who are we? Alex Bennée Linaro Virtualization Engineer alex.bennee@linaro.org IRC: ajb-linaro/stsquad Richard Henderson Linaro Virtualization Engineer richard.henderson@linaro.org IRC: rth
  • 4. What is QEMU? “QEMU is a generic and open source machine emulator and virtualizer” ● Device Emulation for HW virtualization ○ KVM ○ HAX ○ HVF ○ Xen ● Full System Emulation ○ 20 21 guest architectures ○ TCG JIT engine ○ 7 host architectures
  • 6. Steady Development State of QEMU, Paolo Bonzini - KVM Forum 2017
  • 7. QEMU Development Basics ● Source Code ○ GIT: https://git.qemu.org ○ Master repository with submodules ○ PULL Requests merged by Head Maintainer ■ Sub-maintainers send pull-requests via list ● Discussion ○ Main List: qemu-devel@nongnu.org ○ Sub-lists: ■ ARM specific: qemu-arm@nongnu.org ■ Security: qemu-security@nongnu.org ■ Trivial Patches: qemu-trivial@nongnu.org ○ IRC: #qemu on OFTC network ● Documentation ○ Source Tree: ./docs ○ Wiki: https://wiki.qemu.org/Main_Page
  • 8.
  • 9. Maintainer/Contributor Retention since 2.0 State of QEMU, Paolo Bonzini - KVM Forum 2017
  • 10. Upstreaming Criteria ● No Regressions ○ compiles! ○ make check ○ tested ● Acceptable Code ○ Follows CODING_STYLE ○ Signed-off-by: J Hacker <j.hacker@example.com> ○ Reviewed ○ Bisectable
  • 11. ARM’s Scalable Vector Extensions ● New SIMD extension for Mobile to HPC ● IMPDEF Vector Size (128-2048* bit) ● Multiple Floating Point Precisions ○ N x 64 bit (double precision) ○ 2N x 32 bit (single precision) ○ 4N x 16 bit (half precision) ● For more details see: ○ ARM’s own blog post ○ The Challenge of SVE in QEMU (SFO17) ○ Vectors Meet Virtualization (FOSDEM18)
  • 13. Vectors in Tiny Code Generator (TCG) ● Vector Sizes ○ Larger registers (128+ bits) ○ Potentially Smaller Lanes (16 bit, 8 bit) ● TCG Op Sizes ○ TCGv_i32 ○ TCGv_i64
  • 14. 128 bit Vector Register Layout
  • 15. Marshalling Costs ● For generated code ○ More TCGops per guest instruction ○ Backend restricted to scalar instructions ○ Optimiser has to be simple and fast ● Helper Functions ○ As above but ○ Creating C ABI frame and procedure call ○ Helpers don’t vectorise ● See also: Vectoring in on QEMUs TCG Engine, KVM Forum 2017
  • 16. RFC Patches ● Request For Comment ○ “I’ve had this idea.. what do you think?” ○ Doesn’t have to be complete ● TCG_vec RFC ○ AJB : August 17th 2017 ○ 9 files changed, 222 insertions(+), 10 deletions(-) ○ RTH: August 17th 2017 ○ 11 files changed, 1817 insertions(+), 94 deletions(-)
  • 17. Ready for Upstream ● Version 11 ○ On List: Jan 26th 2018 ○ 25 files changed, 6973 insertions(+), 496 deletions(-) ○ Existing AArch64 guest translator converted to API ○ Existing x86 and AArch64 host generators extended to API ● Pull Request via TCG ○ V1 On List: Feb 7th 2018 ○ Failed pre-merge tests ○ V2 On List: Feb 8th 2018 ● Merged in master ○ since: Feb 8th 2018 ○ in upcoming v2.12
  • 18. Half Precision Floating Point ● ARMv8.2 FP16 ○ Optional feature for v8.2 ○ Mandatory for SVE ● 16 bit width ○ 5 bit exponent ○ 10 bit fraction ○ 1 sign bit ● Trade Bandwidth for Accuracy ○ Twice as many calculations than single-precision ○ Often enough accuracy ■ Some graphics calculations ■ Machine learning ● Requires SoftFloat
  • 19. Softfloat ● Why? ○ Differing number formats (e.g. ARM alternative format) ○ NaN propagation ○ Default NaN ○ Exception Mappings ● Berkeley SoftFloat ○ John R. Hauser, BSD Licensed ○ Tarball release ○ Current release 3c ● QEMU uses ○ Version 2a ○ Numerous tweaks
  • 20. Softfloat Options ● Initial Discussions ○ Our SoftFloat heavily patched ○ On List: May 2017 ○ Patch Existing, Use 3c (replace/incremental), Use glibc? ● RFC: Import SoftFloat3? ○ On List: July 2017 ○ A lot of duplication and patching ○ Would upstream take patches? ● RFC: Update 2a ourselves ○ On List: Oct 2017 ○ Included WIP FP16 work ○ Also a lot of error prone duplication
  • 21. Cambridge Sprint ● Nov 2017 ○ Pair programming (Alex & Rich) ○ New re-factored SoftFloat ● V1 re-factor softfloat and add fp16 functions ○ On List: Dec 2017 ○ 4 files changed, 3066 insertions(+), 3850 deletions(-) ○ No ARM FP16 in series ● V3 re-factor softfloat and add fp16 functions ○ On List: Jan 2018 ○ 49 files changed, 2188 insertions(+), 2940 deletions(-) ○ but….
  • 23. Packed Parts /* * Structure holding all of the decomposed parts of a float. The * exponent is unbiased and the fraction is normalized. All * calculations are done with a 64 bit fraction and then rounded as * appropriate for the final format. * * Thanks to the packed FloatClass a decent compiler should be able to * fit the whole structure into registers and avoid using the stack * for parameter passing. */ typedef struct { uint64_t frac; int32_t exp; FloatClass cls; bool sign; } FloatParts;
  • 24. addsub_floats /* * Returns the result of adding or subtracting the floating-point * values `a' and `b'. The operation is performed according to the * IEC/IEEE Standard for Binary Floating-Point Arithmetic. */ float16 __attribute__((flatten)) float16_add(float16 a, float16 b, float_status *status) { FloatParts pa = float16_unpack_canonical(a, status); FloatParts pb = float16_unpack_canonical(b, status); FloatParts pr = addsub_floats(pa, pb, false, status); return float16_round_pack_canonical(pr, status); }
  • 26. Other Instruction Patches ● ARMv8.1-SIMD ○ Additional Advanced SIMD instructions ○ Rounding Double Multiply Add/Subtract ○ Forms the base decode path for ARMv8.3-CompNum ● ARMv8.3-CompNum ○ Additional Advanced SIMD instructions ○ Complex add, multiply accumulate ○ Mandatory for SVE ● ARM v8.1 simd + v8.3 complex insns ○ Gated by SoftFloat FP16 work ○ V3 on list: Feb 2018 ○ 9 files changed, 1133 insertions(+), 126 deletions(-) ○ Merged: Mar 2018
  • 27. Other plumbing ● Decoder Script ○ Help generate regular decoder ○ Based on instruction bit patterns ● target/arm: Preparatory work for SVE ○ Expand SIMD registers for SVE ○ Access helpers ○ Migration Support ● linux-user support ○ Add extended signal frame ○ Important for testing with RISU
  • 28. SVE Patches ● target/arm: Scalable Vector Extension ○ V2 on list: Feb 2018 ○ 13 files changed, 11408 insertions(+), 91 deletions(-) ○ 67 patches ○ 99% complete linux-user support ○ No FFR for now ● Call for review ○ Now is the time to review and test ○ On course for 2.13 (est Aug 2018) ○ https://github.com/rth7680/qemu/tree/tgt-arm-sve-8
  • 29. Demo ● Himeno Benchmark ○ Advanced Centre for Computing and Communication @ Riken ○ Incompressible Fluid Analysis ● LULESH Benchmark ○ Lawrence Livermore National Lab ○ Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH)
  • 30. Potential Future Work ● First Fault/Non-fault loads ○ Simple handling of page boundaries ● SVE System Mode ○ Mostly system registers ● Performance Work ○ Better host register use? ○ Use host FPU instead of SoftFloat? ■ I posted an experimental hack with v4 ■ Emilio Cota @UC posted RFC yesterday
  • 31.
  • 32. Thank You #HKG18 HKG18 keynotes and videos on: connect.linaro.org For further information: www.linaro.org