SlideShare a Scribd company logo
1 of 13
Download to read offline
Xen RAS Status and Progress


                  Dugger, Donald D
                       Liu, Jinsong
                    Jiang, Yunhong
Agenda
• Xen RAS overview
• Xen RAS latest progress
   – Core error recovery
   – APEI support
   – Robust enhancement
• Call for co-work




                       Intel Confidential
                                            2
Xen RAS overview
• Xen RAS motivation
   – Error affects many VMs
   – Xen RAS: error contained and handled accordingly


• Error Handling
   – CPU/Memory error: MCA (Machine Check Architecture)
   – I/O error: AER (Advanced Error Reporting)
   – ACPI Platform Error Interfaces




                            Intel Confidential
                                                          3
MCA: Machine Check Architecture
dom0       User space tools (FMA/ Mcelog)                         domU




                  vIRQ handler       vMCE handler           vMCE handler



                    vIRQ                  vMCA
                                                                 vMCA


                                                                           XEN
 Recover action

  page offline                                                          system panic
                                      Xen MCA handler                      & reset
  cpu offline


                                     Polling          MCE/CMCI


                                                CPU                      HW




                           Intel Confidential
                                                                                       4
Xen RAS status
Item                    Status                     Comments
MCA infrastructure      supported                  Move from dom0 to hypervisor

CE and UCNA             supported                  Userspace tools logging and analysis

Uncore error recovery   supported                  Memory scrubbing error
                                                   L3 explicit write-back error
Core error recovery     WIP                        Data load error
                                                   Instruction fetch error
APEI BERT               WAIT                       Dom0 own, wait kernel ready

APEI ERST               supported                  Dom0 and hypervisor co-work

APEI EINJ               supported                  Dom0 own

APEI HEST/GHES          WIP                        Dom0 and hypervisor co-work



                              Intel Confidential
                                                                                          5
Agenda
• Xen RAS overview
• Xen RAS latest progress
   – Core error recovery
   – APEI support
   – Robust enhancement
• Call for co-work




                       Intel Confidential
                                            6
Xen RAS latest progress
• Core error recovery
   – A new MCA error type, error in current processor execution context
   – CPU tag it as action required, must deal with before execution resume
   – Currently 2 type of architecturally defined core error:
          •   Data Load Error
          •   Instruction Fetch Error


• APEI support
   – ACPI Platform Error Interfaces
   – Bring existing h/w error mechanism together as a coherent infrastructure
   – Consists of 4 separate tables
          •   Boot Error Record Table
          •   Error Record Serialization Table
          •   Error Injection Table
          •   Hardware Error Source Table
    –   Linux3.0 as dom0 save us much effort
          •   Many dom0 APEI reuse
          •   Little maintain effort, benefit from kernel improvement


                                        Intel Confidential
                                                                                7
Core Error Recovery
• Xen core error recovery
   – Basically same MCA infrastructure as uncore error recovery
   – MCE exception ISR
         •   MCE broadcast to all logical processors
         •   Error in range of hypervisor/guest
   –   If in hypervisor
         •   Reset system
              –   Worst case, cannot resume execution

   –   If in guest
         •   Trigger vMCE to affected guest
         •   Trigger vIRQ to dom0 for logging
         •   Error contained in guest
              –   Medium case, error in guest kernel, kill the guest
              –   Best case, error in guest app, kill the app

   –   Code done, need kernel core recovery to do fine-grain test



                                     Intel Confidential
                                                                       8
APEI support
• Xen APEI support
  –   BERT
       •   BOOT Error Record Table
             –   For unhandled fatal error occurred in a previous boot
       •   Xen BERT
             –   Dom0 own, wait kernel BERT ready


  –   ERST
       •   Error Record Serialization Table
             –   Save/retrieve fatal error to/from persistent storage
       •   Hypervisor ERST:
             –   Save error
       •   Dom0 ERST:
             –   Retrieve/clear error




                                        Intel Confidential
                                                                         9
APEI support
• Xen APEI support
  –   EINJ
       •     Error Injection table
              –   Mechanism through which OSPM can inject h/w errors
       •     Xen EINJ
              –   Dom0 own
              –   Test done based on current bios available error types


  –   HEST
       •     Hardware Error Source Table
              –   Platform level description of error sources and error notifications
       •     Xen HEST
              –   Dom0 own SCI logic because of acpica
              –   Hypervisor own NMI logic, Xen APEI NMI handler currently not ready
              –   Need bios ready for more error sources and notifications




                                       Intel Confidential
                                                                                        10
Robust enhancement
• Xen RAS robust enhancement
  –   Xen RAS robust challenge
        •   Buggy bios
        •   Some error types not h/w supported yet
        •   Hard to trigger errors and do auto test
  –   Our work to enhance Xen RAS robust
        •   Do some code cleanup & enhancement
        •   Current supported errors were triggered and tested
        •   QA add error-simulator tools and auto test script
        •   EINJ enabling help debug & test greatly
        •   Robust enhancement will continue w/ new platform support more error types




                                   Intel Confidential
                                                                                        11
Agenda
• Xen RAS overview
• Xen RAS latest progress
   – Core error recovery
   – APEI support
   – Robust enhancement
• Call for co-work




                       Intel Confidential
                                            12
Call for co-work
• I/O error handling
   – PCIe AER, Advanced Error Reporting
   – For device assign to dom0/pv domU
         •   Basically reuse dom0/domU AER logic
   –   For device assign to hvm
         •   Need PCIe AER support at qemu
              –   Some VALinux work on standard qemu
              –   Porting to Xen qemu with AER support




                                     Intel Confidential
                                                          13

More Related Content

What's hot

Rootlinux17: An introduction to Xen Project Virtualisation
Rootlinux17:  An introduction to Xen Project VirtualisationRootlinux17:  An introduction to Xen Project Virtualisation
Rootlinux17: An introduction to Xen Project VirtualisationThe Linux Foundation
 
Bare-Metal Hypervisor as a Platform for Innovation
Bare-Metal Hypervisor as a Platform for InnovationBare-Metal Hypervisor as a Platform for Innovation
Bare-Metal Hypervisor as a Platform for InnovationThe Linux Foundation
 
Xen PV Performance Status and Optimization Opportunities
Xen PV Performance Status and Optimization OpportunitiesXen PV Performance Status and Optimization Opportunities
Xen PV Performance Status and Optimization OpportunitiesThe Linux Foundation
 
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM Systems
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM SystemsXPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM Systems
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM SystemsThe Linux Foundation
 
Citrix XenServer 5.5 Troubleshooting
Citrix XenServer 5.5 TroubleshootingCitrix XenServer 5.5 Troubleshooting
Citrix XenServer 5.5 TroubleshootingThomas Krampe
 
Linaro connect : Introduction to Xen on ARM
Linaro connect : Introduction to Xen on ARMLinaro connect : Introduction to Xen on ARM
Linaro connect : Introduction to Xen on ARMThe Linux Foundation
 
XPDDS18: Windows PV Drivers Project: Status and Updates - Paul Durrant, Citri...
XPDDS18: Windows PV Drivers Project: Status and Updates - Paul Durrant, Citri...XPDDS18: Windows PV Drivers Project: Status and Updates - Paul Durrant, Citri...
XPDDS18: Windows PV Drivers Project: Status and Updates - Paul Durrant, Citri...The Linux Foundation
 
Xen Project Hypervisor for the Cloud
Xen Project Hypervisor for the CloudXen Project Hypervisor for the Cloud
Xen Project Hypervisor for the CloudThe Linux Foundation
 
Xen & the Art of Virtualization
Xen & the Art of VirtualizationXen & the Art of Virtualization
Xen & the Art of VirtualizationTareque Hossain
 
XPDDS18: The Art of Virtualizing Cache Maintenance - Julien Grall, Arm
XPDDS18: The Art of Virtualizing Cache Maintenance - Julien Grall, ArmXPDDS18: The Art of Virtualizing Cache Maintenance - Julien Grall, Arm
XPDDS18: The Art of Virtualizing Cache Maintenance - Julien Grall, ArmThe Linux Foundation
 
XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...
XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...
XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...The Linux Foundation
 
Xen Project 15 Years down the Line
Xen Project 15 Years down the LineXen Project 15 Years down the Line
Xen Project 15 Years down the LineThe Linux Foundation
 
Dealing with Hardware Heterogeneity Using EmbeddedXEN, a Virtualization Frame...
Dealing with Hardware Heterogeneity Using EmbeddedXEN, a Virtualization Frame...Dealing with Hardware Heterogeneity Using EmbeddedXEN, a Virtualization Frame...
Dealing with Hardware Heterogeneity Using EmbeddedXEN, a Virtualization Frame...The Linux Foundation
 

What's hot (20)

Ian Pratt Nsdi Keynote Apr2008
Ian Pratt Nsdi Keynote Apr2008Ian Pratt Nsdi Keynote Apr2008
Ian Pratt Nsdi Keynote Apr2008
 
Rootlinux17: An introduction to Xen Project Virtualisation
Rootlinux17:  An introduction to Xen Project VirtualisationRootlinux17:  An introduction to Xen Project Virtualisation
Rootlinux17: An introduction to Xen Project Virtualisation
 
Bare-Metal Hypervisor as a Platform for Innovation
Bare-Metal Hypervisor as a Platform for InnovationBare-Metal Hypervisor as a Platform for Innovation
Bare-Metal Hypervisor as a Platform for Innovation
 
Xen Hypervisor
Xen HypervisorXen Hypervisor
Xen Hypervisor
 
Xen PV Performance Status and Optimization Opportunities
Xen PV Performance Status and Optimization OpportunitiesXen PV Performance Status and Optimization Opportunities
Xen PV Performance Status and Optimization Opportunities
 
PVH : PV Guest in HVM container
PVH : PV Guest in HVM containerPVH : PV Guest in HVM container
PVH : PV Guest in HVM container
 
XS Boston 2008 Quantitative
XS Boston 2008 QuantitativeXS Boston 2008 Quantitative
XS Boston 2008 Quantitative
 
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM Systems
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM SystemsXPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM Systems
XPDDS18: CPUFreq in Xen on ARM - Oleksandr Tyshchenko, EPAM Systems
 
Citrix XenServer 5.5 Troubleshooting
Citrix XenServer 5.5 TroubleshootingCitrix XenServer 5.5 Troubleshooting
Citrix XenServer 5.5 Troubleshooting
 
Linaro connect : Introduction to Xen on ARM
Linaro connect : Introduction to Xen on ARMLinaro connect : Introduction to Xen on ARM
Linaro connect : Introduction to Xen on ARM
 
Xen & virtualization
Xen & virtualizationXen & virtualization
Xen & virtualization
 
Why xen slides
Why xen slidesWhy xen slides
Why xen slides
 
XPDDS18: Windows PV Drivers Project: Status and Updates - Paul Durrant, Citri...
XPDDS18: Windows PV Drivers Project: Status and Updates - Paul Durrant, Citri...XPDDS18: Windows PV Drivers Project: Status and Updates - Paul Durrant, Citri...
XPDDS18: Windows PV Drivers Project: Status and Updates - Paul Durrant, Citri...
 
Xen Project Hypervisor for the Cloud
Xen Project Hypervisor for the CloudXen Project Hypervisor for the Cloud
Xen Project Hypervisor for the Cloud
 
Xen & the Art of Virtualization
Xen & the Art of VirtualizationXen & the Art of Virtualization
Xen & the Art of Virtualization
 
XPDDS18: The Art of Virtualizing Cache Maintenance - Julien Grall, Arm
XPDDS18: The Art of Virtualizing Cache Maintenance - Julien Grall, ArmXPDDS18: The Art of Virtualizing Cache Maintenance - Julien Grall, Arm
XPDDS18: The Art of Virtualizing Cache Maintenance - Julien Grall, Arm
 
XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...
XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...
XPDDS19: How TrenchBoot is Enabling Measured Launch for Open-Source Platform ...
 
Xen Project 15 Years down the Line
Xen Project 15 Years down the LineXen Project 15 Years down the Line
Xen Project 15 Years down the Line
 
LFCOLLAB15: Xen 4.5 and Beyond
LFCOLLAB15: Xen 4.5 and BeyondLFCOLLAB15: Xen 4.5 and Beyond
LFCOLLAB15: Xen 4.5 and Beyond
 
Dealing with Hardware Heterogeneity Using EmbeddedXEN, a Virtualization Frame...
Dealing with Hardware Heterogeneity Using EmbeddedXEN, a Virtualization Frame...Dealing with Hardware Heterogeneity Using EmbeddedXEN, a Virtualization Frame...
Dealing with Hardware Heterogeneity Using EmbeddedXEN, a Virtualization Frame...
 

Similar to Xen RAS Status and Progress

20 christian ferber xen_server_6_workshop
20 christian ferber xen_server_6_workshop20 christian ferber xen_server_6_workshop
20 christian ferber xen_server_6_workshopDigicomp Academy AG
 
Xen Project Update LinuxCon Brazil
Xen Project Update LinuxCon BrazilXen Project Update LinuxCon Brazil
Xen Project Update LinuxCon BrazilThe Linux Foundation
 
XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...
XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...
XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...Peter Ocasek
 
Track A-Shmuel Panijel, Windriver
Track A-Shmuel Panijel, WindriverTrack A-Shmuel Panijel, Windriver
Track A-Shmuel Panijel, Windriverchiportal
 
COSMIC: Middleware for Xeon Phi Servers and Clusters
COSMIC: Middleware for Xeon Phi Servers and ClustersCOSMIC: Middleware for Xeon Phi Servers and Clusters
COSMIC: Middleware for Xeon Phi Servers and Clustersinside-BigData.com
 
Xen Euro Par07
Xen Euro Par07Xen Euro Par07
Xen Euro Par07congvc
 
Xen and the Art of Virtualization
Xen and the Art of VirtualizationXen and the Art of Virtualization
Xen and the Art of VirtualizationSusheel Thakur
 
Virtualization Primer for Java Developers
Virtualization Primer for Java DevelopersVirtualization Primer for Java Developers
Virtualization Primer for Java DevelopersRichard McDougall
 
XPDS16: Xen Scalability Analysis - Weidong Han, Zhichao Huang & Wei Yang, Huawei
XPDS16: Xen Scalability Analysis - Weidong Han, Zhichao Huang & Wei Yang, HuaweiXPDS16: Xen Scalability Analysis - Weidong Han, Zhichao Huang & Wei Yang, Huawei
XPDS16: Xen Scalability Analysis - Weidong Han, Zhichao Huang & Wei Yang, HuaweiThe Linux Foundation
 
Deploying Maximum HA Architecture With PostgreSQL
Deploying Maximum HA Architecture With PostgreSQLDeploying Maximum HA Architecture With PostgreSQL
Deploying Maximum HA Architecture With PostgreSQLDenish Patel
 
Advancedperformancetroubleshootingusingesxtop 101110131727-phpapp02
Advancedperformancetroubleshootingusingesxtop 101110131727-phpapp02Advancedperformancetroubleshootingusingesxtop 101110131727-phpapp02
Advancedperformancetroubleshootingusingesxtop 101110131727-phpapp02Suresh Kumar
 

Similar to Xen RAS Status and Progress (20)

Xen Summit 2009 Shanghai Ras
Xen Summit 2009 Shanghai RasXen Summit 2009 Shanghai Ras
Xen Summit 2009 Shanghai Ras
 
Chen Haibo
Chen HaiboChen Haibo
Chen Haibo
 
20 christian ferber xen_server_6_workshop
20 christian ferber xen_server_6_workshop20 christian ferber xen_server_6_workshop
20 christian ferber xen_server_6_workshop
 
XS Japan 2008 Citrix English
XS Japan 2008 Citrix EnglishXS Japan 2008 Citrix English
XS Japan 2008 Citrix English
 
Ina Pratt Fosdem Feb2008
Ina Pratt Fosdem Feb2008Ina Pratt Fosdem Feb2008
Ina Pratt Fosdem Feb2008
 
Xen Project Update LinuxCon Brazil
Xen Project Update LinuxCon BrazilXen Project Update LinuxCon Brazil
Xen Project Update LinuxCon Brazil
 
XS Boston 2008 OVF
XS Boston 2008 OVFXS Boston 2008 OVF
XS Boston 2008 OVF
 
XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...
XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...
XenServer 5.5 - Czy można zaoszczędzić na wirtualizacji serwerów? Darmowy Xen...
 
XS Boston 2008 VT-D PCI
XS Boston 2008 VT-D PCIXS Boston 2008 VT-D PCI
XS Boston 2008 VT-D PCI
 
Nakajima hvm-be final
Nakajima hvm-be finalNakajima hvm-be final
Nakajima hvm-be final
 
Track A-Shmuel Panijel, Windriver
Track A-Shmuel Panijel, WindriverTrack A-Shmuel Panijel, Windriver
Track A-Shmuel Panijel, Windriver
 
COSMIC: Middleware for Xeon Phi Servers and Clusters
COSMIC: Middleware for Xeon Phi Servers and ClustersCOSMIC: Middleware for Xeon Phi Servers and Clusters
COSMIC: Middleware for Xeon Phi Servers and Clusters
 
Xen Euro Par07
Xen Euro Par07Xen Euro Par07
Xen Euro Par07
 
XS Oracle 2009 PVOps
XS Oracle 2009 PVOpsXS Oracle 2009 PVOps
XS Oracle 2009 PVOps
 
Xen and the Art of Virtualization
Xen and the Art of VirtualizationXen and the Art of Virtualization
Xen and the Art of Virtualization
 
Virtualization Primer for Java Developers
Virtualization Primer for Java DevelopersVirtualization Primer for Java Developers
Virtualization Primer for Java Developers
 
XPDS16: Xen Scalability Analysis - Weidong Han, Zhichao Huang & Wei Yang, Huawei
XPDS16: Xen Scalability Analysis - Weidong Han, Zhichao Huang & Wei Yang, HuaweiXPDS16: Xen Scalability Analysis - Weidong Han, Zhichao Huang & Wei Yang, Huawei
XPDS16: Xen Scalability Analysis - Weidong Han, Zhichao Huang & Wei Yang, Huawei
 
Deploying Maximum HA Architecture With PostgreSQL
Deploying Maximum HA Architecture With PostgreSQLDeploying Maximum HA Architecture With PostgreSQL
Deploying Maximum HA Architecture With PostgreSQL
 
Advancedperformancetroubleshootingusingesxtop 101110131727-phpapp02
Advancedperformancetroubleshootingusingesxtop 101110131727-phpapp02Advancedperformancetroubleshootingusingesxtop 101110131727-phpapp02
Advancedperformancetroubleshootingusingesxtop 101110131727-phpapp02
 
XS Boston 2008 Self IO Emulation
XS Boston 2008 Self IO EmulationXS Boston 2008 Self IO Emulation
XS Boston 2008 Self IO Emulation
 

More from The Linux Foundation

ELC2019: Static Partitioning Made Simple
ELC2019: Static Partitioning Made SimpleELC2019: Static Partitioning Made Simple
ELC2019: Static Partitioning Made SimpleThe Linux Foundation
 
XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...
XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...
XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...The Linux Foundation
 
XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...
XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...
XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...The Linux Foundation
 
XPDDS19 Keynote: Unikraft Weather Report
XPDDS19 Keynote:  Unikraft Weather ReportXPDDS19 Keynote:  Unikraft Weather Report
XPDDS19 Keynote: Unikraft Weather ReportThe Linux Foundation
 
XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...
XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...
XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...The Linux Foundation
 
XPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, Xilinx
XPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, XilinxXPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, Xilinx
XPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, XilinxThe Linux Foundation
 
XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...
XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...
XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...The Linux Foundation
 
XPDDS19: Memories of a VM Funk - Mihai Donțu, Bitdefender
XPDDS19: Memories of a VM Funk - Mihai Donțu, BitdefenderXPDDS19: Memories of a VM Funk - Mihai Donțu, Bitdefender
XPDDS19: Memories of a VM Funk - Mihai Donțu, BitdefenderThe Linux Foundation
 
OSSJP/ALS19: The Road to Safety Certification: Overcoming Community Challeng...
OSSJP/ALS19:  The Road to Safety Certification: Overcoming Community Challeng...OSSJP/ALS19:  The Road to Safety Certification: Overcoming Community Challeng...
OSSJP/ALS19: The Road to Safety Certification: Overcoming Community Challeng...The Linux Foundation
 
OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making...
 OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making... OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making...
OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making...The Linux Foundation
 
XPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, Citrix
XPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, CitrixXPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, Citrix
XPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, CitrixThe Linux Foundation
 
XPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltd
XPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltdXPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltd
XPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltdThe Linux Foundation
 
XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...
XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...
XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...The Linux Foundation
 
XPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&D
XPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&DXPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&D
XPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&DThe Linux Foundation
 
XPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM Systems
XPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM SystemsXPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM Systems
XPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM SystemsThe Linux Foundation
 
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...The Linux Foundation
 
XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...
XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...
XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...The Linux Foundation
 
XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...
XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...
XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...The Linux Foundation
 
XPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSE
XPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSEXPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSE
XPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSEThe Linux Foundation
 
XPDDS19: Implementing AMD MxGPU - Jonathan Farrell, Assured Information Security
XPDDS19: Implementing AMD MxGPU - Jonathan Farrell, Assured Information SecurityXPDDS19: Implementing AMD MxGPU - Jonathan Farrell, Assured Information Security
XPDDS19: Implementing AMD MxGPU - Jonathan Farrell, Assured Information SecurityThe Linux Foundation
 

More from The Linux Foundation (20)

ELC2019: Static Partitioning Made Simple
ELC2019: Static Partitioning Made SimpleELC2019: Static Partitioning Made Simple
ELC2019: Static Partitioning Made Simple
 
XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...
XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...
XPDDS19 Keynote: Xen in Automotive - Artem Mygaiev, Director, Technology Solu...
 
XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...
XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...
XPDDS19 Keynote: Xen Project Weather Report 2019 - Lars Kurth, Director of Op...
 
XPDDS19 Keynote: Unikraft Weather Report
XPDDS19 Keynote:  Unikraft Weather ReportXPDDS19 Keynote:  Unikraft Weather Report
XPDDS19 Keynote: Unikraft Weather Report
 
XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...
XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...
XPDDS19 Keynote: Secret-free Hypervisor: Now and Future - Wei Liu, Software E...
 
XPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, Xilinx
XPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, XilinxXPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, Xilinx
XPDDS19 Keynote: Xen Dom0-less - Stefano Stabellini, Principal Engineer, Xilinx
 
XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...
XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...
XPDDS19 Keynote: Patch Review for Non-maintainers - George Dunlap, Citrix Sys...
 
XPDDS19: Memories of a VM Funk - Mihai Donțu, Bitdefender
XPDDS19: Memories of a VM Funk - Mihai Donțu, BitdefenderXPDDS19: Memories of a VM Funk - Mihai Donțu, Bitdefender
XPDDS19: Memories of a VM Funk - Mihai Donțu, Bitdefender
 
OSSJP/ALS19: The Road to Safety Certification: Overcoming Community Challeng...
OSSJP/ALS19:  The Road to Safety Certification: Overcoming Community Challeng...OSSJP/ALS19:  The Road to Safety Certification: Overcoming Community Challeng...
OSSJP/ALS19: The Road to Safety Certification: Overcoming Community Challeng...
 
OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making...
 OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making... OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making...
OSSJP/ALS19: The Road to Safety Certification: How the Xen Project is Making...
 
XPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, Citrix
XPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, CitrixXPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, Citrix
XPDDS19: Speculative Sidechannels and Mitigations - Andrew Cooper, Citrix
 
XPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltd
XPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltdXPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltd
XPDDS19: Keeping Coherency on Arm: Reborn - Julien Grall, Arm ltd
 
XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...
XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...
XPDDS19: QEMU PV Backend 'qdevification'... What Does it Mean? - Paul Durrant...
 
XPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&D
XPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&DXPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&D
XPDDS19: Status of PCI Emulation in Xen - Roger Pau Monné, Citrix Systems R&D
 
XPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM Systems
XPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM SystemsXPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM Systems
XPDDS19: [ARM] OP-TEE Mediator in Xen - Volodymyr Babchuk, EPAM Systems
 
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
XPDDS19: Bringing Xen to the Masses: The Story of Building a Community-driven...
 
XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...
XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...
XPDDS19: Will Robots Automate Your Job Away? Streamlining Xen Project Contrib...
 
XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...
XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...
XPDDS19: Client Virtualization Toolstack in Go - Nick Rosbrook & Brendan Kerr...
 
XPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSE
XPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSEXPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSE
XPDDS19: Core Scheduling in Xen - Jürgen Groß, SUSE
 
XPDDS19: Implementing AMD MxGPU - Jonathan Farrell, Assured Information Security
XPDDS19: Implementing AMD MxGPU - Jonathan Farrell, Assured Information SecurityXPDDS19: Implementing AMD MxGPU - Jonathan Farrell, Assured Information Security
XPDDS19: Implementing AMD MxGPU - Jonathan Farrell, Assured Information Security
 

Recently uploaded

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 

Recently uploaded (20)

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 

Xen RAS Status and Progress

  • 1. Xen RAS Status and Progress Dugger, Donald D Liu, Jinsong Jiang, Yunhong
  • 2. Agenda • Xen RAS overview • Xen RAS latest progress – Core error recovery – APEI support – Robust enhancement • Call for co-work Intel Confidential 2
  • 3. Xen RAS overview • Xen RAS motivation – Error affects many VMs – Xen RAS: error contained and handled accordingly • Error Handling – CPU/Memory error: MCA (Machine Check Architecture) – I/O error: AER (Advanced Error Reporting) – ACPI Platform Error Interfaces Intel Confidential 3
  • 4. MCA: Machine Check Architecture dom0 User space tools (FMA/ Mcelog) domU vIRQ handler vMCE handler vMCE handler vIRQ vMCA vMCA XEN Recover action page offline system panic Xen MCA handler & reset cpu offline Polling MCE/CMCI CPU HW Intel Confidential 4
  • 5. Xen RAS status Item Status Comments MCA infrastructure supported Move from dom0 to hypervisor CE and UCNA supported Userspace tools logging and analysis Uncore error recovery supported Memory scrubbing error L3 explicit write-back error Core error recovery WIP Data load error Instruction fetch error APEI BERT WAIT Dom0 own, wait kernel ready APEI ERST supported Dom0 and hypervisor co-work APEI EINJ supported Dom0 own APEI HEST/GHES WIP Dom0 and hypervisor co-work Intel Confidential 5
  • 6. Agenda • Xen RAS overview • Xen RAS latest progress – Core error recovery – APEI support – Robust enhancement • Call for co-work Intel Confidential 6
  • 7. Xen RAS latest progress • Core error recovery – A new MCA error type, error in current processor execution context – CPU tag it as action required, must deal with before execution resume – Currently 2 type of architecturally defined core error: • Data Load Error • Instruction Fetch Error • APEI support – ACPI Platform Error Interfaces – Bring existing h/w error mechanism together as a coherent infrastructure – Consists of 4 separate tables • Boot Error Record Table • Error Record Serialization Table • Error Injection Table • Hardware Error Source Table – Linux3.0 as dom0 save us much effort • Many dom0 APEI reuse • Little maintain effort, benefit from kernel improvement Intel Confidential 7
  • 8. Core Error Recovery • Xen core error recovery – Basically same MCA infrastructure as uncore error recovery – MCE exception ISR • MCE broadcast to all logical processors • Error in range of hypervisor/guest – If in hypervisor • Reset system – Worst case, cannot resume execution – If in guest • Trigger vMCE to affected guest • Trigger vIRQ to dom0 for logging • Error contained in guest – Medium case, error in guest kernel, kill the guest – Best case, error in guest app, kill the app – Code done, need kernel core recovery to do fine-grain test Intel Confidential 8
  • 9. APEI support • Xen APEI support – BERT • BOOT Error Record Table – For unhandled fatal error occurred in a previous boot • Xen BERT – Dom0 own, wait kernel BERT ready – ERST • Error Record Serialization Table – Save/retrieve fatal error to/from persistent storage • Hypervisor ERST: – Save error • Dom0 ERST: – Retrieve/clear error Intel Confidential 9
  • 10. APEI support • Xen APEI support – EINJ • Error Injection table – Mechanism through which OSPM can inject h/w errors • Xen EINJ – Dom0 own – Test done based on current bios available error types – HEST • Hardware Error Source Table – Platform level description of error sources and error notifications • Xen HEST – Dom0 own SCI logic because of acpica – Hypervisor own NMI logic, Xen APEI NMI handler currently not ready – Need bios ready for more error sources and notifications Intel Confidential 10
  • 11. Robust enhancement • Xen RAS robust enhancement – Xen RAS robust challenge • Buggy bios • Some error types not h/w supported yet • Hard to trigger errors and do auto test – Our work to enhance Xen RAS robust • Do some code cleanup & enhancement • Current supported errors were triggered and tested • QA add error-simulator tools and auto test script • EINJ enabling help debug & test greatly • Robust enhancement will continue w/ new platform support more error types Intel Confidential 11
  • 12. Agenda • Xen RAS overview • Xen RAS latest progress – Core error recovery – APEI support – Robust enhancement • Call for co-work Intel Confidential 12
  • 13. Call for co-work • I/O error handling – PCIe AER, Advanced Error Reporting – For device assign to dom0/pv domU • Basically reuse dom0/domU AER logic – For device assign to hvm • Need PCIe AER support at qemu – Some VALinux work on standard qemu – Porting to Xen qemu with AER support Intel Confidential 13