RAS: What is it? Why do we need it?
Harb Abdulhamid (Qualcomm)
Fu Wei (Red Hat)
Yazen Ghannam (AMD)
ENGINEERS AND DEVICES
WORKING TOGETHER
What is it?
● Reliability
○ Computation needs to be correct and reliable.
○ Failures and errors need to be detected and reported.
○ Computation needs to fail when an error is not handled.
● Availability
○ System needs to remain available as long as possible.
○ Errors should be corrected and failures handled so that operation can continue.
● Serviceability
○ System should provide information to the administrator to aid in system servicing.
○ Service time needs to be minimized to maximize uptime.
Why do we need it?
● Increase in system uptime (productivity)
● Less time spent debugging bad or failing hardware (productivity/cost)
● Fewer hardware replacement calls (cost/mindshare)
Hardware Architecture (How do we do it?)
● x86: Machine Check Exceptions (MCE) & Machine Check Architecture (MCA)
○ Architectural features/extensions.
○ Defines a register set that can be used for multiple devices (IMPORTANT!).
○ Poll for correctable errors.
○ APIC LVT or SMI interrupts for correctable thresholding and deferred errors.
○ MCE for uncorrectable errors.
● PCI-E: Advanced Error Reporting (AER)
○ Similar concepts to MCE/MCA.
● Implementation-specific features
○ ECC in memory controllers
○ ECC in I/O RAMs
○ Poison/bad data markers
○ Flooding I/O links (e.g. Sync Flood)
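As a rough illustration of the shared MCA register set, the sketch below decodes the architectural flag bits of an x86 MCi_STATUS value and decides whether the error would be found by polling or raised as an MCE. The function and class names are hypothetical, and the sample value in the usage note is made up for illustration.

```python
from dataclasses import dataclass

# Architectural flag bits in an x86 MCi_STATUS register.
VAL, OVER, UC, EN, MISCV, ADDRV, PCC = 63, 62, 61, 60, 59, 58, 57


@dataclass
class McaStatus:
    valid: bool              # VAL: the bank holds a logged error
    overflow: bool           # OVER: an earlier error was overwritten
    uncorrected: bool        # UC: hardware could not correct the error
    enabled: bool            # EN: error reporting was enabled for this error
    misc_valid: bool         # MISCV: MCi_MISC holds extra information
    addr_valid: bool         # ADDRV: MCi_ADDR holds the failing address
    context_corrupt: bool    # PCC: processor context may be corrupt
    mca_error_code: int      # bits 15:0
    model_specific_code: int  # bits 31:16


def decode_mci_status(raw: int) -> McaStatus:
    """Split a raw 64-bit MCi_STATUS value into its fields."""
    def bit(n: int) -> bool:
        return bool((raw >> n) & 1)

    return McaStatus(
        valid=bit(VAL),
        overflow=bit(OVER),
        uncorrected=bit(UC),
        enabled=bit(EN),
        misc_valid=bit(MISCV),
        addr_valid=bit(ADDRV),
        context_corrupt=bit(PCC),
        mca_error_code=raw & 0xFFFF,
        model_specific_code=(raw >> 16) & 0xFFFF,
    )


def classify(status: McaStatus) -> str:
    """Map a decoded status onto the reporting paths named on the slide."""
    if not status.valid:
        return "no error logged"
    if status.uncorrected:
        return "uncorrected"
    return "corrected"
```

For example, `classify(decode_mci_status((1 << 63) | (1 << 58) | 0x009F))` returns `"corrected"`; such an error would be picked up by polling or a threshold interrupt, while an `"uncorrected"` result corresponds to the MCE path.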
Platform Firmware (How do we do it?)
● Platform Firmware has intimate knowledge of the system and can handle RAS
features not available through standardized mechanisms.
● Privileged code runs on the main cores or a separate microcontroller.
● Can mask registers from OS view and handle interrupts.
● Handling can be done without OS’s knowledge and information can be
exposed to OS if desired.
● Preferably, will use a standard mechanism, like ACPI, to inform the OS of errors.
● Can directly inform sysadmin of errors using sideband communications like a
baseboard management controller (BMC).
● Can pinpoint bad hardware for easy replacement.
Kernel (How do we do it?)
● Error Detection and Correction (EDAC) for system-specific handling and decoding.
● ISA-specific handling in /arch.
● Drivers for PCI-E AER and ACPI.
● Ideally, most RAS code in the Kernel would be obsoleted by Platform Firmware
handling of errors.
● Kernel could then be only responsible for reporting errors received through
standard mechanisms (e.g. ACPI).
● Kernel could also perform error handling relevant at the kernel-level (e.g. killing
processes or retiring bad/poisoned pages).
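As one concrete view of what the kernel reports, Linux EDAC drivers publish per-memory-controller error counters in sysfs. The sketch below sums them up; it assumes the usual `mc*/ce_count` and `mc*/ue_count` layout under `/sys/devices/system/edac/mc`, and the function name is hypothetical.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} from EDAC sysfs counters.

    `root` is a parameter so the function can be pointed at a test directory.
    Controllers whose counters are missing or unreadable are skipped.
    """
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text())
            ue = int((mc / "ue_count").read_text())
        except (OSError, ValueError):
            continue
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    for name, (ce, ue) in read_edac_counts().items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

On a machine without a loaded EDAC driver the directory is empty or absent and the function simply returns an empty dictionary.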
User-space (How do we do it?)
● Mcelog
○ Generally considered obsolete.
○ x86 only.
○ Reads data from /dev/mcelog.
● Rasdaemon
○ More active.
○ Can be updated to handle various platforms.
○ Reads data from Kernel tracepoints.
○ Can effectively obsolete EDAC modules for error decoding.
ACPI (How do we do it?)
● We’ll get into this next...
ACPI APEI BERT
● Scenario: record errors in an emergency (OS crash/reset)
● BERT: Boot Error Record Table
● Mechanism: report unhandled errors that occurred in a previous boot.
○ WHERE are the error records
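The "WHERE" answer for BERT is a single pointer: after the standard 36-byte ACPI table header, the table carries the length and physical address of the Boot Error Region. A minimal parsing sketch, assuming the raw table bytes have already been obtained somehow (the function name and the returned dictionary keys are hypothetical):

```python
import struct

# Standard ACPI table header: signature, length, revision, checksum,
# OEM ID, OEM table ID, OEM revision, creator ID, creator revision.
ACPI_HEADER = struct.Struct("<4sIBB6s8sI4sI")   # 36 bytes
# BERT body: Boot Error Region Length, Boot Error Region (physical address).
BERT_BODY = struct.Struct("<IQ")                # 12 bytes


def parse_bert(table: bytes) -> dict:
    """Validate a raw BERT table and return where the error region lives."""
    minimum = ACPI_HEADER.size + BERT_BODY.size
    if len(table) < minimum:
        raise ValueError("table too short for a BERT")

    signature, length = ACPI_HEADER.unpack_from(table, 0)[:2]
    if signature != b"BERT":
        raise ValueError(f"unexpected signature {signature!r}")
    if length < minimum or length > len(table):
        raise ValueError("inconsistent table length")
    # All bytes of an ACPI table must sum to zero modulo 256.
    if sum(table[:length]) & 0xFF:
        raise ValueError("bad checksum")

    region_length, region_address = BERT_BODY.unpack_from(table, ACPI_HEADER.size)
    return {"region_length": region_length, "region_address": region_address}
```

The region the table points to holds the actual error records left over from the previous boot; reading that memory is outside the scope of this sketch.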
UEFI spec CPER
ACPI APEI BERT
ACPI APEI HEST
● Scenario: record errors at runtime (the OS can still work)
● HEST: Hardware Error Source Table
● Mechanism: a standardized way for platforms to describe their error sources, using Error Source Structures:
○ HOW to inform
○ WHERE are the error records
○ WHEN records can be freed
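For generic hardware error sources, the memory that "WHERE" points to begins with a Generic Error Status Block. The sketch below decodes its 20-byte header, following the field order in the ACPI specification; the function name and dictionary keys are hypothetical, and the error data entries that follow the header are not parsed.

```python
import struct

# Block Status, Raw Data Offset, Raw Data Length, Data Length, Error Severity.
GESB_HEADER = struct.Struct("<IIIII")  # 20 bytes

SEVERITY = {0: "recoverable", 1: "fatal", 2: "corrected", 3: "none"}


def parse_status_block(buf: bytes) -> dict:
    """Decode the header of a Generic Error Status Block."""
    block_status, raw_offset, raw_length, data_length, severity = (
        GESB_HEADER.unpack_from(buf, 0)
    )
    return {
        "uncorrectable_valid": bool(block_status & 0x1),
        "correctable_valid": bool(block_status & 0x2),
        "multiple_uncorrectable": bool(block_status & 0x4),
        "multiple_correctable": bool(block_status & 0x8),
        # Bits 4..13 count the error data entries that follow the header.
        "entry_count": (block_status >> 4) & 0x3FF,
        "raw_data_offset": raw_offset,
        "raw_data_length": raw_length,
        "data_length": data_length,
        "severity": SEVERITY.get(severity, "unknown"),
    }
```

A block status of zero means no error is currently recorded, which is how a polling consumer can tell an empty buffer from a pending record.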
ACPI APEI HEST
● Error Source Structure:
○ For IA-32: MCE/CMC/NMI
○ For PCI: AER Root Port/Endpoint/Bridge
○ Generic Hardware: GHES v1/v2
● For ARM64: GHES v2
○ HOW to inform: Notification Structure
○ WHERE are the error records: Error Status Address (GAS: Generic Address Structure)
○ WHEN records can be freed: Read Ack Register
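Both the Error Status Address and the Read Ack Register are described with a 12-byte Generic Address Structure. A small decoding sketch follows; the lookup tables cover only the commonly used encodings, and the function name is hypothetical.

```python
import struct

# Address Space ID, Register Bit Width, Register Bit Offset, Access Size, Address.
GAS = struct.Struct("<BBBBQ")  # 12 bytes

ADDRESS_SPACES = {
    0x00: "system memory",
    0x01: "system I/O",
    0x02: "PCI configuration space",
    0x03: "embedded controller",
    0x04: "SMBus",
    0x7F: "functional fixed hardware",
}

# Access Size encoding -> access width in bytes (0 means undefined/legacy).
ACCESS_BYTES = {0: None, 1: 1, 2: 2, 3: 4, 4: 8}


def parse_gas(buf: bytes, offset: int = 0) -> dict:
    """Decode one Generic Address Structure found at `offset` in `buf`."""
    space, bit_width, bit_offset, access, address = GAS.unpack_from(buf, offset)
    return {
        "address_space": ADDRESS_SPACES.get(space, f"other (0x{space:02x})"),
        "bit_width": bit_width,
        "bit_offset": bit_offset,
        "access_bytes": ACCESS_BYTES.get(access),
        "address": address,
    }
```

In the GHES v2 flow, the OS would read the status block through the first structure and, once the record has been consumed, write to the register described by the second so firmware knows the buffer can be reused.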
ACPI APEI HEST
ACPI APEI ERST
● Scenario: record and retrieve errors in persistent storage
● ERST: Error Record Serialization Table
● Mechanism: abstracts the operations and provides the details necessary to communicate with on-board persistent storage
● Plan B: use the UEFI runtime variable services to carry out error record persistence operations
ACPI APEI EINJ
● Scenario: test the OSPM error handling stack
● EINJ: Error Injection Table
● Mechanism: abstracts the operations and provides a generic interface through which OSPM can inject hardware errors into the platform without requiring platform-specific software.
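EINJ advertises which errors a platform can inject as a bitmask of error types. The sketch below turns such a mask into readable names, using the error type bits defined in the ACPI specification; the function name and the sample mask in the usage note are hypothetical.

```python
# EINJ error type bits as defined by the ACPI specification.
EINJ_ERROR_TYPES = {
    0x001: "Processor Correctable",
    0x002: "Processor Uncorrectable non-fatal",
    0x004: "Processor Uncorrectable fatal",
    0x008: "Memory Correctable",
    0x010: "Memory Uncorrectable non-fatal",
    0x020: "Memory Uncorrectable fatal",
    0x040: "PCI Express Correctable",
    0x080: "PCI Express Uncorrectable non-fatal",
    0x100: "PCI Express Uncorrectable fatal",
    0x200: "Platform Correctable",
    0x400: "Platform Uncorrectable non-fatal",
    0x800: "Platform Uncorrectable fatal",
}


def decode_error_types(mask: int) -> list:
    """List the injectable error types present in an EINJ capability mask."""
    return [name for bit, name in EINJ_ERROR_TYPES.items() if mask & bit]


if __name__ == "__main__":
    # Made-up mask: a platform that can only inject memory errors.
    for name in decode_error_types(0x038):
        print(name)
```

A test harness would typically check this list first and then request only the error types the platform actually supports.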
RAS on ARM64
● Architectural support for RAS is not available, but it is also not needed.
● In other words, no need to follow the same historical path as other
architectures.
● Focus should be on Platform Firmware handling of errors.
● Reporting should be through standard methods like ACPI.
● Will possibly need to implement kernel-relevant error handling based on
information received from Platform Firmware.
Current Work
● Add support for ACPI RAS features.
● Testing Platform Firmware to OS interface.
● No platform-specific RAS feature testing.
● Using modified QEMU for testing.
Future Work
● Finish ACPI implementation.
● Investigate kernel handling of poisoned pages and processes.
● Investigate I/O-related error handling in the Kernel.
Demo
Thank You
#LAS16
For further information: www.linaro.org
LAS16 keynotes and videos on: connect.linaro.org

Contenu connexe

Tendances

XPDDS17: Reworking the ARM GIC Emulation & Xen Challenges in the ARM ITS Emu...
XPDDS17:  Reworking the ARM GIC Emulation & Xen Challenges in the ARM ITS Emu...XPDDS17:  Reworking the ARM GIC Emulation & Xen Challenges in the ARM ITS Emu...
XPDDS17: Reworking the ARM GIC Emulation & Xen Challenges in the ARM ITS Emu...The Linux Foundation
 
LCU13: An Introduction to ARM Trusted Firmware
LCU13: An Introduction to ARM Trusted FirmwareLCU13: An Introduction to ARM Trusted Firmware
LCU13: An Introduction to ARM Trusted FirmwareLinaro
 
SFO15-TR9: PSCI, ACPI (and UEFI to boot)
SFO15-TR9: PSCI, ACPI (and UEFI to boot)SFO15-TR9: PSCI, ACPI (and UEFI to boot)
SFO15-TR9: PSCI, ACPI (and UEFI to boot)Linaro
 
LCU14 500 ARM Trusted Firmware
LCU14 500 ARM Trusted FirmwareLCU14 500 ARM Trusted Firmware
LCU14 500 ARM Trusted FirmwareLinaro
 
Chips alliance omni xtend overview
Chips alliance omni xtend overviewChips alliance omni xtend overview
Chips alliance omni xtend overviewRISC-V International
 
BUD17-400: Secure Data Path with OPTEE
BUD17-400: Secure Data Path with OPTEE BUD17-400: Secure Data Path with OPTEE
BUD17-400: Secure Data Path with OPTEE Linaro
 
Kernel Recipes 2018 - Overview of SD/eMMC, their high speed modes and Linux s...
Kernel Recipes 2018 - Overview of SD/eMMC, their high speed modes and Linux s...Kernel Recipes 2018 - Overview of SD/eMMC, their high speed modes and Linux s...
Kernel Recipes 2018 - Overview of SD/eMMC, their high speed modes and Linux s...Anne Nicolas
 
BUD17-416: Benchmark and profiling in OP-TEE
BUD17-416: Benchmark and profiling in OP-TEE BUD17-416: Benchmark and profiling in OP-TEE
BUD17-416: Benchmark and profiling in OP-TEE Linaro
 
LCA13: Power State Coordination Interface
LCA13: Power State Coordination InterfaceLCA13: Power State Coordination Interface
LCA13: Power State Coordination InterfaceLinaro
 
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted FirmwareHKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted FirmwareLinaro
 
Linux on ARM 64-bit Architecture
Linux on ARM 64-bit ArchitectureLinux on ARM 64-bit Architecture
Linux on ARM 64-bit ArchitectureRyo Jin
 
Architecture Exploration of RISC-V Processor and Comparison with ARM Cortex-A53
Architecture Exploration of RISC-V Processor and Comparison with ARM Cortex-A53Architecture Exploration of RISC-V Processor and Comparison with ARM Cortex-A53
Architecture Exploration of RISC-V Processor and Comparison with ARM Cortex-A53KarthiSugumar
 
Student guide power systems for aix - virtualization i implementing virtual...
Student guide   power systems for aix - virtualization i implementing virtual...Student guide   power systems for aix - virtualization i implementing virtual...
Student guide power systems for aix - virtualization i implementing virtual...solarisyougood
 
Prerequisite knowledge for shared memory concurrency
Prerequisite knowledge for shared memory concurrencyPrerequisite knowledge for shared memory concurrency
Prerequisite knowledge for shared memory concurrencyViller Hsiao
 
Shared Memory Centric Computing with CXL & OMI
Shared Memory Centric Computing with CXL & OMIShared Memory Centric Computing with CXL & OMI
Shared Memory Centric Computing with CXL & OMIAllan Cantle
 
cs8493 - operating systems unit 1
cs8493 - operating systems unit 1cs8493 - operating systems unit 1
cs8493 - operating systems unit 1SIMONTHOMAS S
 
LCA14: LCA14-306: CPUidle & CPUfreq integration with scheduler
LCA14: LCA14-306: CPUidle & CPUfreq integration with schedulerLCA14: LCA14-306: CPUidle & CPUfreq integration with scheduler
LCA14: LCA14-306: CPUidle & CPUfreq integration with schedulerLinaro
 
Trusted firmware deep_dive_v1.0_
Trusted firmware deep_dive_v1.0_Trusted firmware deep_dive_v1.0_
Trusted firmware deep_dive_v1.0_Linaro
 

Tendances (20)

Qemu Pcie
Qemu PcieQemu Pcie
Qemu Pcie
 
XPDDS17: Reworking the ARM GIC Emulation & Xen Challenges in the ARM ITS Emu...
XPDDS17:  Reworking the ARM GIC Emulation & Xen Challenges in the ARM ITS Emu...XPDDS17:  Reworking the ARM GIC Emulation & Xen Challenges in the ARM ITS Emu...
XPDDS17: Reworking the ARM GIC Emulation & Xen Challenges in the ARM ITS Emu...
 
LCU13: An Introduction to ARM Trusted Firmware
LCU13: An Introduction to ARM Trusted FirmwareLCU13: An Introduction to ARM Trusted Firmware
LCU13: An Introduction to ARM Trusted Firmware
 
SFO15-TR9: PSCI, ACPI (and UEFI to boot)
SFO15-TR9: PSCI, ACPI (and UEFI to boot)SFO15-TR9: PSCI, ACPI (and UEFI to boot)
SFO15-TR9: PSCI, ACPI (and UEFI to boot)
 
LCU14 500 ARM Trusted Firmware
LCU14 500 ARM Trusted FirmwareLCU14 500 ARM Trusted Firmware
LCU14 500 ARM Trusted Firmware
 
Chips alliance omni xtend overview
Chips alliance omni xtend overviewChips alliance omni xtend overview
Chips alliance omni xtend overview
 
BUD17-400: Secure Data Path with OPTEE
BUD17-400: Secure Data Path with OPTEE BUD17-400: Secure Data Path with OPTEE
BUD17-400: Secure Data Path with OPTEE
 
ARM AAE - Architecture
ARM AAE - ArchitectureARM AAE - Architecture
ARM AAE - Architecture
 
Kernel Recipes 2018 - Overview of SD/eMMC, their high speed modes and Linux s...
Kernel Recipes 2018 - Overview of SD/eMMC, their high speed modes and Linux s...Kernel Recipes 2018 - Overview of SD/eMMC, their high speed modes and Linux s...
Kernel Recipes 2018 - Overview of SD/eMMC, their high speed modes and Linux s...
 
BUD17-416: Benchmark and profiling in OP-TEE
BUD17-416: Benchmark and profiling in OP-TEE BUD17-416: Benchmark and profiling in OP-TEE
BUD17-416: Benchmark and profiling in OP-TEE
 
LCA13: Power State Coordination Interface
LCA13: Power State Coordination InterfaceLCA13: Power State Coordination Interface
LCA13: Power State Coordination Interface
 
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted FirmwareHKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
 
Linux on ARM 64-bit Architecture
Linux on ARM 64-bit ArchitectureLinux on ARM 64-bit Architecture
Linux on ARM 64-bit Architecture
 
Architecture Exploration of RISC-V Processor and Comparison with ARM Cortex-A53
Architecture Exploration of RISC-V Processor and Comparison with ARM Cortex-A53Architecture Exploration of RISC-V Processor and Comparison with ARM Cortex-A53
Architecture Exploration of RISC-V Processor and Comparison with ARM Cortex-A53
 
Student guide power systems for aix - virtualization i implementing virtual...
Student guide   power systems for aix - virtualization i implementing virtual...Student guide   power systems for aix - virtualization i implementing virtual...
Student guide power systems for aix - virtualization i implementing virtual...
 
Prerequisite knowledge for shared memory concurrency
Prerequisite knowledge for shared memory concurrencyPrerequisite knowledge for shared memory concurrency
Prerequisite knowledge for shared memory concurrency
 
Shared Memory Centric Computing with CXL & OMI
Shared Memory Centric Computing with CXL & OMIShared Memory Centric Computing with CXL & OMI
Shared Memory Centric Computing with CXL & OMI
 
cs8493 - operating systems unit 1
cs8493 - operating systems unit 1cs8493 - operating systems unit 1
cs8493 - operating systems unit 1
 
LCA14: LCA14-306: CPUidle & CPUfreq integration with scheduler
LCA14: LCA14-306: CPUidle & CPUfreq integration with schedulerLCA14: LCA14-306: CPUidle & CPUfreq integration with scheduler
LCA14: LCA14-306: CPUidle & CPUfreq integration with scheduler
 
Trusted firmware deep_dive_v1.0_
Trusted firmware deep_dive_v1.0_Trusted firmware deep_dive_v1.0_
Trusted firmware deep_dive_v1.0_
 

En vedette

Comp tia flashcards set 1 (15 cards) acpi cmos
Comp tia flashcards set 1 (15 cards) acpi   cmosComp tia flashcards set 1 (15 cards) acpi   cmos
Comp tia flashcards set 1 (15 cards) acpi cmosSue Long Smith
 
LCU13: ACPI power state mapping
LCU13: ACPI power state mappingLCU13: ACPI power state mapping
LCU13: ACPI power state mappingLinaro
 
The e820 trap of Linux kernel hibernation
The e820 trap of Linux kernel hibernationThe e820 trap of Linux kernel hibernation
The e820 trap of Linux kernel hibernationjoeylikernel
 
Extracting Linux kernel feature model changes with FMDiff
Extracting Linux kernel feature model changes with FMDiff Extracting Linux kernel feature model changes with FMDiff
Extracting Linux kernel feature model changes with FMDiff NicoDintzner
 
ODP IPsec lookaside API Demo
ODP IPsec lookaside API DemoODP IPsec lookaside API Demo
ODP IPsec lookaside API DemoLinaro
 
LAS16-400: Mini Conference 3 AOSP (Session 1)
LAS16-400: Mini Conference 3 AOSP (Session 1)LAS16-400: Mini Conference 3 AOSP (Session 1)
LAS16-400: Mini Conference 3 AOSP (Session 1)Linaro
 
BUD17-DF10 - Android with OPTEE/SVP and Widevine
BUD17-DF10 - Android with OPTEE/SVP and WidevineBUD17-DF10 - Android with OPTEE/SVP and Widevine
BUD17-DF10 - Android with OPTEE/SVP and WidevineLinaro
 
Kernel Recipes 2015: Representing device-tree peripherals in ACPI
Kernel Recipes 2015: Representing device-tree peripherals in ACPIKernel Recipes 2015: Representing device-tree peripherals in ACPI
Kernel Recipes 2015: Representing device-tree peripherals in ACPIAnne Nicolas
 
Q2.12: Power Management Across OSs
Q2.12: Power Management Across OSsQ2.12: Power Management Across OSs
Q2.12: Power Management Across OSsLinaro
 
Next event prediction
Next event predictionNext event prediction
Next event predictionLinaro
 
Note - (EDK2) Acpi Tables Compile and Install
Note - (EDK2) Acpi Tables Compile and InstallNote - (EDK2) Acpi Tables Compile and Install
Note - (EDK2) Acpi Tables Compile and Installboyw165
 
BIOS, Linux and Firmware Test Suite in-between
BIOS, Linux and  Firmware Test Suite in-betweenBIOS, Linux and  Firmware Test Suite in-between
BIOS, Linux and Firmware Test Suite in-betweenAlex Hung
 
DB410c: Face tracking and motor control
DB410c: Face tracking and motor controlDB410c: Face tracking and motor control
DB410c: Face tracking and motor controlLinaro
 
http server on user-level mTCP stack accelerated by DPDK
http server on user-level mTCP stack accelerated by DPDKhttp server on user-level mTCP stack accelerated by DPDK
http server on user-level mTCP stack accelerated by DPDKLinaro
 
ST 96Boards Demo
ST 96Boards DemoST 96Boards Demo
ST 96Boards DemoLinaro
 
Archermind demo for MTK X20 Pro and Mstar TV 96Boards
Archermind demo for MTK X20 Pro and Mstar TV 96BoardsArchermind demo for MTK X20 Pro and Mstar TV 96Boards
Archermind demo for MTK X20 Pro and Mstar TV 96BoardsLinaro
 
MEAN-stack based sensor gateway
MEAN-stack based sensor gatewayMEAN-stack based sensor gateway
MEAN-stack based sensor gatewayLinaro
 
Socionext ARMv8 server SoC chipset demo
Socionext ARMv8 server SoC chipset demoSocionext ARMv8 server SoC chipset demo
Socionext ARMv8 server SoC chipset demoLinaro
 

En vedette (20)

70 271 Stu Chap07
70 271 Stu Chap0770 271 Stu Chap07
70 271 Stu Chap07
 
Comp tia flashcards set 1 (15 cards) acpi cmos
Comp tia flashcards set 1 (15 cards) acpi   cmosComp tia flashcards set 1 (15 cards) acpi   cmos
Comp tia flashcards set 1 (15 cards) acpi cmos
 
LCU13: ACPI power state mapping
LCU13: ACPI power state mappingLCU13: ACPI power state mapping
LCU13: ACPI power state mapping
 
The e820 trap of Linux kernel hibernation
The e820 trap of Linux kernel hibernationThe e820 trap of Linux kernel hibernation
The e820 trap of Linux kernel hibernation
 
Extracting Linux kernel feature model changes with FMDiff
Extracting Linux kernel feature model changes with FMDiff Extracting Linux kernel feature model changes with FMDiff
Extracting Linux kernel feature model changes with FMDiff
 
Status update-qemu-pcie
Status update-qemu-pcieStatus update-qemu-pcie
Status update-qemu-pcie
 
ODP IPsec lookaside API Demo
ODP IPsec lookaside API DemoODP IPsec lookaside API Demo
ODP IPsec lookaside API Demo
 
LAS16-400: Mini Conference 3 AOSP (Session 1)
LAS16-400: Mini Conference 3 AOSP (Session 1)LAS16-400: Mini Conference 3 AOSP (Session 1)
LAS16-400: Mini Conference 3 AOSP (Session 1)
 
BUD17-DF10 - Android with OPTEE/SVP and Widevine
BUD17-DF10 - Android with OPTEE/SVP and WidevineBUD17-DF10 - Android with OPTEE/SVP and Widevine
BUD17-DF10 - Android with OPTEE/SVP and Widevine
 
Kernel Recipes 2015: Representing device-tree peripherals in ACPI
Kernel Recipes 2015: Representing device-tree peripherals in ACPIKernel Recipes 2015: Representing device-tree peripherals in ACPI
Kernel Recipes 2015: Representing device-tree peripherals in ACPI
 
Q2.12: Power Management Across OSs
Q2.12: Power Management Across OSsQ2.12: Power Management Across OSs
Q2.12: Power Management Across OSs
 
Next event prediction
Next event predictionNext event prediction
Next event prediction
 
Note - (EDK2) Acpi Tables Compile and Install
Note - (EDK2) Acpi Tables Compile and InstallNote - (EDK2) Acpi Tables Compile and Install
Note - (EDK2) Acpi Tables Compile and Install
 
BIOS, Linux and Firmware Test Suite in-between
BIOS, Linux and  Firmware Test Suite in-betweenBIOS, Linux and  Firmware Test Suite in-between
BIOS, Linux and Firmware Test Suite in-between
 
DB410c: Face tracking and motor control
DB410c: Face tracking and motor controlDB410c: Face tracking and motor control
DB410c: Face tracking and motor control
 
http server on user-level mTCP stack accelerated by DPDK
http server on user-level mTCP stack accelerated by DPDKhttp server on user-level mTCP stack accelerated by DPDK
http server on user-level mTCP stack accelerated by DPDK
 
ST 96Boards Demo
ST 96Boards DemoST 96Boards Demo
ST 96Boards Demo
 
Archermind demo for MTK X20 Pro and Mstar TV 96Boards
Archermind demo for MTK X20 Pro and Mstar TV 96BoardsArchermind demo for MTK X20 Pro and Mstar TV 96Boards
Archermind demo for MTK X20 Pro and Mstar TV 96Boards
 
MEAN-stack based sensor gateway
MEAN-stack based sensor gatewayMEAN-stack based sensor gateway
MEAN-stack based sensor gateway
 
Socionext ARMv8 server SoC chipset demo
Socionext ARMv8 server SoC chipset demoSocionext ARMv8 server SoC chipset demo
Socionext ARMv8 server SoC chipset demo
 

Similaire à Las16 200 - firmware summit - ras what is it- why do we need it

HKG18-116 - RAS Solutions for Arm64 Servers
HKG18-116 - RAS Solutions for Arm64 ServersHKG18-116 - RAS Solutions for Arm64 Servers
HKG18-116 - RAS Solutions for Arm64 ServersLinaro
 
Reliability, Availability and Serviceability on Linux
Reliability, Availability and Serviceability on LinuxReliability, Availability and Serviceability on Linux
Reliability, Availability and Serviceability on LinuxSamsung Open Source Group
 
AMulti-coreSoftwareHardwareCo-DebugPlatform_Final
AMulti-coreSoftwareHardwareCo-DebugPlatform_FinalAMulti-coreSoftwareHardwareCo-DebugPlatform_Final
AMulti-coreSoftwareHardwareCo-DebugPlatform_FinalAlan Su
 
Basics of Computer! BATRA COMPUTER CENTRE IN AMBALA
Basics of Computer! BATRA COMPUTER CENTRE IN AMBALABasics of Computer! BATRA COMPUTER CENTRE IN AMBALA
Basics of Computer! BATRA COMPUTER CENTRE IN AMBALAjatin batra
 
Spike yuan server ras and uefi cper final
Spike yuan  server ras and uefi cper finalSpike yuan  server ras and uefi cper final
Spike yuan server ras and uefi cper finalparth bera
 
Instruction Set Architecture
Instruction Set ArchitectureInstruction Set Architecture
Instruction Set ArchitectureJaffer Haadi
 
Developing a Windows CE OAL.ppt
Developing a Windows CE OAL.pptDeveloping a Windows CE OAL.ppt
Developing a Windows CE OAL.pptKundanSingh887495
 
XPDDS17: Keynote: Shared Coprocessor Framework on ARM - Oleksandr Andrushchen...
XPDDS17: Keynote: Shared Coprocessor Framework on ARM - Oleksandr Andrushchen...XPDDS17: Keynote: Shared Coprocessor Framework on ARM - Oleksandr Andrushchen...
XPDDS17: Keynote: Shared Coprocessor Framework on ARM - Oleksandr Andrushchen...The Linux Foundation
 
Optimizing Python
Optimizing PythonOptimizing Python
Optimizing PythonAdimianBE
 
Understanding and Improving Device Access Complexity
Understanding and Improving Device Access ComplexityUnderstanding and Improving Device Access Complexity
Understanding and Improving Device Access Complexityasimkadav
 
IO and file systems
IO and file systems IO and file systems
IO and file systems EktaVaswani2
 
Uni Processor Architecture
Uni Processor ArchitectureUni Processor Architecture
Uni Processor ArchitectureAshish KC
 
AVR_Course_Day4 introduction to microcontroller
AVR_Course_Day4 introduction to microcontrollerAVR_Course_Day4 introduction to microcontroller
AVR_Course_Day4 introduction to microcontrollerMohamed Ali
 
ARM® Cortex™ M Bootup_CMSIS_Part_3_3_Debug_Architecture
ARM® Cortex™ M Bootup_CMSIS_Part_3_3_Debug_ArchitectureARM® Cortex™ M Bootup_CMSIS_Part_3_3_Debug_Architecture
ARM® Cortex™ M Bootup_CMSIS_Part_3_3_Debug_ArchitectureRaahul Raghavan
 
Ch1 it1 - v4.0 - 87.8%
Ch1   it1 - v4.0 - 87.8%Ch1   it1 - v4.0 - 87.8%
Ch1 it1 - v4.0 - 87.8%chikoecko
 

Similaire à Las16 200 - firmware summit - ras what is it- why do we need it (20)

HKG18-116 - RAS Solutions for Arm64 Servers
HKG18-116 - RAS Solutions for Arm64 ServersHKG18-116 - RAS Solutions for Arm64 Servers
HKG18-116 - RAS Solutions for Arm64 Servers
 
Reliability, Availability and Serviceability on Linux
Reliability, Availability and Serviceability on LinuxReliability, Availability and Serviceability on Linux
Reliability, Availability and Serviceability on Linux
 
AMulti-coreSoftwareHardwareCo-DebugPlatform_Final
AMulti-coreSoftwareHardwareCo-DebugPlatform_FinalAMulti-coreSoftwareHardwareCo-DebugPlatform_Final
AMulti-coreSoftwareHardwareCo-DebugPlatform_Final
 
Basics of Computer! BATRA COMPUTER CENTRE IN AMBALA
Basics of Computer! BATRA COMPUTER CENTRE IN AMBALABasics of Computer! BATRA COMPUTER CENTRE IN AMBALA
Basics of Computer! BATRA COMPUTER CENTRE IN AMBALA
 
CS6401 Operating Systems
CS6401 Operating SystemsCS6401 Operating Systems
CS6401 Operating Systems
 
Spike yuan server ras and uefi cper final
Spike yuan  server ras and uefi cper finalSpike yuan  server ras and uefi cper final
Spike yuan server ras and uefi cper final
 
Instruction Set Architecture
Instruction Set ArchitectureInstruction Set Architecture
Instruction Set Architecture
 
Developing a Windows CE OAL.ppt
Developing a Windows CE OAL.pptDeveloping a Windows CE OAL.ppt
Developing a Windows CE OAL.ppt
 
XPDDS17: Keynote: Shared Coprocessor Framework on ARM - Oleksandr Andrushchen...
XPDDS17: Keynote: Shared Coprocessor Framework on ARM - Oleksandr Andrushchen...XPDDS17: Keynote: Shared Coprocessor Framework on ARM - Oleksandr Andrushchen...
XPDDS17: Keynote: Shared Coprocessor Framework on ARM - Oleksandr Andrushchen...
 
CPU Architecture
CPU ArchitectureCPU Architecture
CPU Architecture
 
Optimizing Python
Optimizing PythonOptimizing Python
Optimizing Python
 
Understanding and Improving Device Access Complexity
Understanding and Improving Device Access ComplexityUnderstanding and Improving Device Access Complexity
Understanding and Improving Device Access Complexity
 
IO and file systems
IO and file systems IO and file systems
IO and file systems
 
Uni Processor Architecture
Uni Processor ArchitectureUni Processor Architecture
Uni Processor Architecture
 
Io systems final
Io systems finalIo systems final
Io systems final
 
Cpu
CpuCpu
Cpu
 
AVR_Course_Day4 introduction to microcontroller
AVR_Course_Day4 introduction to microcontrollerAVR_Course_Day4 introduction to microcontroller
AVR_Course_Day4 introduction to microcontroller
 
Assignment
AssignmentAssignment
Assignment
 
ARM® Cortex™ M Bootup_CMSIS_Part_3_3_Debug_Architecture
ARM® Cortex™ M Bootup_CMSIS_Part_3_3_Debug_ArchitectureARM® Cortex™ M Bootup_CMSIS_Part_3_3_Debug_Architecture
ARM® Cortex™ M Bootup_CMSIS_Part_3_3_Debug_Architecture
 
Ch1 it1 - v4.0 - 87.8%
Ch1   it1 - v4.0 - 87.8%Ch1   it1 - v4.0 - 87.8%
Ch1 it1 - v4.0 - 87.8%
 

Plus de Linaro

Deep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloDeep Learning Neural Network Acceleration at the Edge - Andrea Gallo
Deep Learning Neural Network Acceleration at the Edge - Andrea GalloLinaro
 
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaArm Architecture HPC Workshop Santa Clara 2018 - Kanta Vekaria
Arm Architecture HPC Workshop Santa Clara 2018 - Kanta VekariaLinaro
 
Huawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraHuawei’s requirements for the ARM based HPC solution readiness - Joshua Mora
Huawei’s requirements for the ARM based HPC solution readiness - Joshua MoraLinaro
 
Bud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaBud17 113: distribution ci using qemu and open qa
Bud17 113: distribution ci using qemu and open qaLinaro
 
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018
OpenHPC Automation with Ansible - Renato Golin - Linaro Arm HPC Workshop 2018Linaro
 
HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018HPC network stack on ARM - Linaro HPC Workshop 2018
HPC network stack on ARM - Linaro HPC Workshop 2018Linaro
 
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...
It just keeps getting better - SUSE enablement for Arm - Linaro HPC Workshop ...Linaro
 
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...
Intelligent Interconnect Architecture to Enable Next Generation HPC - Linaro ...Linaro
 
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...
Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Sant...Linaro
 
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...
Andrew J Younge - Vanguard Astra - Petascale Arm Platform for U.S. DOE/ASC Su...Linaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineLinaro
 
HKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteHKG18-100K1 - George Grey: Opening Keynote
HKG18-100K1 - George Grey: Opening KeynoteLinaro
 
HKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopHKG18-318 - OpenAMP Workshop
HKG18-318 - OpenAMP WorkshopLinaro
 
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineHKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainline
HKG18-501 - EAS on Common Kernel 4.14 and getting (much) closer to mainlineLinaro
 
HKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allHKG18-315 - Why the ecosystem is a wonderful thing, warts and all
HKG18-315 - Why the ecosystem is a wonderful thing, warts and allLinaro
 
HKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorHKG18- 115 - Partitioning ARM Systems with the Jailhouse Hypervisor
HKG18- 115 - Partitioning ARM Systems with the Jailhouse HypervisorLinaro
 
HKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMUHKG18-TR08 - Upstreaming SVE in QEMU
HKG18-TR08 - Upstreaming SVE in QEMULinaro
 
HKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MHKG18-113- Secure Data Path work with i.MX8M
HKG18-113- Secure Data Path work with i.MX8MLinaro
 
HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation HKG18-120 - Devicetree Schema Documentation and Validation
HKG18-120 - Devicetree Schema Documentation and Validation Linaro
 
HKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootHKG18-223 - Trusted FirmwareM: Trusted boot
HKG18-223 - Trusted FirmwareM: Trusted bootLinaro
 

LAS16-200 - Firmware Summit - RAS: What is it? Why do we need it?

  • 1. RAS: What is it? Why do we need it?
    Harb Abdulhamid (Qualcomm), Fu Wei (Red Hat), Yazen Ghannam (AMD)
  • 2. What is it?
    ● Reliability
      ○ Computation needs to be correct and reliable.
      ○ Failures and errors need to be detected and reported.
      ○ Computation needs to fail when an error is not handled.
    ● Availability
      ○ System needs to remain available as long as possible.
      ○ Errors should be corrected and failures handled so that operation can continue.
    ● Serviceability
      ○ System should provide information to the administrator to aid in system servicing.
      ○ Service time needs to be minimized to maximize uptime.
  • 3. Why do we need it?
    ● Increase in system uptime (productivity)
    ● Less time spent debugging bad or failing hardware (productivity/cost)
    ● Fewer hardware replacement calls (cost/mindshare)
  • 4. Hardware Architecture (How do we do it?)
    ● x86: Machine Check Exceptions (MCE) & Machine Check Architecture (MCA)
      ○ Architectural features/extensions.
      ○ Defines a register set that can be used for multiple devices (IMPORTANT!).
      ○ Poll for correctable errors.
      ○ APIC LVT or SMI interrupts for correctable thresholding and deferred errors.
      ○ MCE for uncorrectable errors.
    ● PCI-E: Advanced Error Reporting (AER)
      ○ Similar concepts to MCE/MCA.
    ● Implementation-specific features
      ○ ECC in memory controllers
      ○ ECC in I/O RAMs
      ○ Poison/bad data markers
      ○ Flooding I/O links (e.g. Sync Flood)
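
To make the shared MCA register set a little more concrete, here is a minimal sketch in Python that decodes the architectural flag bits of an x86 MCi_STATUS value. The bit positions (VAL=63, OVER=62, UC=61, EN=60, MISCV=59, ADDRV=58, PCC=57) follow the layout documented in the Intel SDM; the function name, the dictionary keys and the sample value are illustrative assumptions, not part of the slides, and vendor-specific fields are ignored.

```python
# Architectural flag bits of an x86 MCi_STATUS register (bit positions).
MCI_STATUS_FLAGS = {
    "valid": 63,            # VAL: the register holds a logged error
    "overflow": 62,         # OVER: a previous error was overwritten
    "uncorrected": 61,      # UC: hardware could not correct the error
    "enabled": 60,          # EN: reporting was enabled for this error
    "misc_valid": 59,       # MISCV: MCi_MISC holds additional information
    "addr_valid": 58,       # ADDRV: MCi_ADDR holds the error address
    "context_corrupt": 57,  # PCC: processor context may be corrupt
}


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit MCi_STATUS value into flags and error codes."""
    fields = {name: bool((status >> bit) & 1)
              for name, bit in MCI_STATUS_FLAGS.items()}
    fields["mca_error_code"] = status & 0xFFFF             # bits 15:0
    fields["model_specific_code"] = (status >> 16) & 0xFFFF  # bits 31:16
    return fields


if __name__ == "__main__":
    # Hypothetical value: a valid, corrected error with an address logged.
    sample = (1 << 63) | (1 << 58) | 0x009F
    for name, value in decode_mci_status(sample).items():
        print(f"{name}: {value}")
```

A polling loop for correctable errors would read each bank's status register, call a decoder like this one, and clear the register once the record has been logged; an MCE handler would start from the same fields but treat the UC and PCC flags as reasons to stop the affected computation.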
  • 5. Platform Firmware (How do we do it?)
    ● Platform Firmware has intimate knowledge of the system and can handle RAS features not available through standardized mechanisms.
    ● Privileged code runs on the main cores or a separate microcontroller.
    ● Can mask registers from OS view and handle interrupts.
    ● Handling can be done without the OS's knowledge, and information can be exposed to the OS if desired.
    ● Preferably, will use a standard mechanism, like ACPI, to inform the OS of errors.
    ● Can directly inform the sysadmin of errors using sideband communications such as a baseboard management controller (BMC).
    ● Can pinpoint bad hardware for easy replacement.
  • 6. Kernel (How do we do it?)
    ● Error Detection and Correction (EDAC) for system-specific handling and decoding.
    ● ISA-specific handling in /arch.
    ● Drivers for PCI-E AER and ACPI.
    ● Ideally, most RAS code in the Kernel would be obsoleted by Platform Firmware handling of errors.
    ● Kernel could then be responsible only for reporting errors received through standard mechanisms (e.g. ACPI).
    ● Kernel could also perform error handling relevant at the kernel level (e.g. killing processes or retiring bad/poisoned pages).
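
As an illustration of what the EDAC reporting path looks like from the outside, the sketch below reads the per-memory-controller error counters that Linux EDAC drivers expose in sysfs (`ce_count` and `ue_count` under `/sys/devices/system/edac/mc/mcN`). The function name and the shape of the returned dictionary are assumptions made for this example; on a machine with no EDAC driver loaded, the directory is simply absent and the result is empty.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"corrected": n, "uncorrected": m}} from EDAC sysfs."""
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts  # no EDAC driver loaded, or not a Linux system
    for mc in sorted(base.glob("mc[0-9]*")):
        try:
            corrected = int((mc / "ce_count").read_text().strip())
            uncorrected = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers whose counters cannot be read
        counts[mc.name] = {"corrected": corrected, "uncorrected": uncorrected}
    return counts


if __name__ == "__main__":
    for name, c in read_edac_counts().items():
        print(f"{name}: corrected={c['corrected']} uncorrected={c['uncorrected']}")
```

The `root` parameter exists only so the function can be pointed at a test directory; a monitoring script would normally use the default path.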
  • 7. User-space (How do we do it?)
    ● mcelog
      ○ Generally considered obsolete.
      ○ x86 only.
      ○ Reads data from /dev/mcelog.
    ● rasdaemon
      ○ More active.
      ○ Can be updated to handle various platforms.
      ○ Reads data from Kernel tracepoints.
      ○ Can effectively obsolete EDAC modules for error decoding.
  • 8. ACPI (How do we do it?)
    ● We’ll get into this next...
  • 9. ACPI APEI BERT
    ● Scenario: record errors in an emergency (OS crash/reset)
    ● BERT: Boot Error Record Table
    ● Mechanism: report unhandled errors that occurred in a previous boot.
      ○ WHERE the error records are
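
The "WHERE" part of BERT can be sketched as follows. This is a minimal Python example, not the authors' code: it assumes the table layout from the ACPI specification (the standard 36-byte table header followed by a 4-byte Boot Error Region Length and an 8-byte Boot Error Region address) and checks the standard ACPI checksum. The function name and the dictionary keys are made up for illustration.

```python
import struct

# Standard ACPI table header: signature, length, revision, checksum,
# OEM ID, OEM table ID, OEM revision, creator ID, creator revision.
ACPI_HEADER = struct.Struct("<4sIBB6s8sI4sI")  # 36 bytes, little-endian
BERT_BODY = struct.Struct("<IQ")               # region length, region address


def parse_bert(table: bytes) -> dict:
    """Extract the boot error region location from a raw BERT table."""
    if len(table) < ACPI_HEADER.size + BERT_BODY.size:
        raise ValueError("table too short to be a BERT")
    signature, length, revision = ACPI_HEADER.unpack_from(table)[:3]
    if signature != b"BERT":
        raise ValueError(f"unexpected signature {signature!r}")
    if length > len(table):
        raise ValueError("header length exceeds the data provided")
    # All bytes of a valid ACPI table sum to zero modulo 256.
    if sum(table[:length]) & 0xFF != 0:
        raise ValueError("bad table checksum")
    region_length, region_address = BERT_BODY.unpack_from(table, ACPI_HEADER.size)
    return {
        "revision": revision,
        "region_length": region_length,    # size of the boot error region
        "region_address": region_address,  # physical address of the region
    }
```

On Linux systems that expose ACPI tables through sysfs, the raw bytes could be read (as root) from `/sys/firmware/acpi/tables/BERT` when the platform provides that table. The region the table points to holds the error records themselves; this sketch only locates it.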
  • 10. UEFI spec CPER
  • 11. ACPI APEI BERT
  • 12. ACPI APEI HEST
    ● Scenario: record errors at runtime (the OS can still work)
    ● HEST: Hardware Error Source Table
    ● Mechanism: a standardized way for platforms to describe their error sources, each through an Error Source Structure:
      ○ HOW to inform
      ○ WHERE the error records are
      ○ WHEN records can be freed
  • 13. ACPI APEI HEST
    ● Error Source Structure:
      ○ For IA-32: MCE/CMC/NMI
      ○ For PCI: AER Root Port/Endpoint/Bridge
      ○ Generic Hardware: GHES v1/v2
    ● For ARM64: GHES v2
      ○ HOW to inform: Notification Structure
      ○ WHERE the error records are: Error Status Address (GAS: Generic Address Structure)
      ○ WHEN records can be freed: Read Ack Register
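
The HOW/WHERE/WHEN split of a GHES v2 error source can be modelled in a few lines. The sketch below is a simplified, assumption-laden model rather than a real driver: the class and field names are hypothetical, the Read Ack Register is simulated as a plain integer instead of a memory-mapped register, and the bit offset of the Generic Address Structure is ignored. The acknowledge step follows the GHES v2 rule of keeping the bits selected by the Read Ack Preserve mask and setting the bits given by Read Ack Write.

```python
from dataclasses import dataclass


@dataclass
class GhesV2Source:
    """Simplified model of one GHES v2 error source (names are illustrative)."""
    notification: str           # HOW: e.g. "polled" or an interrupt type
    error_status_address: int   # WHERE: address of the error status block
    read_ack_register: int      # WHEN: simulated current register value
    read_ack_preserve: int      # mask of bits the OS must leave unchanged
    read_ack_write: int         # bits the OS sets to acknowledge


def acknowledge(source: GhesV2Source) -> int:
    """Tell firmware that the error record has been consumed and can be freed."""
    preserved = source.read_ack_register & source.read_ack_preserve
    source.read_ack_register = preserved | source.read_ack_write
    return source.read_ack_register


if __name__ == "__main__":
    src = GhesV2Source(
        notification="polled",
        error_status_address=0x8000_0000,  # hypothetical address
        read_ack_register=0xF0,
        read_ack_preserve=0xFFFF_FFFE,
        read_ack_write=0x1,
    )
    print(hex(acknowledge(src)))
```

The acknowledgement is what distinguishes GHES v2 from v1 in this model: until the OS writes it, firmware cannot know that the status block is free to be reused for the next error.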
  • 14. ACPI APEI HEST
  • 15. ACPI APEI ERST
    ● Scenario: record and retrieve errors in persistent storage
    ● ERST: Error Record Serialization Table
    ● Mechanism: abstracts the operations and provides the details necessary to communicate with on-board persistent storage
    ● Plan B: use the UEFI runtime variable services to carry out error record persistence operations
  • 16. ACPI APEI EINJ
    ● Scenario: test the OSPM error handling stack
    ● EINJ: Error Injection Table
    ● Mechanism: abstracts the operations and provides a generic interface through which OSPM can inject hardware errors into the platform without requiring platform-specific software.
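
One way to exercise EINJ in practice is through the debugfs interface that the Linux kernel's EINJ driver provides (files such as `param1`, `param2`, `error_type` and `error_inject` under `/sys/kernel/debug/apei/einj`). The slides do not describe this interface, so the sketch below is an assumption about how a test script might drive it; the function name and its parameters are hypothetical. It should only ever be run as root on a disposable test machine, since it asks the platform to inject a real hardware error.

```python
from pathlib import Path

EINJ_DIR = "/sys/kernel/debug/apei/einj"

# Error-type value for a correctable memory error in the ACPI EINJ table.
MEMORY_CORRECTABLE = 0x00000008


def inject_error(error_type, address=None, mask=None, root=EINJ_DIR):
    """Write the EINJ debugfs files in order and return the writes performed."""
    einj = Path(root)
    writes = []
    if address is not None:
        writes.append(("param1", f"0x{address:x}"))  # target physical address
    if mask is not None:
        writes.append(("param2", f"0x{mask:x}"))     # address mask
    writes.append(("error_type", f"0x{error_type:x}"))
    writes.append(("error_inject", "1"))  # written last: triggers the injection
    for name, value in writes:
        (einj / name).write_text(value + "\n")
    return writes
```

The `root` parameter lets the function be pointed at a scratch directory for dry runs. A real test would first read `available_error_type` to see which injections the platform supports, then check the kernel log or rasdaemon output to confirm that the injected error travelled through the whole reporting stack.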
  • 17. RAS on ARM64
    ● Architectural support for RAS is not available, but it is not needed.
    ● In other words, there is no need to follow the same historical path as other architectures.
    ● Focus should be on Platform Firmware handling of errors.
    ● Reporting should be through standard methods like ACPI.
    ● Will possibly need to implement kernel-relevant error handling based on information received from Platform Firmware.
  • 18. Current Work
    ● Add support for ACPI RAS features.
    ● Testing Platform Firmware to OS interface.
    ● No platform-specific RAS feature testing.
    ● Using modified QEMU for testing.
  • 19. Future Work
    ● Finish ACPI implementation.
    ● Investigate kernel handling of poisoned pages and processes.
    ● Investigate I/O-related error handling in the Kernel.
  • 21. Thank You
    #LAS16
    For further information: www.linaro.org
    LAS16 keynotes and videos on: connect.linaro.org