The document discusses implementing PCIe Address Translation Services (ATS) in ARM-based systems-on-chip (SoCs). It describes an example ARM server system with various components like CPUs, memory controllers, and I/O devices. It then explains how ATS improves memory access performance by allowing devices to cache address translations locally instead of relying solely on the IOMMU. The document outlines the typical components involved in ATS, such as the address translation cache (ATC), the translating agent (TA), and the address translation and protection table (ATPT). It also describes how the ARM System MMU (SMMU) implements ATS and supports distributed address translation caching by endpoints.
This document provides an agenda and overview for a hands-on lab on using DPDK in containers. It introduces Linux containers and how they use fewer system resources than VMs. It discusses how containers still use the kernel network stack, which is not ideal for SDN/NFV use cases, and how DPDK can be used in containers to address this. The hands-on lab section guides users through building DPDK and Open vSwitch, configuring them to work with containers, and running packet generation and forwarding using testpmd and pktgen Docker containers connected via Open vSwitch.
This document discusses debugging the ACPI subsystem in the Linux kernel. It provides an overview of the ACPI subsystem and components like ACPICA and the namespace. It describes how to enable ACPI debug logging via the acpi.debug_layer and acpi.debug_level kernel parameters. It also covers overriding ACPI definition block tables and tracing ACPI temperature as a debugging case study.
DPDK is a set of drivers and libraries that allow applications to bypass the Linux kernel and access network interface cards directly for very high performance packet processing. It is commonly used for software routers, switches, and other network applications. DPDK can achieve over 11 times higher packet forwarding rates than applications using the Linux kernel network stack alone. While it provides best-in-class performance, DPDK also has disadvantages like reduced security and isolation from standard Linux services.
The Forefront of the Development for NVDIMM on Linux Kernel (Linux Plumbers c...) - Yasunori Goto
The document summarizes the current status and issues with NVDIMM support on the Linux kernel, specifically with Filesystem Direct Access (DAX). It discusses two main issues that have made Filesystem DAX experimental: 1) updating metadata when data is written directly without going through the page cache, and 2) unbinding a namespace that is currently in use can forcibly remove it. It also covers other challenges like enabling/disabling DAX on a per-inode basis and supporting copy-on-write features with a DAX filesystem. The presentation will go into more details on solving specific issues like supporting reflink/dedupe for filesystem DAX and fixing reverse mapping with NVDIMM.
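To make the DAX model concrete, here is a minimal user-space sketch (not from the presentation) of mapping a file on a DAX-capable filesystem with MAP_SYNC, so that stores reach persistent media without ever touching the page cache; the mount point /mnt/pmem and the file name are assumptions.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical file on a DAX-mounted filesystem (e.g. ext4/XFS with -o dax). */
    int fd = open("/mnt/pmem/data", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    /* MAP_SYNC (valid only with MAP_SHARED_VALIDATE) guarantees the page
     * tables point directly at persistent memory, bypassing the page cache. */
    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                   MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    memcpy(p, "hello", 6);
    /* Portable way to make the store durable; a PMEM library would flush
     * CPU caches directly (e.g. CLWB) instead. */
    msync(p, 4096, MS_SYNC);

    munmap(p, 4096);
    close(fd);
    return 0;
}
```

The metadata issue described in the summary arises precisely because such writes never pass through the page cache, where the kernel would normally observe them.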
The IBM POWER10 processor represents the 10th generation of the POWER family of enterprise computing engines. Its performance is a result of both powerful processing cores and high-bandwidth intra- and inter-chip interconnect. POWER10 systems can be configured with up to 16 processor chips and 1920 simultaneous threads of execution. Cross-system memory sharing, through the new Memory Inception technology, and 2 Petabytes of addressing space support an expansive memory system. The POWER10 processing core has been significantly enhanced over its POWER9 predecessor, including a doubling of vector units and the addition of an all-new matrix math engine. Throughput gains from POWER9 to POWER10 average 30% at the core level and three-fold at the socket level. Those gains can reach ten- or twenty-fold at the socket level for matrix-intensive computations.
OSIC tech talk presentation on Ironic Inspector - Annie Lezil
The document discusses the Ironic Inspector service, which discovers hardware properties of bare metal nodes through introspection. It describes the general workflow where a node is enrolled, brought to a manageable state, and has its hardware inventory collected by booting an Ironic Python Agent ramdisk. The collected data is processed by plugins, stored in Swift according to introspection rules, and used to update the node's properties. It also covers capabilities detection, optional collectors like Dmidecode and Biosdevname, and CLI commands for managing introspection.
ACRN vMeet-Up EU 2021 - shared memory based inter-VM communication introduction - Project ACRN
This document discusses shared memory based inter-VM communication using Ivshmem in ACRN. It provides an overview of Ivshmem, describes the DM-Land and HV-Land architectures in ACRN for emulating the Ivshmem device, discusses limitations of the HV-Land approach, and provides examples of configuring and using Ivshmem for inter-VM shared memory communication.
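As a rough illustration of the guest-side view (not taken from the slides), an ivshmem device exposes its shared memory region as PCI BAR2, which a guest application can mmap through sysfs; the PCI address 0000:00:04.0 below is a placeholder assumption.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical PCI address of the ivshmem device inside the guest;
     * resource2 corresponds to BAR2, the shared memory itself. */
    const char *res = "/sys/bus/pci/devices/0000:00:04.0/resource2";

    int fd = open(res, O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the whole region; every VM attached to the same ivshmem
     * backend sees the same bytes. */
    volatile char *shm = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
    if (shm == MAP_FAILED) { perror("mmap"); return 1; }

    shm[0] = 1; /* immediately visible to the peer VM */

    munmap((void *)shm, st.st_size);
    close(fd);
    return 0;
}
```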
Profiling the ACPICA Namespace and Event Handling - SUSE Labs Taipei
This document summarizes a presentation about profiling the ACPI namespace and event handling. It discusses the key components of ACPICA and how it interacts with the Linux kernel. It describes how definition blocks are parsed to build the ACPI namespace, and how fixed events and general purpose events (GPEs) are handled through event detection and dispatching to handlers in the kernel or by evaluating control methods.
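For a flavor of the "evaluating control methods" path, here is a hedged in-kernel sketch using the Linux ACPICA helper acpi_evaluate_integer() to run a thermal zone's _TMP method; the wrapper function name is invented for illustration.

```c
#include <linux/acpi.h>
#include <linux/errno.h>

/* Sketch (kernel context): evaluate a thermal zone's _TMP control method.
 * _TMP returns the temperature in tenths of a degree Kelvin. */
static int read_tz_temp(acpi_handle tz, unsigned long long *tenths_kelvin)
{
    /* acpi_evaluate_integer() walks the namespace, executes the AML
     * method, and unpacks the returned integer object for us. */
    acpi_status status = acpi_evaluate_integer(tz, "_TMP", NULL,
                                               tenths_kelvin);

    return ACPI_SUCCESS(status) ? 0 : -EIO;
}
```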
SFO15-TR9: PSCI, ACPI (and UEFI to boot)
Speaker: Bill Fletcher
Date: September 24, 2015
★ Session Description ★
An introductory session giving a system-level overview of the Power State Coordination Interface (PSCI)
- Focus on ARMv8
- Goes top-down from ACPI
- A demo based on the current code in qemu
- The specifications are very dynamic - what’s ongoing for ACPI and PSCI
★ Resources ★
Video: https://www.youtube.com/watch?v=vXzPdpaZVto
Presentation: http://www.slideshare.net/linaroorg/sfo15tr9-psci-acpi-and-uefi-to-boot
Etherpad: pad.linaro.org/p/sfo15-tr9
Pathable: https://sfo15.pathable.com/meetings/303087
★ Event Details ★
Linaro Connect San Francisco 2015 - #SFO15
September 21-25, 2015
Hyatt Regency Hotel
http://www.linaro.org
http://connect.linaro.org
This document discusses adding PCI Express support and new chipset emulation to QEMU. It introduces a new Q35 chipset emulator with support for 64-bit BARs, PCIe MMCONFIG, and multiple PCI buses and slots. Future work includes improving PCIe hotplug, passthrough and power management, as well as switching the BIOS to SeaBIOS and improving ACPI table support. The goal is to modernize QEMU's emulation of PCI features to match the capabilities of newer hardware.
Accelerating Virtual Machine Access with the Storage Performance Development ... - Michelle Holley
Abstract: Although new non-volatile media inherently offers very low latency, remote access using protocols such as NVMe-oF, and presenting the data to VMs via virtualized interfaces such as virtio, adds considerable software overhead. One way to reduce the overhead is to use the Storage Performance Development Kit (SPDK), an open-source software project that provides building blocks for scalable and efficient storage applications with breakthrough performance. Comparing the software paths for virtualizing block storage I/O illustrates the advantages of the SPDK-based approach. Empirical data shows that using SPDK can improve CPU efficiency by up to 10x and reduce latency by up to 50% over existing methods. Future enhancements for SPDK will make its advantages even greater.
Speaker Bio: Anu Rao is a product line manager for storage software in the Data Center Group. She helps customers ease into and adopt open-source storage software like the Storage Performance Development Kit (SPDK) and the Intelligent Storage Acceleration Library (ISA-L).
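As a hedged sketch of what an SPDK application skeleton looks like (the abstract contains no code, and spdk_app_opts_init()'s signature gained a size argument in newer SPDK releases), the framework boots its per-core event reactors and then hands control to a start callback:

```c
#include "spdk/event.h"

/* Called on the main reactor once the framework is up. */
static void hello_start(void *arg1)
{
    /* A real application would open a bdev and start issuing I/O here. */
    spdk_app_stop(0);
}

int main(int argc, char **argv)
{
    struct spdk_app_opts opts = {};

    spdk_app_opts_init(&opts); /* newer SPDK: spdk_app_opts_init(&opts, sizeof(opts)) */
    opts.name = "hello_spdk";

    /* Boots the event framework (a polling reactor per core),
     * then invokes hello_start. */
    int rc = spdk_app_start(&opts, hello_start, NULL);

    spdk_app_fini();
    return rc;
}
```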
This presentation introduces the Data Plane Development Kit (DPDK), giving an overview and the basics. It is part of a Network Programming Series.
First, the presentation focuses on the network performance challenges of modern systems by comparing modern CPUs with 10 Gbps Ethernet links. Then it touches on the memory hierarchy and kernel bottlenecks.
The following part explains the main DPDK techniques, like polling, bursts, hugepages and multicore processing.
The DPDK overview explains how a DPDK application is initialized and run, and touches on lockless queues (rte_ring), memory pools (rte_mempool), memory buffers (rte_mbuf), hashes (rte_hash), cuckoo hashing, the longest prefix match library (rte_lpm), poll mode drivers (PMDs) and the kernel NIC interface (KNI).
At the end, there are a few DPDK performance tips.
Tags: access time, burst, cache, dpdk, driver, ethernet, hub, hugepage, ip, kernel, lcore, linux, memory, pmd, polling, rss, softswitch, switch, userspace, xeon
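A minimal sketch (not from the slides) of the polling and burst techniques listed above, assuming port 0 has already been configured and started via the usual rte_eth_dev_configure()/rte_eth_rx_queue_setup()/rte_eth_dev_start() sequence, which is omitted for brevity:

```c
#include <stdlib.h>
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST_SIZE 32

int main(int argc, char **argv)
{
    /* Initialize the Environment Abstraction Layer (hugepages, lcores, ...). */
    if (rte_eal_init(argc, argv) < 0)
        exit(EXIT_FAILURE);

    uint16_t port = 0; /* assumed already configured and started */
    struct rte_mbuf *bufs[BURST_SIZE];

    for (;;) {
        /* Poll the NIC: no interrupts, no syscalls; packets arrive in bursts. */
        uint16_t nb_rx = rte_eth_rx_burst(port, 0, bufs, BURST_SIZE);
        if (nb_rx == 0)
            continue;

        /* Echo the burst back out of the same port. */
        uint16_t nb_tx = rte_eth_tx_burst(port, 0, bufs, nb_rx);

        /* Free any mbufs the TX queue could not accept. */
        for (uint16_t i = nb_tx; i < nb_rx; i++)
            rte_pktmbuf_free(bufs[i]);
    }
}
```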
The Linux Block Layer - Built for Fast Storage - Kernel TLV
The arrival of flash storage introduced a radical change in the performance profiles of direct-attached devices. At the time, it was obvious that the Linux I/O stack needed to be redesigned in order to support devices capable of millions of IOPS with extremely low latency.
In this talk we revisit the changes to the Linux block layer over the last decade or so that made it what it is today: a performant, scalable, robust and NUMA-aware subsystem. In addition, we cover the new NVMe over Fabrics support in Linux.
Sagi Grimberg
Sagi is Principal Architect and co-founder at LightBits Labs.
Kirill Tsym discusses Vector Packet Processing:
* Linux Kernel data path (in short), initial design, today's situation, optimization initiatives
* Brief overview of DPDK, Netmap, etc.
* Userspace Networking projects comparison: OpenFastPath, OpenSwitch, VPP.
* Introduction to VPP: architecture, capabilities and optimization techniques.
* Basic Data Flow and introduction to vectors.
* VPP Single and Multi-thread modes.
* Router and switch for namespaces example.
* VPP L4 protocol processing - Transport Layer Development Kit.
* VPP Plugins.
Kirill is a software developer at Check Point Software Technologies, part of the Next Generation Gateway and Architecture team, developing proofs of concept around DPDK and FD.IO VPP. He has years of experience in software, Linux kernel and networking development, and worked for Polycom, Broadcom and Qualcomm before joining Check Point.
The document discusses QEMU and adding a new device to it. It begins with an introduction to QEMU and its uses. It then discusses setting up a development environment, compiling QEMU, and examples of existing devices. The main part explains how to add a new "Devix" device by creating source files, registering the device type, initializing PCI configuration, and registering memory regions. It demonstrates basic functionality like interrupts and I/O access callbacks. The goal is to introduce developing new emulated devices for QEMU.
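A condensed, hedged sketch of the QOM boilerplate such a "Devix" PCI device involves (the summary names no code); the vendor/device IDs are placeholders and the exact macros differ between QEMU releases:

```c
#include "qemu/osdep.h"
#include "hw/pci/pci.h"

#define TYPE_DEVIX "devix"
#define DEVIX(obj) OBJECT_CHECK(DevixState, (obj), TYPE_DEVIX)

typedef struct DevixState {
    PCIDevice parent_obj;   /* must be first: Devix is-a PCIDevice */
    MemoryRegion mmio;
} DevixState;

/* I/O access callbacks for the device's MMIO BAR. */
static uint64_t devix_mmio_read(void *opaque, hwaddr addr, unsigned size)
{
    return 0; /* register reads would be decoded here */
}

static void devix_mmio_write(void *opaque, hwaddr addr,
                             uint64_t val, unsigned size)
{
    /* register writes handled here; raising an interrupt would use
     * pci_set_irq() on the device */
}

static const MemoryRegionOps devix_mmio_ops = {
    .read  = devix_mmio_read,
    .write = devix_mmio_write,
};

static void devix_realize(PCIDevice *pdev, Error **errp)
{
    DevixState *s = DEVIX(pdev);

    /* Register a 4 KiB MMIO region exposed to the guest as BAR 0. */
    memory_region_init_io(&s->mmio, OBJECT(pdev), &devix_mmio_ops, s,
                          "devix-mmio", 0x1000);
    pci_register_bar(pdev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->mmio);
}

static void devix_class_init(ObjectClass *klass, void *data)
{
    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);

    k->realize   = devix_realize;
    k->vendor_id = 0x1234;  /* placeholder IDs, not officially assigned */
    k->device_id = 0x5678;
}

static const TypeInfo devix_info = {
    .name          = TYPE_DEVIX,
    .parent        = TYPE_PCI_DEVICE,
    .instance_size = sizeof(DevixState),
    .class_init    = devix_class_init,
    .interfaces    = (InterfaceInfo[]) {
        { INTERFACE_CONVENTIONAL_PCI_DEVICE },
        { },
    },
};

static void devix_register_types(void)
{
    type_register_static(&devix_info);
}

type_init(devix_register_types)
```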
Enable DPDK and SR-IOV for containerized virtual network functions with zun - heut2008
Zun is an OpenStack service that manages containers as first-class resources without relying on virtual machines. The document discusses enabling DPDK and SR-IOV support in Zun to accelerate containerized network functions (NFV). It outlines challenges in using containers for NFV and how Zun addresses gaps. Benchmark tests show containers leveraging DPDK and SR-IOV through Zun can achieve near-physical server performance for networking workloads.
The valve timing of modern automobiles is not constant; it varies with speed and load. This slide deck describes the modern Variable Valve Timing system and its controls.
Variable valve timing allows the valve opening and closing points to be adjusted based on engine speed and load conditions. This improves engine performance and fuel economy over a fixed valve timing system. Variable valve timing is achieved through advance/retard systems that adjust the camshaft timing using a movable tensioner controlled by the engine control module. Another method uses multiple cam profiles, activated by oil pressure, to vary the valve lift and duration. Variable valve timing provides benefits like better fuel efficiency, torque, and emissions, but at higher cost and complexity than a standard camshaft setup.
This document defines functions for registering a Quagga routing daemon module with a gRPC server and handling configuration changes. It registers the "quaggad" module and a show command via gRPC. It also defines a function to subscribe to configuration changes and send updates to a gRPC stream. When a change is received, the quaggaConfig function validates and applies the change or synchronizes state.
YANG uses a C-like syntax, chosen for its readability; this section introduces that syntax. While SMIv1, SMIv2, and SPPI are bound to specific protocols like SNMP and COPS-PR, the purpose of SMIng is to define a common data definition language that can specify data models independently of any protocol.
3. Benefits
• They are capable of having traffic routed to them
• They are capable of passing routing protocols over them
• They do not require local or remote subnets to be specified
• They operate as if the peer interfaces are directly connected
(Quoted from the Vyatta VPN Reference Guide 6.5R1 v01)