Note: When you view the slide deck in a web browser, the screenshots may be blurred. You can download the deck and view it offline; the screenshots are clear.
Process Address Space: The way to create virtual address (page table) of userspace application - Adrian Huang
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru... - Adrian Huang
The document discusses four physical memory models in Linux: flat memory model, discontinuous memory model, sparse memory model, and sparse memory virtual memmap. It describes how each model addresses physical memory (page frames) and maps them to page descriptors. The sparse memory model is currently used, using memory sections to allocate page structures and support memory hotplug. It initializes by walking memory ranges from memblocks and allocating/initializing mem_section data structures.
Memory Mapping Implementation (mmap) in Linux Kernel - Adrian Huang
The document describes a memory management system using memory folios to address problems with legacy page caching and compound pages. Memory folios provide a unified interface for accessing pages and simplify operations on high-order and compound pages. Folios also improve page cache performance by maintaining a shorter LRU list with one entry per folio rather than per page.
Reverse Mapping (rmap) in Linux Kernel - Adrian Huang
Decompressed vmlinux: linux kernel initialization from page table configurati... - Adrian Huang
Talk about how Linux kernel initializes the page table.
The document summarizes Linux synchronization mechanisms including semaphores and mutexes. It discusses:
1. Semaphores can be used to solve producer-consumer problems and are implemented using a count and wait list.
2. Mutexes enforce serialization on shared memory and have fast, mid, and slow paths for lock and unlock. The mid path uses optimistic spinning and OSQs.
3. Only the lock owner can unlock a mutex, and mutexes transition to the slow path if the owner is preempted, a spinner is preempted, or the owner sleeps.
qemu + gdb + sample_code: Run sample code in QEMU OS and observe Linux Kernel... - Adrian Huang
This document describes setting up a QEMU virtual machine running Ubuntu 20.04.1 to debug Linux kernel code with gdb. The VM has a 2-socket CPU configuration and 16 GB of memory, with KASAN and ASLR disabled. It can be used to run sample code and observe Linux kernel behavior under gdb, such as setting conditional breakpoints to analyze page-fault behavior for mmap addresses by referencing a gdb debugging text file.
Page cache mechanism in Linux kernel.
Anatomy of the loadable kernel module (lkm) - Adrian Huang
Talk about how Linux kernel invokes your module's init function.
Vmlinux: anatomy of bzimage and how x86 64 processor is booted - Adrian Huang
This slide deck describes the Linux booting flow for x86_64 processors.
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt (Anne Nicolas)
Ftrace is the official tracer of the Linux kernel. It has been a part of Linux since 2.6.31 and has grown tremendously ever since. Ftrace's name comes from its most powerful feature: function tracing. But the ftrace infrastructure is much more than that. It also encompasses the trace events used by perf, as well as kprobes, which can dynamically add trace events that the user defines.
This talk will focus on learning how the kernel works by using the ftrace infrastructure. It will show how to see what happens within the kernel during a system call, how interrupts work, how one's processes are being scheduled, and more. A quick introduction to tools like trace-cmd and KernelShark will also be demonstrated.
Steven Rostedt, VMware
The document provides an overview of the initialization phase of the Linux kernel. It discusses how the kernel enables paging to transition from physical to virtual memory addresses. It then describes the various initialization functions that are called by start_kernel to initialize kernel features and architecture-specific code. Some key initialization tasks discussed include creating an identity page table, clearing BSS, and reserving BIOS memory.
Linux Kernel Booting Process (2) - For NLKB - shimosawa
Describes the bootstrapping part in Linux, and related architectural mechanisms and technologies.
This is the part two of the slides, and the succeeding slides may contain the errata for this slide.
The document discusses the Linux networking architecture and covers several key topics:
It first describes the basic structure and layers of the Linux networking stack including the network device interface, network layer protocols like IP, transport layer, and sockets. It then discusses how network packets are managed in Linux through the use of socket buffers and associated functions. The document also provides an overview of the data link layer and protocols like Ethernet, PPP, and how they are implemented in Linux.
Linux Synchronization Mechanism: RCU (Read Copy Update) - Adrian Huang
RCU (Read-Copy-Update) is a synchronization mechanism that allows for lock-free reads with concurrent updates. It achieves this through a combination of temporal and spatial synchronization. Temporal synchronization uses rcu_read_lock() and rcu_read_unlock() for readers, and synchronize_rcu() or call_rcu() for updaters. Spatial synchronization uses rcu_dereference() for readers to safely load pointers, and rcu_assign_pointer() for updaters to safely update pointers. RCU guarantees that readers will either see the old or new version of data, but not a partially updated version.
Understanding of linux kernel memory model - SeongJae Park
SeongJae Park introduces himself and his work contributing to the Linux kernel memory model documentation. He developed a guaranteed contiguous memory allocator and maintains the Korean translation of the kernel's memory barrier documentation. The document discusses how the increasing prevalence of multi-core processors requires careful programming to ensure correct parallel execution given relaxed memory ordering. It notes that compilers and CPUs optimize for instruction throughput over programmer goals, and memory accesses can be reordered in ways that affect correctness on multi-processors. Understanding the memory model is important for writing high-performance parallel code.
The document discusses ioremap and mmap functions in Linux for mapping physical addresses into the virtual address space. Ioremap is used when physical addresses are larger than the virtual address space size. It maps physical addresses to virtual addresses that can be accessed by the CPU. Mmap allows a process to map pages of a file into virtual memory. It is useful for reducing memory copies and improving performance of file read/write operations. The document outlines the functions, flags, and flows of ioremap, mmap, and implementing a custom mmap file operation for direct physical memory mapping.
Virtual File System in Linux Kernel
Viller Hsiao presents information on Linux vsyscall and vDSO. vDSO (virtual dynamic shared object) is mapped into userspace and contains implementations of common system calls to make them faster. It gets benefits from ASLR and allows additional system calls compared to the older vsyscall method. The kernel generates the vDSO shared object which is then loaded and accessed by the glibc dynamic linker to provide optimized system call implementations to applications.
This document discusses Linux memory management. It outlines the buddy system, zone allocation, and slab allocator used by Linux to manage physical memory. It describes how pages are allocated and initialized at boot using the memory map. The slab allocator is used to optimize allocation of kernel objects and is implemented as caches of fixed-size slabs and objects. Per-CPU allocation improves performance by reducing locking and cache invalidations.
The document provides an overview of the initialization process in the Linux kernel from start_kernel to rest_init. It lists the functions called during this process organized by category including functions for initialization of multiprocessor support (SMP), memory management (MM), scheduling, timers, interrupts, and architecture specific setup. The setup_arch section focuses on x86 architecture specific initialization functions such as reserving memory regions, parsing boot parameters, initializing memory mapping and MTRRs.
This document discusses various methods for allocating memory in the Linux kernel, including kmalloc, get_free_page, vmalloc, slab allocators, and memory pools. It explains that kmalloc allocates contiguous virtual and physical pages but can degrade performance for large allocations. get_free_page allocates physical pages without clearing memory. Slab allocators improve performance by grouping allocations of similar sizes. vmalloc allocates virtually contiguous regions that are not necessarily physically contiguous. Memory pools reserve memory to guarantee allocation success but can be wasteful.
The document provides an overview of the Linux boot process from power-on to starting the kernel. It discusses:
- What bootloaders like U-Boot and GRUB do to initialize hardware and load the kernel
- The differences between ARM and x86 boot processes
- How the kernel starts as PID 1 without a userspace environment
- What initrds are and how to explore one
- How to get more debug messages from the kernel boot process
- Tips for learning more using QEMU, GDB and systemd-bootchart
The document examines the boot process in detail from a low-level perspective using tools like GDB and binutils to disassemble binaries.
This document provides an overview of how to deploy a SQL Server 2019 Big Data Cluster on Kubernetes. It discusses setting up infrastructure with Ubuntu templates, installing Kubespray to manage the Kubernetes cluster lifecycle, and using azdata to deploy the Big Data Cluster. Key steps include creating an Ansible inventory, configuring storage with labels and profiles, and deploying the cluster. The document also offers tips on sizing, upgrades, and next steps like load balancing and monitoring.
CHI provides a standard interface and implementation for caching in Perl modules. It aims to improve on existing solutions like Cache::Cache by offering better performance and extensibility. CHI allows modules to easily implement caching by requesting a handle to any backend cache. It also provides a common place to implement generic caching features. Current supported backends include memory, file, memcached, and BerkeleyDB caches. Driver development is simplified through a skeleton interface.
The document provides instructions for setting up a Kubernetes cluster with one master node and one worker node on VirtualBox. It outlines the system requirements for the nodes, describes how to configure the networking and hostnames, install Docker and Kubernetes, initialize the master node with kubeadm init, join the worker node with kubeadm join, and deploy a test pod. It also includes commands to check the cluster status and remove existing Docker installations.
As part of the Google Summer of Code, we tried to add support for SeaBIOS in order to allow guest OSes to be booted directly from PV disk devices rather than from the emulated disk device. SeaBIOS is the BIOS implementation that upstream qemu uses. When the virtual machine is created, SeaBIOS upon initialization uses a generic Xenstore client to communicate with the back end and initialize the front-end block device that will connect to the back end. After the connection is established I/O requests are made via the BIOS int 0x13 interface, guest OSes use the int 0x13 without needing to be aware that PV drivers were used.
About Docker cluster management tools:
1. Base concepts of cluster management and Docker
2. Docker Swarm
3. Amazon EC2 Container Service
4. Kubernetes
5. Mesosphere
Have you recently started working with Spark, and do your jobs take forever to finish? This presentation is for you.
Himanshu Arora and Nitya Nand YADAV have gathered the many best practices, optimizations and tweaks they have applied in production over the years to make their jobs faster and less resource-hungry.
In this presentation, they cover advanced Spark optimization techniques, data serialization formats, storage formats, hardware optimizations, control over parallelism, resource-manager settings, improved data locality, GC tuning, and more.
They also show the appropriate use of RDD, DataFrame and Dataset in order to benefit fully from Spark's internal optimizations.
The document discusses several key aspects of processes and memory management in Linux:
1. A process is represented by a task_struct structure that contains information like the process ID, open files, address space, and state.
2. Each process has both a user stack and kernel stack. The kernel stack is fixed size for safety and to prevent fragmentation.
3. Process duplication is done through fork(), vfork(), and clone() system calls. Fork uses copy-on-write to efficiently duplicate the process.
4. Memory allocation for kernel structures like task_struct uses slab allocators to improve performance over the buddy allocator through object caching and reuse.
The document discusses several key aspects of processes and memory management in Linux:
1. A process is represented by a task_struct structure that contains information like the process ID, open files, address space, state, and stack.
2. Processes have both a user stack and a fixed-size kernel stack. Context switches occur when switching between these stacks for system calls or exceptions.
3. The fork() system call duplicates a process by using copy-on-write techniques to efficiently copy resources from the parent process.
4. Memory allocation for kernel objects like task_struct uses slab allocators to improve performance over the buddy allocator through object caching and reducing initialization overhead.
The document discusses various methods for capturing a kernel crash dump (vmcore) file when the Linux kernel panics or a system hangs. It describes (1) kdump and how it uses kexec to boot a capture kernel to dump memory on a panic, (2) triggering a panic manually using SysRq keys or NMI, and (3) tools for dumping memory on physical and virtual systems during a hang.
This document summarizes Lighttpd & Modcache, an event-driven web server and caching module. Lighttpd is lightweight and has a simple module structure. Modcache caches files locally and in memory to improve performance. It has advantages over Squid, such as being Lighttpd-based and keeping the code simple. The document provides configuration examples for Modcache and cookbooks for caching images, downloads, forums, and video sites.
Memcached is a high-performance, distributed memory caching system that is used to speed up dynamic web applications by caching objects in memory to reduce database load. It works by storing objects in memory to allow for fast retrieval, improving response times significantly. Major companies that use memcached include Facebook, Yahoo, Amazon, and LiveJournal. It provides features like consistent hashing for object distribution, multithreading, and replication.
From common errors seen in running Spark applications, e.g., OutOfMemory, NoClassFound, disk IO bottlenecks, History Server crash, cluster under-utilization to advanced settings used to resolve large-scale Spark SQL workloads such as HDFS blocksize vs Parquet blocksize, how best to run HDFS Balancer to re-distribute file blocks, etc. you will get all the scoop in this information-packed presentation.
Devoxx Fr 2022 - Remèdes aux oomkill, warm-ups, et lenteurs pour des conteneu... - Jean-Philippe BEMPEL
Our JVM containers are in production; oops, they get OOM-killed; oops, startup drags on forever; oops, they are slow all the time. We have lived through these situations.
These problems arise because a container is by nature a constrained environment. Its configuration has an impact on the Java process, but that process also has needs of its own in order to run.
There is a gap between the Java heap and the RSS: off-heap memory, which breaks down into several zones. What are they for? How should they be taken into account?
The CPU configuration affects the JVM in several ways: how do the GC and the CPU influence each other? Should you favor startup speed or CPU consumption at startup?
In this University session we will see how to diagnose, understand and remedy these problems.
Delve Labs was present during the GoSec 2016 conference, where our lead DevOps engineer presented an overview of the current options available for securing Docker in production environments.
https://www.delve-labs.com
Perl Memory Use 201207 (OUTDATED, see 201209) - Tim Bunce
This document discusses Perl memory use and provides an overview of how Perl processes use and manage memory. It identifies key issues and complications with Perl memory and outlines useful tools for analyzing Perl memory usage, such as various Perl modules and Linux commands. The document focuses on the Linux operating system and dives into details of how Perl handles memory on both the process and system level, including memory mapping, page usage, and how Perl stores and manages data in memory.
The document describes deploying Cosmos DB resources using Terraform in Azure. It outlines prerequisites, environment details, and the configuration files and process used to create a resource group, Cosmos DB account, database, and collection. The main.tf file defines these resources, variables.tf contains configurable values, and output.tf displays output after deployment. Running terraform init and terraform plan commands prepares for deploying the resources.
Graspan: A Big Data System for Big Code Analysis - Aftab Hussain
We built a disk-based parallel graph system, Graspan, that uses a novel edge-pair centric computation model to compute dynamic transitive closures on very large program graphs.
We implement context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases such as Linux shows that their Graspan implementations scale to millions of lines of code and are much simpler than their original implementations.
These analyses were used to augment the existing checkers; these augmented checkers found 132 new NULL pointer bugs and 1308 unnecessary NULL tests in Linux 4.4.0-rc5, PostgreSQL 8.3.9, and Apache httpd 2.2.18.
- Accepted in ASPLOS ‘17, Xi’an, China.
- Featured in the tutorial, Systemized Program Analyses: A Big Data Perspective on Static Analysis Scalability, ASPLOS ‘17.
- Invited for presentation at SoCal PLS ‘16.
- Invited for poster presentation at PLDI SRC ‘16.
Odoo ERP software
Odoo ERP software, a leading open-source software for Enterprise Resource Planning (ERP) and business management, has recently launched its latest version, Odoo 17 Community Edition. This update introduces a range of new features and enhancements designed to streamline business operations and support growth.
The Odoo Community serves as a cost-free edition within the Odoo suite of ERP systems. Tailored to accommodate the standard needs of business operations, it provides a robust platform suitable for organisations of different sizes and business sectors. Within the Odoo Community Edition, users can access a variety of essential features and services essential for managing day-to-day tasks efficiently.
This blog presents a detailed overview of the features available within the Odoo 17 Community edition, and the differences between Odoo 17 community and enterprise editions, aiming to equip you with the necessary information to make an informed decision about its suitability for your business.
OpenMetadata Community Meeting - 5th June 2024OpenMetadata
The OpenMetadata Community Meeting was held on June 5th, 2024. In this meeting, we discussed about the data quality capabilities that are integrated with the Incident Manager, providing a complete solution to handle your data observability needs. Watch the end-to-end demo of the data quality features.
* How to run your own data quality framework
* What is the performance impact of running data quality frameworks
* How to run the test cases in your own ETL pipelines
* How the Incident Manager is integrated
* Get notified with alerts when test cases fail
Watch the meeting recording here - https://www.youtube.com/watch?v=UbNOje0kf6E
WhatsApp offers simple, reliable, and private messaging and calling services for free worldwide. With end-to-end encryption, your personal messages and calls are secure, ensuring only you and the recipient can access them. Enjoy voice and video calls to stay connected with loved ones or colleagues. Express yourself using stickers, GIFs, or by sharing moments on Status. WhatsApp Business enables global customer outreach, facilitating sales growth and relationship building through showcasing products and services. Stay connected effortlessly with group chats for planning outings with friends or staying updated on family conversations.
Takashi Kobayashi and Hironori Washizaki, "SWEBOK Guide and Future of SE Education," First International Symposium on the Future of Software Engineering (FUSE), June 3-6, 2024, Okinawa, Japan
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxrickgrimesss22
Discover the essential features to incorporate in your Winzo clone app to boost business growth, enhance user engagement, and drive revenue. Learn how to create a compelling gaming experience that stands out in the competitive market.
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Łukasz Chruściel
No one wants their application to drag like a car stuck in the slow lane! Yet it’s all too common to encounter bumpy, pothole-filled solutions that slow the speed of any application. Symfony apps are not an exception.
In this talk, I will take you for a spin around the performance racetrack. We’ll explore common pitfalls - those hidden potholes on your application that can cause unexpected slowdowns. Learn how to spot these performance bumps early, and more importantly, how to navigate around them to keep your application running at top speed.
We will focus in particular on tuning your engine at the application level, making the right adjustments to ensure that your system responds like a well-oiled, high-performance race car.
E-commerce Development Services- Hornet DynamicsHornet Dynamics
For any business hoping to succeed in the digital age, having a strong online presence is crucial. We offer Ecommerce Development Services that are customized according to your business requirements and client preferences, enabling you to create a dynamic, safe, and user-friendly online store.
Atelier - Innover avec l’IA Générative et les graphes de connaissancesNeo4j
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Allez au-delà du battage médiatique autour de l’IA et découvrez des techniques pratiques pour utiliser l’IA de manière responsable à travers les données de votre organisation. Explorez comment utiliser les graphes de connaissances pour augmenter la précision, la transparence et la capacité d’explication dans les systèmes d’IA générative. Vous partirez avec une expérience pratique combinant les relations entre les données et les LLM pour apporter du contexte spécifique à votre domaine et améliorer votre raisonnement.
Amenez votre ordinateur portable et nous vous guiderons sur la mise en place de votre propre pile d’IA générative, en vous fournissant des exemples pratiques et codés pour démarrer en quelques minutes.
Artificia Intellicence and XPath Extension FunctionsOctavian Nadolu
The purpose of this presentation is to provide an overview of how you can use AI from XSLT, XQuery, Schematron, or XML Refactoring operations, the potential benefits of using AI, and some of the challenges we face.
Microservice Teams - How the cloud changes the way we workSven Peters
A lot of technical challenges and complexity come with building a cloud-native and distributed architecture. The way we develop backend software has fundamentally changed in the last ten years. Managing a microservices architecture demands a lot of us to ensure observability and operational resiliency. But did you also change the way you run your development teams?
Sven will talk about Atlassian’s journey from a monolith to a multi-tenanted architecture and how it affected the way the engineering teams work. You will learn how we shifted to service ownership, moved to more autonomous teams (and its challenges), and established platform and enablement teams.
Utilocate offers a comprehensive solution for locate ticket management by automating and streamlining the entire process. By integrating with Geospatial Information Systems (GIS), it provides accurate mapping and visualization of utility locations, enhancing decision-making and reducing the risk of errors. The system's advanced data analytics tools help identify trends, predict potential issues, and optimize resource allocation, making the locate ticket management process smarter and more efficient. Additionally, automated ticket management ensures consistency and reduces human error, while real-time notifications keep all relevant personnel informed and ready to respond promptly.
The system's ability to streamline workflows and automate ticket routing significantly reduces the time taken to process each ticket, making the process faster and more efficient. Mobile access allows field technicians to update ticket information on the go, ensuring that the latest information is always available and accelerating the locate process. Overall, Utilocate not only enhances the efficiency and accuracy of locate ticket management but also improves safety by minimizing the risk of utility damage through precise and timely locates.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Do you want Software for your Business? Visit Deuglo
Deuglo has top Software Developers in India. They are experts in software development and help design and create custom Software solutions.
Deuglo follows seven steps methods for delivering their services to their customers. They called it the Software development life cycle process (SDLC).
Requirement — Collecting the Requirements is the first Phase in the SSLC process.
Feasibility Study — after completing the requirement process they move to the design phase.
Design — in this phase, they start designing the software.
Coding — when designing is completed, the developers start coding for the software.
Testing — in this phase when the coding of the software is done the testing team will start testing.
Installation — after completion of testing, the application opens to the live server and launches!
Maintenance — after completing the software development, customers start using the software.
Software Engineering, Software Consulting, Tech Lead, Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Transaction, Spring MVC, OpenShift Cloud Platform, Kafka, REST, SOAP, LLD & HLD.
What is Augmented Reality Image Trackingpavan998932
Augmented Reality (AR) Image Tracking is a technology that enables AR applications to recognize and track images in the real world, overlaying digital content onto them. This enhances the user's interaction with their environment by providing additional information and interactive elements directly tied to physical images.
E-commerce Application Development Company.pdfHornet Dynamics
Your business can reach new heights with our assistance as we design solutions that are specifically appropriate for your goals and vision. Our eCommerce application solutions can digitally coordinate all retail operations processes to meet the demands of the marketplace while maintaining business continuity.
1. * Based on kernel 5.11 (x86_64) – QEMU
* 2-socket CPUs (4 cores/socket)
* 16GB memory
* Kernel parameter: nokaslr norandmaps
* KASAN: disabled
* Userspace: ASLR is disabled
* Legacy BIOS
Slab Allocator in Linux Kernel
Adrian Huang | Aug, 2022
2. Agenda
• Slab & Buddy System
• Slab Concept
o “Chicken or the egg” problem
o Will not focus on the differences between SLAB/SLUB/SLOB: lots of info about them on the Internet.
o Note: SLUB is the default allocator since 2.6.23.
• [Init & Creation] kmem_cache_init(): Let’s check the initialization detail
o How to initialize ‘boot_kmem_cache’ and ‘boot_kmem_cache_node’?
o What are freepointers?
o Data structures: kmem_cache & kmem_cache_node
o [Comparison] Generic kmem_cache_create() & low-level implementation __kmem_cache_create()
• [Cache allocation] slab_alloc_node – Allocate a slab cache with specific node id
o Fast path & slow path
o Will follow kmem_cache_init() flow
• [Cache release/free] kmem_cache_free()
o Fast path & slow path
• kmalloc & slab
3. Slab & Buddy System
Buddy System
alloc_page(s), __get_free_page(s)
Slab Allocator
kmem_cache_alloc/kmem_cache_free
kmalloc/kfree
glibc: malloc/free
brk/mmap
. . .
vmalloc
User Space
Kernel Space
Hardware
• Balance between brk() and mmap()
• Use brk() if request size < DEFAULT_MMAP_THRESHOLD_MIN (128 KB)
o The heap can be trimmed only if memory is freed at the top end.
o sbrk() is implemented as a library function that uses the brk() system call.
• Use mmap() if request size >= DEFAULT_MMAP_THRESHOLD_MIN (128 KB)
o The allocated memory blocks can be independently released back to the system.
o Deallocated space is not placed on the free list for reuse by later allocations.
o Memory may be wasted because mmap allocations must be page-aligned, and the
kernel must perform the expensive task of zeroing out the allocated memory.
o Note: glibc uses the dynamic mmap threshold
o Detail: `man mallopt`
[glibc] malloc
4. Slab Concept
Slab Allocator
Slab Cache
(page order = 0)
Slab Cache
(page order = 1)
Page #0
slab object
slab object
slab object
slab object
…
Page #N
slab object
slab object
slab object
slab object
slab object
slab object
slab object
slab object
4 objects
per page
Page #1
slab object
…
Page #N
2 objects
per page
slab object
slab object
slab object
slab object
slab object
.
.
.
Memory (object)
allocation
* object = data structure
slab_caches
global list_head
slab
…
slab
slab
…
Page #0
slab object
slab object
1. Allocating/freeing data structures are common operations in the Linux kernel
2. Slab is a generic data structure-caching layer
10. [Init & Creation] kmem_cache_init():
Let’s check the initialization detail
• How to initialize ‘boot_kmem_cache’ and ‘boot_kmem_cache_node’?
• What are freepointers?
• Data structures: kmem_cache & kmem_cache_node
• [Comparison] Generic kmem_cache_create() & low-level implementation
__kmem_cache_create()
* Call paths are based on SLUB (the unqueued slab allocator)
11. kmem_cache_init()
start_kernel
1. Declare two static variables ‘boot_kmem_cache’ and ‘boot_kmem_cache_node’
2. Assign the addresses of two static variables to generic/global variables
kmem_cache_node = &boot_kmem_cache_node
kmem_cache = &boot_kmem_cache
__kmem_cache_create init_kmem_cache_nodes
calculate_sizes
kmem_cache_open
alloc_kmem_cache_cpus
mm_init kmem_cache_init
create_boot_cache(kmem_cache_node, …)
create_boot_cache(kmem_cache, …)
kmem_cache =
bootstrap(&boot_kmem_cache)
kmem_cache_node =
bootstrap(&boot_kmem_cache_node)
setup_kmalloc_cache_index_table
create_kmalloc_caches
__alloc_percpu
init_kmem_cache_cpus
Assign cpu id to
kmem_cache_cpu->tid
Allocate/init ‘kmem_cache_node’
structs for all NUMA nodes (sockets)
Allocate/init a ‘kmem_cache_cpu’ struct
Calculate page order and objects per slab
12. kmem_cache_node/boot_kmem_cache_node init
kmem_cache
slab Page #0
slab object #0
…
…
…
…
…
…
slab object #63
64 objects
per page
SLUB allocator: a generic data structure-caching layer
* object = data structure
object size = sizeof(struct kmem_cache_node) = 64
object size with alignment (SLAB_HWCACHE_ALIGN: 64) = 64
calculate_sizes (): order = 0
16. init_kmem_cache_nodes()
n = kmem_cache_alloc_node()
early_kmem_cache_node_alloc
init_kmem_cache_nodes
for_each_node_state()
slab_state == DOWN
init_kmem_cache_node
kmem_cache->node[node] = n
Y
N
__kmem_cache_create init_kmem_cache_nodes
calculate_sizes
kmem_cache_open
alloc_kmem_cache_cpus
__alloc_percpu
init_kmem_cache_cpus
Assign cpu id to
kmem_cache_cpu->tid
Allocate/init ‘kmem_cache_node’
structs for all NUMA nodes (sockets)
Allocate/init a ‘kmem_cache_cpu’ struct
Calculate page order and objects per slab
create_boot_cache(kmem_cache_node, …)
Special (one-shot) path:
Fix ‘chicken or the egg’ problem
Generic path
17. early_kmem_cache_node_alloc for ‘kmem_cache_node’
allocate_slab
page = new_slab(kmem_cache_node, …)
early_kmem_cache_node_alloc
page = alloc_slab_page()
page->slab_cache = s
Update freepointers for
the allocated pages
n = page->freelist
kmem_cache_node->node[node] = n
page->freelist =
get_freepointer(kmem_cache_node, n)
init_kmem_cache_node(n)
__add_partial(n, page, DEACTIVATE_TO_HEAD)
inc_slabs_node
18. page struct (slab, slob or slub)
struct list_head slab_list
struct page *next
struct kmem_cache *slab_cache
void *freelist (first free object)
s_mem: first object - SLAB
int pages
int pobjects
counters - SLUB
inuse: 16 = 64
objects:15 = 64
frozen: 1 = 1
union
struct
(SLUB)
union
struct
(Partial
Page)
kmem_cache
(kmem_cache_node)
__percpu *cpu_slab
min_partial = 5
size = 64
object_size = 64
cpu_partial = 30
min = 64
inuse = 64
align = 64
name =
“kmem_cache_node”
list
*node[MAX_NUMNODES]
max = 64
oo (order objects) = 64
allocate_slab() – freelist/freepointer configuration
offset = 32
slab
object
slab
object
..
slab
object
Page
boot_kmem_cache_node
static local variable
Reference functions: get_freepointer/set_freepointer
allocate_slab
page = new_slab(kmem_cache_node, …)
early_kmem_cache_node_alloc
page = alloc_slab_page()
page->slab_cache = s
Update freepointers for
the allocated pages
n = page->freelist
kmem_cache_node->node[node] = n
page->freelist =
get_freepointer(kmem_cache_node, n)
init_kmem_cache_node(n)
__add_partial(n, page, DEACTIVATE_TO_HEAD)
inc_slabs_node
19. page struct (slab, slob or slub)
struct list_head slab_list
struct page *next
struct kmem_cache *slab_cache
void *freelist (first free object)
s_mem: first object - SLAB
int pages
int pobjects
counters - SLUB
inuse: 16 = 64
objects:15 = 64
frozen: 1 = 1
union
struct
(SLUB)
union
struct
(Percpu
Partial
Page)
kmem_cache
(kmem_cache_node)
__percpu *cpu_slab
min_partial = 5
size = 64
object_size = 64
cpu_partial = 30
min = 64
inuse = 64
align = 64
name =
“kmem_cache_node”
list
*node[MAX_NUMNODES]
max = 64
oo (order objects) = 64
allocate_slab() – freelist/freepointer configuration
offset = 32
slab
object
slab
object
..
slab
object
Page
Page address = X
X
X + kmem_cache->size
X + kmem_cache->size
slab object #0
X + kmem_cache->size * 2
slab object #1
freepointer in slab object: points to the next free slab object
(stored in the middle of the object to mitigate buffer overflow or underflow)
NULL
slab object #n -1
point to the next
free slab object
.
.
.
X + kmem_cache->offset
boot_kmem_cache_node
static local variable
Reference functions: get_freepointer/set_freepointer
20. page struct (slab, slob or slub)
struct list_head slab_list
struct page *next
struct kmem_cache *slab_cache
void *freelist (first free object)
s_mem: first object - SLAB
int pages
int pobjects
counters - SLUB
inuse: 16 = 64 1
objects:15 = 64
frozen: 1 = 1 0
union
struct
(SLUB)
union
kmem_cache
(kmem_cache_node)
__percpu *cpu_slab
min_partial = 5
size = 64
object_size = 64
cpu_partial = 30
min = 64
inuse = 64
align = 64
name =
“kmem_cache_node”
list
*node[MAX_NUMNODES]
max = 64
oo (order objects) = 64
early_kmem_cache_node_alloc() – Get a slab object
from freelist
slab
object
slab
object
..
slab
object
Page
offset = 32
kmem_cache_node
nr_partial
struct list_head partial
nr_slabs
total_objects
list_lock
struct list_head full
Get a slab object from page->freelist
kmem_cache_node->node[node] = n
static local variable
boot_kmem_cache_node
struct
(Percpu
Partial
Page)
21. page struct (slab, slob or slub)
struct list_head slab_list
struct page *next
struct kmem_cache *slab_cache
void *freelist (first free object)
s_mem: first object - SLAB
int pages
int pobjects
counters - SLUB
inuse: 16 = 1
objects:15 = 64
frozen: 1 = 0
union
struct
(SLUB)
union
kmem_cache
(kmem_cache_node)
__percpu *cpu_slab
min_partial = 5
size = 64
object_size = 64
cpu_partial = 30
min = 64
inuse = 64
align = 64
name =
“kmem_cache_node”
list
*node[MAX_NUMNODES]
max = 64
oo (order objects) = 64
early_kmem_cache_node_alloc() – Get a slab object
from freelist
slab
object
slab
object
..
slab
object
Page
offset = 32
kmem_cache_node
nr_partial = 0
struct list_head partial
nr_slabs = 0
total_objects = 0
list_lock
struct list_head full
Get a slab object from page->freelist
kmem_cache_node->node[node] = n
static local variable
boot_kmem_cache_node
struct
(Percpu
Partial
Page)
22. page struct (slab, slob or slub)
struct list_head slab_list
struct page *next
struct kmem_cache *slab_cache
void *freelist (first free object)
s_mem: first object - SLAB
int pages
int pobjects
counters - SLUB
inuse: 16 = 1
objects:15 = 64
frozen: 1 = 0
union
struct
(SLUB)
union
kmem_cache
(kmem_cache_node)
__percpu *cpu_slab
min_partial = 5
size = 64
object_size = 64
cpu_partial = 30
min = 64
inuse = 64
align = 64
name =
“kmem_cache_node”
list
*node[MAX_NUMNODES]
max = 64
oo (order objects) = 64
early_kmem_cache_node_alloc() – Get a slab object
from freelist
slab
object
slab
object
..
slab
object
Page
offset = 32
kmem_cache_node
nr_partial = 0
struct list_head partial
nr_slabs = 0 1
total_objects = 0 64
list_lock
struct list_head full
Get a slab object from page->freelist
kmem_cache_node->node[node] = n
static local variable
boot_kmem_cache_node
struct
(Percpu
Partial
Page)
23. page struct (slab, slob or slub)
struct list_head slab_list
struct page *next
struct kmem_cache *slab_cache
void *freelist (first free object)
s_mem: first object - SLAB
int pages
int pobjects
counters - SLUB
inuse: 16 = 1
objects:15 = 64
frozen: 1 = 0
union
struct
(SLUB)
union
struct
(Percpu
Partial
Page)
kmem_cache
(kmem_cache_node)
__percpu *cpu_slab
min_partial = 5
size = 64
object_size = 64
cpu_partial = 30
min = 64
inuse = 64
align = 64
name =
“kmem_cache_node”
list
*node[MAX_NUMNODES]
max = 64
oo (order objects) = 64
early_kmem_cache_node_alloc() – Get a slab object
from freelist
slab
object
slab
object
..
slab
object
Page
offset = 32
kmem_cache_node
nr_partial = 0 1
struct list_head partial
nr_slabs = 1
total_objects = 64
list_lock
struct list_head full
Get a slab object from page->freelist
kmem_cache_node->node[node] = n
static local variable
boot_kmem_cache_node
list_head
24. page struct (slab, slob or slub)
struct list_head slab_list
struct page *next
struct kmem_cache *slab_cache
void *freelist (first free object)
s_mem: first object - SLAB
int pages
int pobjects
counters - SLUB
inuse: 16 = 1
objects:15 = 64
frozen: 1 = 0
union
struct
(SLUB)
union
struct
(Percpu
Partial
Page)
kmem_cache
(kmem_cache_node)
__percpu *cpu_slab
min_partial = 5
size = 64
object_size = 64
cpu_partial = 30
min = 64
inuse = 64
align = 64
name =
“kmem_cache_node”
list
*node[MAX_NUMNODES]
max = 64
oo (order objects) = 64
early_kmem_cache_node_alloc(): configure node #0
slab
object
slab
object
..
slab
object
Page
offset = 32
kmem_cache_node
nr_partial = 1
struct list_head partial
nr_slabs = 1
total_objects = 64
list_lock
struct list_head full
Get a slab object from page->freelist
kmem_cache_node->node[node] = n
list_head
CONFIG_SLUB_DEBUG = y
static local variable
boot_kmem_cache_node
Allocate slab page(s) for kmem_cache_node struct allocation: fix the “chicken or the egg” problem
25. page (slab, slob or slub)
struct list_head slab_list
struct page *next
struct kmem_cache *slab_cache
void *freelist (first free object)
s_mem: first object - SLAB
int pages
int pobjects
counters - SLUB
inuse: 1
objects:15 = 64
frozen: 1 = 0
union
struct
(SLUB)
union
struct
(Partial
Page)
kmem_cache
(kmem_cache_node)
__percpu *cpu_slab
min_partial = 5
size = 64
object_size = 64
cpu_partial = 30
min = 64
inuse = 64
align = 64
name =
“kmem_cache_node”
list
*node[MAX_NUMNODES]
max = 64
oo (order objects) = 64
slab
object
slab
object
..
slab
object
Page
offset = 32
kmem_cache_node
nr_partial = 1
struct list_head partial
nr_slabs = 1
total_objects = 64
list_lock
struct list_head full
Get a slab object from page->freelist
kmem_cache_node->node[node] = n
list_head
CONFIG_SLUB_DEBUG = y
early_kmem_cache_node_alloc(): configure node #0
26. page struct (slab, slob or slub)
struct list_head slab_list
struct page *next
struct kmem_cache *slab_cache
void *freelist (first free object)
s_mem: first object - SLAB
int pages
int pobjects
counters - SLUB
inuse: 16 = 1
objects:15 = 64
frozen: 1 = 0
union
struct
(SLUB)
union
struct
(Percpu
Partial
Page)
kmem_cache
(global variable: kmem_cache_node)
__percpu *cpu_slab = NULL
min_partial = 5
size = 64
object_size = 64
cpu_partial = 30
min = 64
inuse = 64
align = 64
name =
“kmem_cache_node”
list
*node[MAX_NUMNODES]
max = 64
oo (order objects) = 64
slab
object
slab
object
..
slab
object
Page
offset = 32
kmem_cache_node
nr_partial = 1
struct list_head partial
nr_slabs = 1
total_objects = 64
list_lock
struct list_head full
Get a slab object from page->freelist
kmem_cache_node->node[node] = n
list_head
CONFIG_SLUB_DEBUG = y
static local variable
boot_kmem_cache_node
early_kmem_cache_node_alloc(): configure node #0
27. page struct (slab, slob or slub)
struct list_head slab_list
struct page *next
struct kmem_cache *slab_cache
void *freelist (first free object)
s_mem: first object - SLAB
int pages
int pobjects
counters - SLUB
inuse: 16 = 1
objects:15 = 64
frozen: 1 = 0
union
struct
(SLUB)
union
struct
(Percpu
Partial
Page)
kmem_cache
(global variable: kmem_cache_node)
__percpu *cpu_slab = NULL
min_partial = 5
size = 64
object_size = 64
cpu_partial = 30
min = 64
inuse = 64
align = 64
name =
“kmem_cache_node”
list
*node[MAX_NUMNODES]
max = 64
oo (order objects) = 64
slab
object
slab
object
..
slab
object
Page
offset = 32
kmem_cache_node
nr_partial = 1
struct list_head partial
nr_slabs = 1
total_objects = 64
list_lock
struct list_head full
Get a slab object from page->freelist
kmem_cache_node->node[node] = n
list_head
CONFIG_SLUB_DEBUG = y
kmem_cache_node
nr_partial = 1
struct list_head partial
nr_slabs = 1
total_objects = 64
list_lock
struct list_head full
.
.
static local variable
boot_kmem_cache_node
early_kmem_cache_node_alloc(): Fully-configured
28. page struct (slab, slob or slub)
struct list_head slab_list
struct page *next
struct kmem_cache *slab_cache
void *freelist (first free object)
s_mem: first object - SLAB
int pages
int pobjects
counters - SLUB
inuse: 16 = 1
objects:15 = 64
frozen: 1 = 0
union
struct
(SLUB)
union
struct
(Percpu
Partial
Page)
kmem_cache
(global variable: kmem_cache_node)
__percpu *cpu_slab = NULL
min_partial = 5
size = 64
object_size = 64
cpu_partial = 30
min = 64
inuse = 64
align = 64
name =
“kmem_cache_node”
list
*node[MAX_NUMNODES]
max = 64
oo (order objects) = 64
slab
object
slab
object
..
slab
object
Page
offset = 32
kmem_cache_node
nr_partial = 1
struct list_head partial
nr_slabs = 1
total_objects = 64
list_lock
struct list_head full
Get a slab object from page->freelist
kmem_cache_node->node[node] = n
list_head
kmem_cache_node
nr_partial = 1
struct list_head partial
nr_slabs = 1
total_objects = 64
list_lock
struct list_head full
.
.
static local variable
boot_kmem_cache_node
early_kmem_cache_node_alloc(): Fully-configured
29. [Comparison] Generic kmem_cache_create() & low-level implementation
__kmem_cache_create()
start_kernel
__kmem_cache_create init_kmem_cache_nodes
calculate_sizes
kmem_cache_open
alloc_kmem_cache_cpus
mm_init kmem_cache_init
create_boot_cache(kmem_cache_node, …)
create_boot_cache(kmem_cache, …)
__alloc_percpu
init_kmem_cache_cpus
Assign cpu id to
kmem_cache_cpu->tid
Allocate/init ‘kmem_cache_node’
structs for all NUMA nodes (sockets)
Allocate/init a ‘kmem_cache_cpu’ struct
Calculate page order and objects per slab
kmem_cache_create
kmem_cache_create_usercopy
__kmem_cache_alias
create_cache
find_mergeable
kmem_cache_zalloc slab_alloc_node
slab_alloc
kmem_cache_alloc
__kmem_cache_create
list_add(&s->list, &slab_caches)
Allocate a ‘kmem_cache’ struct from
a global variable ‘kmem_cache’
30. [Cache allocation] slab_alloc_node –
Allocate a slab cache with specific node id
• Fast path & slow path
o Will follow kmem_cache_init() flow
31. kmem_cache_init()
start_kernel
1. Declare two static variables ‘boot_kmem_cache’ and ‘boot_kmem_cache_node’
2. Assign the addresses of two static variables to generic/global variables
kmem_cache_node = &boot_kmem_cache_node
kmem_cache = &boot_kmem_cache
__kmem_cache_create init_kmem_cache_nodes
calculate_sizes
kmem_cache_open
alloc_kmem_cache_cpus
mm_init kmem_cache_init
create_boot_cache(kmem_cache_node, …)
create_boot_cache(kmem_cache, …)
kmem_cache =
bootstrap(&boot_kmem_cache)
kmem_cache_node =
bootstrap(&boot_kmem_cache_node)
setup_kmalloc_cache_index_table
create_kmalloc_caches
__alloc_percpu
init_kmem_cache_cpus
Assign cpu id to
kmem_cache_cpu->tid
Allocate/init ‘kmem_cache_node’
structs for all NUMA nodes (sockets)
Allocate/init a ‘kmem_cache_cpu’ struct
32. kmem_cache
(global variable: kmem_cache)
__percpu *cpu_slab
min_partial = 5
size = 256
object_size = 216
cpu_partial = 13
min = 0x10
inuse = 216
align = 64
name = “kmem_cache”
list
*node[MAX_NUMNODES]
max = 0x10020
oo (order objects) = 0x10020
kmem_cache: init_kmem_cache_nodes()
offset = 112
kmem_cache_node
nr_partial = 0
struct list_head partial
nr_slabs = 0
total_objects = 0
list_lock
struct list_head full
kmem_cache_node->node[node] = n
CONFIG_SLUB_DEBUG = y
kmem_cache_node
nr_partial = 0
struct list_head partial
nr_slabs = 0
total_objects = 0
list_lock
struct list_head full
.
.
kmem_cache_cpu
void **freelist
tid (transaction id) = cpu id
struct page *page
struct page *partial
stat[NR_SLUB_STAT_ITEMS]
boot_kmem_cache
static variable
Allocate from global variable ‘kmem_cache_node’
* init_kmem_cache_nodes -> kmem_cache_alloc_node
n = kmem_cache_alloc_node()
early_kmem_cache_node_alloc
init_kmem_cache_nodes
for_each_node_state()
slab_state == DOWN
init_kmem_cache_node
kmem_cache->node[node] = n
Y
N
Special (one-shot) path:
Fix ‘chicken or the egg’ problem
Generic path
33. Get a slab object from c->freelist
kmem_cache_alloc_node
!c->freelist or
!c->page
kmem_cache_alloc_node/slab_alloc_node
__slab_alloc
kmem_cache_alloc slab_alloc slab_alloc_node
NUMA_NO_NODE
__kmalloc_node
kmalloc_node node_id
node_id
___slab_alloc
new_slab_objects
c->page = slub_percpu_partial(c)
Get a slab object from page->freelist
c->page &&
!c->freelist
!c->page && c->partial
get_partial
Get partial page from percpu cache
new_slab allocate_slab
alloc_slab_page
Allocate pages from
Buddy system
!c->page && !c->partial
Get partial page from node cache
(kmem_cache_node)
Y
N
fast path
slow path get_freelist -> __cmpxchg_double_slab
this_cpu_cmpxchg_double()
35. Get a slab object from c->freelist
kmem_cache_alloc_node
!c->freelist or
!c->page
slab_alloc_node(): slowpath #1 – Get an object from node’s partial page
__slab_alloc
kmem_cache_alloc slab_alloc slab_alloc_node
NUMA_NO_NODE
__kmalloc_node
kmalloc_node node_id
node_id
___slab_alloc
new_slab_objects
c->page = slub_percpu_partial(c)
c->page &&
!c->freelist
!c->page && c->partial
get_partial
Get partial page from percpu cache
new_slab allocate_slab
alloc_slab_page
Allocate pages from
Buddy system
!c->page && !c->partial
Get partial page from node cache
(kmem_cache_node)
Y
N
fast path
slow path get_freelist -> __cmpxchg_double_slab
this_cpu_cmpxchg_double()
Get a slab object from page->freelist
36. kmem_cache
(global variable: kmem_cache_node)
__percpu *cpu_slab
min_partial = 5
size = 64
object_size = 64
cpu_partial = 30
min = 64
inuse = 64
align = 64
name =
“kmem_cache_node”
list
*node[MAX_NUMNODES]
max = 64
oo (order objects) = 64
offset = 32
static local variable
boot_kmem_cache_node
kmem_cache_cpu
(CPU #N)
struct page *page = NULL
void **freelist = NULL
struct page *partial = NULL
Get a page (slab) from kmem_cache_node
kmem_cache_node (node 0)
nr_partial = 1
struct list_head partial
nr_slabs = 1
total_objects = 64
list_lock
struct list_head full
kmem_cache_node (node 1)
nr_partial = 1
struct list_head partial
nr_slabs = 1
total_objects = 64
list_lock
struct list_head full
*node[MAX_NUMNODES]
2
3
Get a page (slab) from ‘partial’ member
4
Move node’s partial page
(slab) to percpu cache
page struct (slab, slob or slub): node #0
struct page *next
struct kmem_cache *slab_cache
void *freelist (first free object)
s_mem: first object - SLAB
int pages
int pobjects
counters - SLUB
inuse: 16 = 1
objects:15 = 64
frozen: 1 = 0
union
struct
(SLUB)
union
struct
(Partial
Page)
list_head
struct list_head slab_list
slab
object
slab
object
..
slab
object
Page
slab
object
page struct (slab, slob or slub): node #1
Allocate a slab object
from node #0
1
kmem_cache
(kmem_cache)
*node[]
boot_kmem_cache
static local variable
Allocate slab objects
for node[0] & node[1]
Slowpath #1 – Get an object from node’s partial page
40. kmem_cache
(global variable: kmem_cache_node)
__percpu *cpu_slab
min_partial = 5
size = 64
object_size = 64
cpu_partial = 30
min = 64
inuse = 64
align = 64
name =
“kmem_cache_node”
list
*node[MAX_NUMNODES]
max = 64
oo (order objects) = 64
offset = 32
static local variable
boot_kmem_cache_node
kmem_cache_cpu
(CPU #N)
struct page *page
void **freelist
struct page *partial = NULL
kmem_cache_node (node 0)
nr_partial = 0
struct list_head partial
nr_slabs = 1
total_objects = 64
list_lock
struct list_head full
kmem_cache_node (node 1)
nr_partial = 1
struct list_head partial
nr_slabs = 1
total_objects = 64
list_lock
struct list_head full
*node[MAX_NUMNODES]
page struct (slab, slob or slub): node #1
Allocate a slab object
from node #0
kmem_cache
(kmem_cache)
*node[]
boot_kmem_cache
static local variable
Return this
object address
page struct (slab, slob or slub): node #0
struct page *next
struct kmem_cache *slab_cache
void *freelist (first free object) = NULL
s_mem: first object - SLAB
int pages
int pobjects
counters - SLUB
inuse: 16 = 64
objects:15 = 64
frozen: 1 = 1
union
struct
(SLUB)
union
struct
(Partial
Page)
struct list_head slab_list = invalid
slab
object
slab
object
..
slab
object
Page
slab
object
Allocate slab objects
for node[0] & node[1]
• node[0]: allocated
Slowpath #1 – Get an object from node’s partial page
42. kmem_cache
(global variable: kmem_cache_node)
__percpu *cpu_slab
min_partial = 5
size = 64
object_size = 64
cpu_partial = 30
min = 64
inuse = 64
align = 64
name =
“kmem_cache_node”
list
*node[MAX_NUMNODES]
max = 64
oo (order objects) = 64
offset = 32
static local variable
boot_kmem_cache_node
kmem_cache_cpu
(CPU #N)
struct page *page
void **freelist
struct page *partial = NULL
kmem_cache_node (node 0)
nr_partial = 0
struct list_head partial
nr_slabs = 1
total_objects = 64
list_lock
struct list_head full
kmem_cache_node (node 1)
nr_partial = 1
struct list_head partial
nr_slabs = 1
total_objects = 64
list_lock
struct list_head full
*node[MAX_NUMNODES]
page struct (slab, slob or slub): node #1
Allocate a slab object
from node #1
kmem_cache
(kmem_cache)
*node[]
boot_kmem_cache
static local variable
Allocate slab objects
for node[0] & node[1]
• node[0]: allocated
page struct (slab, slob or slub): node #0
struct page *next
struct kmem_cache *slab_cache
void *freelist (first free object) = NULL
s_mem: first object - SLAB
int pages
int pobjects
counters - SLUB
inuse: 16 = 64
objects:15 = 64
frozen: 1 = 1
union
struct
(SLUB)
union
struct
(Partial
Page)
struct list_head slab_list = invalid
slab
object
slab
object
..
slab
object
Page
slab
object
1
2 percpu page’s node does not match
3
deactivate_slab(): Remove percpu slab
Slowpath #1 – Get an object from node’s partial page
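Step 2 above ("percpu page's node does not match") is a simple check: when the caller asks for a specific NUMA node and the percpu slab's page lives on another node, ___slab_alloc() deactivates the percpu slab before allocating. A minimal sketch of that node check (constant value assumed to mirror the kernel's NUMA_NO_NODE = -1):

```c
#include <assert.h>

#define NUMA_NO_NODE (-1)

/* Sketch of node_match(): a node-specific request can only be served
 * from the current percpu page if that page lives on the same node;
 * otherwise deactivate_slab() removes the percpu slab first. */
static int node_match(int page_node, int requested_node)
{
    return requested_node == NUMA_NO_NODE || page_node == requested_node;
}
```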
48. kmem_cache_init()
start_kernel
1. Declare two static variables ‘boot_kmem_cache’ and ‘boot_kmem_cache_node’
2. Assign the addresses of two static variables to generic/global variables
kmem_cache_node = &boot_kmem_cache_node
kmem_cache = &boot_kmem_cache
__kmem_cache_create init_kmem_cache_nodes
calculate_sizes
kmem_cache_open
alloc_kmem_cache_cpus
mm_init kmem_cache_init
create_boot_cache(kmem_cache_node, …)
create_boot_cache(kmem_cache, …)
kmem_cache =
bootstrap(&boot_kmem_cache)
kmem_cache_node =
bootstrap(&boot_kmem_cache_node)
setup_kmalloc_cache_index_table
create_kmalloc_caches
__alloc_percpu
init_kmem_cache_cpus
Assign cpu id to
kmem_cache_cpu->tid
Allocate/init ‘kmem_cache_node’
structs for all CPU nodes (sockets)
Allocate/init a ‘kmem_cache_cpu’ struct
Let’s check data structures about kmem_cache & kmem_cache_node after
function call “create_boot_cache”
52. bootstrap
s = kmem_cache_zalloc(kmem_cache, …)
Allocate a kmem_cache struct from global variable ‘kmem_cache’: Generic path
__flush_cpu_slab
list_add(&s->list, &slab_caches)
for_each_kmem_cache_node(s, node, n)
kmem_cache = bootstrap(&boot_kmem_cache)
memcpy(s, static_cache, …)
return s
list_for_each_entry(p, &n->partial, slab_list)
struct page *p;
p->slab_cache = s
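bootstrap() above copies the static boot descriptor into a heap object allocated from the now-working allocator, then walks each node's partial list to re-point every page's slab_cache at the new copy. A userspace sketch with stand-in structs (malloc stands in for kmem_cache_zalloc, and only one partial list is walked):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct kmem_cache_s;
struct page_s { struct kmem_cache_s *slab_cache; struct page_s *next; };
struct kmem_cache_s { const char *name; struct page_s *partial; };

/* Sketch of bootstrap(): move a statically declared cache descriptor
 * into a dynamically allocated one, then fix the back-pointers on the
 * partial pages (the "p->slab_cache = s" step in the diagram). */
static struct kmem_cache_s *bootstrap_copy(struct kmem_cache_s *static_cache)
{
    struct kmem_cache_s *s = malloc(sizeof(*s)); /* kmem_cache_zalloc() */
    memcpy(s, static_cache, sizeof(*s));
    for (struct page_s *p = s->partial; p; p = p->next)
        p->slab_cache = s;
    return s;
}
```

After this returns, the global kmem_cache points at the heap copy and the static boot_kmem_cache is no longer referenced.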
53. Get a slab object from c->freelist
kmem_cache_alloc_node
!c->freelist or
!c->page
slab_alloc_node(): slowpath #2 – Allocate a page from Buddy system
__slab_alloc
kmem_cache_alloc slab_alloc slab_alloc_node
NUMA_NO_NODE
__kmalloc_node
kmalloc_node node_id
node_id
___slab_alloc
new_slab_objects
c->page = slub_percpu_partial(c)
c->page &&
!c->freelist
!c->page && c->partial
get_partial
Get partial page from percpu cache
new_slab allocate_slab
alloc_slab_page
Allocate pages from
Buddy system
!c->page && !c->partial
Get partial page from node cache
(kmem_cache_node)
Y
N
fast path
slow path get_freelist -> __cmpxchg_double_slab
this_cpu_cmpxchg_double()
bootstrap
s = kmem_cache_zalloc(kmem_cache, …)
Allocate a kmem_cache struct from global variable ‘kmem_cache’
Get a slab object from page->freelist
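In slowpath #2, allocate_slab() gets fresh pages from the Buddy system and strings every object into the initial freelist, storing each object's next-free pointer at kmem_cache->offset. A hedged sketch of that initialization (ignoring the optional freelist randomization):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Carve `objects` objects of `size` bytes out of fresh page memory and
 * link them into a freelist via a next-pointer stored at `offset`
 * inside each object (sketch of allocate_slab()'s setup loop). */
static void *init_freelist(char *mem, size_t size, size_t offset,
                           unsigned objects)
{
    for (unsigned i = 0; i + 1 < objects; i++) {
        void *next = mem + (i + 1) * size;
        memcpy(mem + i * size + offset, &next, sizeof(next));
    }
    void *nil = NULL;
    memcpy(mem + (objects - 1) * size + offset, &nil, sizeof(nil));
    return mem;                 /* page->freelist = first object */
}
```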
58. kmem_cache
(global variable: kmem_cache)
__percpu *cpu_slab
min_partial = 5
size = 256
object_size = 216
cpu_partial = 13
min = 0x10
inuse = 216
align = 64
name = “kmem_cache”
list
*node[MAX_NUMNODES]
max = 0x10020
oo (order objects) = 0x10020
offset = 112
kmem_cache_cpu
(CPU #N)
struct page *page = NULL
void **freelist = NULL
struct page *partial = NULL
kmem_cache_node (node 0)
nr_partial = 1
struct list_head partial
nr_slabs = 1
total_objects = 32
list_lock
struct list_head full
kmem_cache_node (node 1)
nr_partial = 0
struct list_head partial
nr_slabs = 0
total_objects = 0
list_lock
struct list_head full
*node[MAX_NUMNODES]
oo (order objects) bit layout: bits [31:16] = page order, bits [15:0] = objects per slab
page struct (slab, slob or slub): order = 1
struct page *next
struct kmem_cache *slab_cache
void *freelist (first free object)
s_mem: first object - SLAB
int pages
int pobjects
counters - SLUB
inuse: 16 = 1
objects:15 = 32
frozen: 1 = 0
union
struct
(SLUB)
union
struct
(Partial
Page)
struct list_head slab_list
slab
object
slab
object
..
slab
object
Slab Page(s)
slab
object
free pointers: linked in reverse order!
kmem_cache = bootstrap(&boot_kmem_cache): after function returns
static local variable
boot_kmem_cache
kmem_cache
global variable
X
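The oo field packs the page order and the object count into one word: the high bits hold the order and the low 16 bits the objects per slab (OO_SHIFT = 16 in the kernel). So oo = 0x10020 in this slide means order 1 with 0x20 = 32 objects, and min = 0x10 means order 0 with 16 objects:

```c
#include <assert.h>

/* Sketch of the kernel's oo_order()/oo_objects() helpers for
 * struct kmem_cache_order_objects (OO_SHIFT = 16). */
#define OO_SHIFT 16
#define OO_MASK  ((1u << OO_SHIFT) - 1)

static unsigned oo_order(unsigned x)   { return x >> OO_SHIFT; }
static unsigned oo_objects(unsigned x) { return x & OO_MASK; }
```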
61. Get a slab object from c->freelist
kmem_cache_alloc_node
!c->freelist or
!c->page
slab_alloc_node(): slowpath #1 – Get an object from node’s partial page
__slab_alloc
kmem_cache_alloc slab_alloc slab_alloc_node
NUMA_NO_NODE
__kmalloc_node
kmalloc_node node_id
node_id
___slab_alloc
new_slab_objects
c->page = slub_percpu_partial(c)
c->page &&
!c->freelist
!c->page && c->partial
get_partial
Get partial page from percpu cache
new_slab allocate_slab
alloc_slab_page
Allocate pages from
Buddy system
!c->page && !c->partial
Get partial page from node cache
(kmem_cache_node)
Y
N
fast path
slow path get_freelist -> __cmpxchg_double_slab
this_cpu_cmpxchg_double()
bootstrap
s = kmem_cache_zalloc(kmem_cache, …)
Allocate a kmem_cache struct from global variable ‘kmem_cache’
kmem_cache_node = bootstrap(&boot_kmem_cache_node)
Get a slab object from page->freelist
69. Get a slab object from c->freelist
kmem_cache_alloc_node
!c->freelist or
!c->page
slab_alloc_node(): slowpath #3 – Get a partial page from percpu cache
__slab_alloc
kmem_cache_alloc slab_alloc slab_alloc_node
NUMA_NO_NODE
__kmalloc_node
kmalloc_node node_id
node_id
___slab_alloc
new_slab_objects
c->page = slub_percpu_partial(c)
c->page &&
!c->freelist
!c->page && c->partial
get_partial
Get partial page from percpu cache
new_slab allocate_slab
alloc_slab_page
Allocate pages from
Buddy system
!c->page && !c->partial
Get partial page from node cache
(kmem_cache_node)
Y
N
fast path
slow path get_freelist -> __cmpxchg_double_slab
this_cpu_cmpxchg_double()
Get a slab object from page->freelist
Let’s talk about another slow path
70. kmem_cache
__percpu *cpu_slab
*node[MAX_NUMNODES]
kmem_cache_cpu
(CPU #N)
struct page *page = NULL or valid
void **freelist = NULL or valid
struct page *partial
kmem_cache_node
(node 0)
struct list_head partial
kmem_cache_node
(node 1)
struct list_head partial
*node[MAX_NUMNODES]
slowpath #3: Get a partial page from percpu cache
slab
object
slab
object
..
slab
object
Page
page->next
Slowpath #3: Get a partial page from percpu cache
slab
object
slab
object
..
slab
object
Page
page->freelist page->freelist
1
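Step 1 above detaches the first page from c->partial and makes it the active percpu slab; its freelist migrates into c->freelist and the page is marked frozen. A non-atomic sketch of that hand-off (the real code does this with cmpxchg loops):

```c
#include <assert.h>
#include <stddef.h>

struct page_s { void **freelist; struct page_s *next; int frozen; };
struct kmem_cache_cpu_s {
    struct page_s *page;
    void **freelist;
    struct page_s *partial;
};

/* Sketch of slowpath #3 in ___slab_alloc(): pop the head of the percpu
 * partial list (slub_percpu_partial) and freeze it as the active slab;
 * the page's freelist moves to the percpu cache. */
static void take_percpu_partial(struct kmem_cache_cpu_s *c)
{
    struct page_s *page = c->partial;
    c->partial  = page->next;       /* unlink from percpu partial list */
    c->page     = page;             /* becomes the active slab */
    c->freelist = page->freelist;   /* freelist now owned by the CPU */
    page->freelist = NULL;
    page->frozen = 1;
}
```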
71. kmem_cache
__percpu *cpu_slab
*node[MAX_NUMNODES]
kmem_cache_cpu
(CPU #N)
struct page *page = NULL
void **freelist = NULL
struct page *partial = NULL
kmem_cache_node
(node 0)
struct list_head partial
kmem_cache_node
(node 1)
struct list_head partial
*node[MAX_NUMNODES]
slab
object
slab
object
..
slab
object
Page
slab
object
slab
object
..
slab
object
Page
slab
object
slab
object
..
slab
object
Page
Slowpath #1: Get a slab object from node’s partial list (new_slab_objects -> get_partial -> get_partial_node -> put_cpu_partial)
• [Example scenario] Acquire 3 pages, a count bounded by kmem_cache->cpu_partial
1
When to add a page to percpu partial page? – Case 1 (1/2)
put_cpu_partial(): Put page(s) into a percpu partial page
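put_cpu_partial() chains pages through page->next, so the newest page ends up at the head of c->partial, as in step 1 above. A sketch of the push (this omits the pobjects/cpu_partial overflow handling that unfreezes the list back to the node):

```c
#include <assert.h>
#include <stddef.h>

struct page_s { struct page_s *next; };
struct kmem_cache_cpu_s { struct page_s *partial; };

/* Sketch of put_cpu_partial(): push a partial page onto the percpu
 * partial list via page->next; newest page becomes the head. */
static void put_cpu_partial_sketch(struct kmem_cache_cpu_s *c,
                                   struct page_s *page)
{
    page->next = c->partial;
    c->partial = page;
}
```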
72. kmem_cache
__percpu *cpu_slab
*node[MAX_NUMNODES]
kmem_cache_cpu
(CPU #N)
struct page *page
void **freelist
struct page *partial
kmem_cache_node
(node 0)
struct list_head partial
kmem_cache_node
(node 1)
struct list_head partial
*node[MAX_NUMNODES]
When to add a page to percpu partial page? – Case 1 (2/2)
slab
object
slab
object
..
slab
object
Page
slab
object
slab
object
..
slab
object
Page
slab
object
slab
object
..
slab
object
Page
slab
object
slab
object
..
slab
object
Page
page->next
slab
object
slab
object
..
slab
object
Page
1
slab
object
slab
object
..
slab
object
Page
page->freelist page->freelist
Slowpath #1: Get a slab object from node’s partial list (new_slab_objects -> get_partial -> get_partial_node -> put_cpu_partial)
• [Example scenario] Acquire 3 pages, a count bounded by kmem_cache->cpu_partial
put_cpu_partial(): Put page(s) into a percpu partial page
74. kmem_cache
__percpu *cpu_slab
*node[MAX_NUMNODES]
kmem_cache_cpu
(CPU #N)
struct page *page
void **freelist
struct page *partial
kmem_cache_node
(node 0)
struct list_head partial
kmem_cache_node
(node 1)
struct list_head partial
*node[MAX_NUMNODES]
free this object
slab
object
slab
object
..
slab
object
Page
page->next
put_cpu_partial(): Move this
page to percpu partial
2
slab
object
slab
object
..
slab
object
Page
slab
object
slab
object
..
slab
object
Page
slab
object
slab
object
..
slab
object
Page
..
slab
object
slab
object
..
slab
object
Page
1
Pages whose objects are fully allocated
When to add a page to percpu partial page? – Case 2: slab_free()
slab_free(): Move page(s) to percpu partial and node partial (if possible)
* Detail will be discussed in section “[Cache release/free] kmem_cache_free()”
75. Get a slab object from c->freelist
kmem_cache_alloc_node
!c->freelist or
!c->page
slab_alloc_node(): slowpath #4 – Get a slab object from page->freelist
__slab_alloc
kmem_cache_alloc slab_alloc slab_alloc_node
NUMA_NO_NODE
__kmalloc_node
kmalloc_node node_id
node_id
___slab_alloc
new_slab_objects
c->page = slub_percpu_partial(c)
c->page &&
!c->freelist
!c->page && c->partial
get_partial
Get partial page from percpu cache
new_slab allocate_slab
alloc_slab_page
Allocate pages from
Buddy system
!c->page && !c->partial
Get partial page from node cache
(kmem_cache_node)
Y
N
fast path
slow path get_freelist -> __cmpxchg_double_slab
this_cpu_cmpxchg_double()
Get a slab object from page->freelist
82. Return the slab object to c->freelist
kmem_cache_free
page == c->page?
kmem_cache_free()
__slab_free cmpxchg_double_slab
N
fast path: percpu cache
slow path: node cache
this_cpu_cmpxchg_double()
slab_free do_slab_free
put_cpu_partial
add_partial
remove_partial
discard_slab __free_slab __free_pages
Return the slab object
to page->freelist
Slowpath #3: All slab objects in the page are free: return the page to the Buddy system
Y
Slowpath #1: Move page(s) to percpu partial or node partial (if possible)
Slowpath #2: Move pages to node partial
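The free-side fast path applies only when the freed object belongs to the current percpu active slab (page == c->page); it then pushes the object back onto c->freelist. A sketch of that decision (the real do_slab_free() does the push with this_cpu_cmpxchg_double(); FP_OFFSET is the next-free pointer offset from the earlier diagrams):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FP_OFFSET 32   /* next-free pointer offset, as in the diagrams */

struct page_s { int dummy; };
struct kmem_cache_cpu_s { struct page_s *page; void *freelist; };

/* Sketch of do_slab_free(): returns 1 if the fast path handled the
 * free (object pushed onto the percpu freelist), 0 if the slow path
 * (__slab_free on the object's page) must run instead. */
static int slab_free_sketch(struct kmem_cache_cpu_s *c,
                            struct page_s *page, void *object)
{
    if (page != c->page)
        return 0;                         /* slow path: __slab_free */
    memcpy((char *)object + FP_OFFSET, &c->freelist, sizeof(void *));
    c->freelist = object;                 /* fast path done */
    return 1;
}
```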
83. kmem_cache
__percpu *cpu_slab
*node[MAX_NUMNODES]
kmem_cache_cpu
(CPU #N)
struct page *page
void **freelist
struct page *partial
kmem_cache_node
(node 0)
struct list_head partial
kmem_cache_node
(node 1)
struct list_head partial
*node[MAX_NUMNODES]
slowpath #1:put_cpu_partial() – Case 1: Move page(s) to percpu partial
slab
object
slab
object
..
slab
object
Page
slab
object
slab
object
..
slab
object
Page
slab
object
slab
object
..
slab
object
Page
..
slab
object
slab
object
..
slab
object
Page
Pages whose objects are fully allocated
free this object
slab
object
slab
object
..
slab
object
Page
page->next
put_cpu_partial(): Move this
page to percpu partial
1
2
__slab_free cmpxchg_double_slab
slow path: node cache
put_cpu_partial
add_partial
remove_partial
discard_slab __free_slab __free_pages
Return the slab object
to page->freelist
Slowpath #3: All slab objects in the page are free: return the page to the Buddy system
Slowpath #1: Move page(s) to percpu partial and node partial (if possible)
Slowpath #2: Move pages to node partial