SMP Implementation
 for OpenBSD/sgi
  Takuya ASADA<syuu@openbsd.org>
Introduction
•   I was working to add SMP & 64bit support to
    a BSD-based embedded OS at
    The target device was MIPS64
•   There was no complete *BSD/MIPS SMP
    implementation at that time, so I implemented it
•   The implementation was proprietary, but I wanted
    to contribute something to the BSD community
•   I decided to implement SMP from scratch and tried to
    find a suitable MIPS SMP machine
Finding MIPS/SMP
       Machine(1)
• Broadcom SiByte BCM1250 looks nice -
  2core 1GHz MIPS64, DDR DRAM, GbE,
  PCI, HT
• Cisco 3845 integrated this processor
• How much is it?
        from $14,200!
     Totally unacceptable!!!
Finding MIPS/SMP
       Machine(2)
• Some antique SGI machines are available on
  eBay
• These are basically very cheap
• I realized SGI Octane 2 has 2 cores, already
  supported by OpenBSD
           Just $33!
Do you remember how much it was?
This is my Octane2
Processors   MIPS R12000 400MHz x2
Memory       1GB SDRAM
Graphics     3D Graphics Card
Sound        Integrated Digital Audio
Storage      35GB SCSI HDD
Ethernet     100BASE-T
Become an OpenBSD
    developer
• I started working on the Octane2 in Apr
  2009 and wrote about it on my blog
• Miod Vallat discovered it and suggested that my
  code be merged into the OpenBSD main tree
• I became an OpenBSD developer in Sep
  2009, started merging the code, joined a
  hackathon, and worked with Miod and Joel Sing
NOW IT WORKS!!!!

• Merged into OpenBSD main tree
• You can try it now!
• It seems stable -
  I ran “make build” again and again over a
  day, and it kept working
Demo
Performance -
 kernel build benchmark
[Bar chart: kernel build time in seconds at -j1, -j2 and -j3,
 GENERIC-IP30 vs. GENERIC-IP30.MP.
 Annotations: 4% overhead; 46% speed up]
Performance -
 OpenBSD/amd64 for comparison
[Bar chart: build time in seconds at -j1, -j2 and -j3,
 GENERIC vs. GENERIC.MP.
 Annotations: 2% overhead; 44% speed up]
What did we need to
    do for the SMP work
There was a lot of work...
•   Support multiple cpu_info and processor related macros
•   Move per-processor data into cpu_info
•   Implement lock primitives
•   Acquiring giant lock
•   Implement atomic operations
•   Spin up secondary processors
•   Secondary processor entry point
•   IPI: Inter-Processor Interrupt
•   Per-processor ASID management
•   TLB shootdown
•   Lazy FPU handling
•   Per-processor clock
Describe it more
        simply
We can classify the tasks into three kinds of
problems:
• Restructuring per-processor information
• Implementing and using lock/atomic primitives
• Hardware related problems
Restructuring per-processor
information
•   In the original sgi port, some processor-related
    information was allocated only once
•   Simple case:
    Information is stored in a global variable
    Move it into “cpu_info”, the per-processor structure
•   Complex case:
    In pmap, we need to maintain some information
    per-processor * per-process
Simple case -
       clock.c: original code
// defined as global variables
u_int32_t cpu_counter_last;
u_int32_t cpu_counter_interval;
u_int32_t pendingticks;

uint32_t clock_int5(uint32_t mask, struct trap_frame *tf)
...
       clkdiff = cp0_get_count() - cpu_counter_last;
       while (clkdiff >= cpu_counter_interval) {
               cpu_counter_last += cpu_counter_interval;
               clkdiff = cp0_get_count() - cpu_counter_last;
               pendingticks++;
Simple case -
      clock.c: modified code

uint32_t clock_int5(uint32_t mask, struct trap_frame *tf)
...
       struct cpu_info *ci = curcpu();
       clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
       while (clkdiff >= ci->ci_cpu_counter_interval) {
               ci->ci_cpu_counter_last +=
                   ci->ci_cpu_counter_interval;
               clkdiff =
                  cp0_get_count() - ci->ci_cpu_counter_last;
               ci->ci_pendingticks++;
Complex case: pmap
•   MIPS TLB entries are tagged with an 8bit process id called ASID,
    used to improve performance
    The MMU skips other processes' TLB entries on lookup
    We don't need to flush the TLB on every context switch
•   We need to maintain the process:ASID assignment
    Because the ASID space is smaller than the PID space, we need to rotate it
•   The information must be kept across context switches
•   We maintain ASIDs individually per-processor

•   What we need is:
    ASID information per-processor * per-process
Complex case - pmap.c:
     original and modified code
/* original */
uint pmap_alloc_tlbpid(struct proc *p)
...
               tlbpid_cnt = id + 1;
               pmap->pm_tlbpid = id;

/* modified */
uint pmap_alloc_tlbpid(struct proc *p)
...
               tlbpid_cnt[cpuid] = id + 1;
               pmap->pm_tlbpid[cpuid] = id;
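A minimal userland sketch of the "per-processor * per-process" idea, not the actual pmap.c code. The names pmap_sketch, cpu_asid and asid_for are hypothetical. Reserving ASID 0 and using a generation counter to detect stale ASIDs are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU     2     /* assumption: two processors, as on the Octane2 */
#define NUM_ASID 256   /* 8bit ASID space */

/* ASID of one process on one CPU, plus the generation it was issued in */
struct asid_slot { uint32_t asid; uint32_t gen; };

/* per-process: one slot per CPU (hypothetical layout) */
struct pmap_sketch { struct asid_slot pm_asid[NCPU]; };

/* per-CPU allocator state: next free ASID and current generation */
struct cpu_asid_state { uint32_t next; uint32_t gen; };

static struct cpu_asid_state cpu_asid[NCPU] = { {1, 1}, {1, 1} };

/* Return the ASID of pm on the given CPU, allocating one if needed. */
uint32_t asid_for(struct pmap_sketch *pm, unsigned cpu)
{
    assert(cpu < NCPU);
    struct cpu_asid_state *cs = &cpu_asid[cpu];
    struct asid_slot *slot = &pm->pm_asid[cpu];

    if (slot->gen == cs->gen)
        return slot->asid;          /* still valid on this CPU */

    if (cs->next == NUM_ASID) {     /* ASIDs exhausted: rotate */
        cs->next = 1;               /* ASID 0 is kept reserved here */
        cs->gen++;                  /* a real kernel would also flush
                                       this CPU's TLB at this point */
    }
    slot->asid = cs->next++;
    slot->gen = cs->gen;
    return slot->asid;
}
```

Each CPU hands out ASIDs independently, so the same process can hold different ASIDs on different CPUs, and rotating on one CPU does not disturb the other.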
Implement and use
lock/atomic primitives
•   Needed to implement
    •   lock primitives: mutex, mp_lock
    •   atomic primitives: CAS, 64bit add, etc..
•   Acquiring giant lock prior to entering the kernel
    context
    •   hardware interrupts
    •   software interrupts
    •   trap()
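A minimal sketch of a lock built on compare-and-swap, written with C11 atomics so it runs in userland. It is not the kernel's mutex or mp_lock code; the names spin_sketch, spin_try and spin_release are hypothetical. On MIPS the CAS itself is typically built from an ll/sc loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* owner == 0 means free, otherwise it holds (cpu id + 1) */
struct spin_sketch { atomic_int owner; };

/* One CAS attempt: succeeds only if the lock was free. */
bool spin_try(struct spin_sketch *l, int cpu)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&l->owner, &expected, cpu + 1);
}

/* Spin until the lock is ours. */
void spin_acquire(struct spin_sketch *l, int cpu)
{
    while (!spin_try(l, cpu))
        ;   /* busy-wait */
}

/* Only the owner may release. */
void spin_release(struct spin_sketch *l, int cpu)
{
    assert(atomic_load(&l->owner) == cpu + 1);
    atomic_store(&l->owner, 0);
}
```

Storing the owner instead of a plain 0/1 flag is a design choice of the sketch: it makes "who holds this lock?" answerable while debugging.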
Acquiring giant lock on
    clock interrupt handler
uint32_t clock_int5(uint32_t mask, struct trap_frame *tf)
...
       if (tf->ipl < IPL_CLOCK) {
#ifdef MULTIPROCESSOR
                __mp_lock(&kernel_lock);
#endif
                while (ci->ci_pendingticks) {
                        clk_count.ec_count++;
                        hardclock(tf);
                        ci->ci_pendingticks--;
                }
#ifdef MULTIPROCESSOR
                __mp_unlock(&kernel_lock);
#endif

       Actually, it causes a bug... described later
Hardware related
       problems

• Spin up secondary processor
• Keeping TLB consistency by software
• Cache coherency
Spin up secondary
            processor
•   We need to launch the secondary processor
    •   Access a hardware register on the controller device
        to power on the processor
•   The secondary processor needs its own bootstrap code
    •   It has to be similar to the primary processor’s
    •   But it doesn't need kernel, memory, and device
        initialization
    •   Because the primary processor already did it
Keeping TLB consistency
      by software
• The MIPS TLB has no mechanism to keep itself
  consistent across processors, so we need to do it in software,
  just like on other typical architectures
• Invalidate/update the other processor’s TLB
  using TLB shootdown
 • It is implemented with IPIs
    (Inter-Processor Interrupts)
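A rough sketch of the shootdown handshake, assuming a single shared request record and a bitmask of CPUs that still have to acknowledge. All names (shoot_req, shoot_post, shoot_handle, shoot_done) are hypothetical. The flushed_va array only stands in for the real TLB invalidate, and sending the IPI itself is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NCPU 2   /* assumption: two processors */

/* One outstanding shootdown request (hypothetical layout) */
struct shoot_req {
    atomic_uint pending;   /* bit n set: CPU n has not acknowledged yet */
    uintptr_t   va;        /* page whose TLB entry must be dropped */
};

/* Stand-in for the real TLB invalidate: remember what was flushed */
static uintptr_t flushed_va[NCPU];

/* Initiator: publish the request; a real kernel would now send the IPI. */
void shoot_post(struct shoot_req *r, unsigned targets, uintptr_t va)
{
    assert(targets < (1u << NCPU));
    r->va = va;
    atomic_store(&r->pending, targets);
}

/* Target: body of the IPI handler. Returns true if this CPU had work. */
bool shoot_handle(struct shoot_req *r, unsigned cpu)
{
    assert(cpu < NCPU);
    unsigned bit = 1u << cpu;

    if (!(atomic_load(&r->pending) & bit))
        return false;
    flushed_va[cpu] = r->va;              /* drop the local TLB entry */
    atomic_fetch_and(&r->pending, ~bit);  /* acknowledge */
    return true;
}

/* Initiator: spin on this until every target has acknowledged. */
bool shoot_done(struct shoot_req *r)
{
    return atomic_load(&r->pending) == 0;
}
```

The initiator waits on shoot_done(), so it hangs if a target never runs shoot_handle().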
Cache coherency

• MIPS R10000/R12000 processors have full
  cache coherency
• We don’t have to care about it on Octane
• But some other processors don’t have full
  cache coherency
Ideas for implementing
           SMP
• We faced a number of issues while
  implementing SMP
 • Fighting against deadlock
 • Dynamic memory allocation without
    using virtual addresses
 • Reducing the frequency of TLB shootdown
Fighting against
          deadlock
• It’s hard to find the cause of deadlocks
  because both processors run concurrently
• They are timing bugs -
  the conditions depend on timing
• We need to be able to determine what
  happened on both processors at that time
• There are tools for debugging it
JTAG ICE
• Very useful for debugging
• We can get any data for debugging at any
  desired moment, even after the kernel hangs
• I used it when I was implementing SMP for
  the embedded OS
• Not for Octane, there’s no way to connect it
ddb
•   The OpenBSD kernel has an in-kernel debugger, named ddb
•   We can get data similar to JTAG ICE, but the kernel
    needs to be alive - because ddb is part of the kernel
•   Missing features:
    We hadn’t implemented “machine ddbcpu<#>” -
    which is the processor switching function of ddb
    Without this, we can only debug the one processor
    which invoked ddb on a breakpoint
•   Not always useful
printf()
•   The most popular kernel debugging tool
•   Just write printf(message) in your code, easy to use ;)
•   Unfortunately, it has some problems
    •   printf() wastes a lot of cycles and changes the timing between
        processors
        We may miss a timing bug because of it
    •   Some points in the code are printf() unsafe,
        printing there causes a kernel hang
•   We use it anyway
Divide printf output for
    two serial ports
•   There are two serial ports and two processors
•   If we have lots of debug prints, it’s hard to
    understand which lines belong to which processor
•   I implemented a dirty hack which outputs
    strings directly to the secondary serial port, named
    combprintf()
•   Rewrote the debug prints so that the
    primary processor outputs to the primary serial port and the
    secondary processor outputs to the secondary serial port
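A userland sketch of the routing idea only; it is not combprintf() itself. Two in-memory buffers stand in for the two serial ports, and the names port_printf, port_buf and port_len are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define NPORTS     2     /* one "serial port" per processor */
#define PORT_BUFSZ 256

static char   port_buf[NPORTS][PORT_BUFSZ]; /* stand-ins for the ports */
static size_t port_len[NPORTS];

/* Append a formatted debug message to the port that belongs to cpu. */
int port_printf(unsigned cpu, const char *fmt, ...)
{
    assert(cpu < NPORTS);
    va_list ap;

    va_start(ap, fmt);
    int n = vsnprintf(port_buf[cpu] + port_len[cpu],
        PORT_BUFSZ - port_len[cpu], fmt, ap);
    va_end(ap);

    if (n > 0) {
        /* vsnprintf may truncate; never advance past the buffer */
        size_t room = PORT_BUFSZ - port_len[cpu] - 1;
        port_len[cpu] += (size_t)n < room ? (size_t)n : room;
    }
    return n;
}
```

A caller would pass its own CPU number, for example port_printf(cpu, "enter %s\n", name), so each processor's log stays unmixed.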
What do we need to
 print for debugging?
• To know roughly where the kernel is running
 • put debug prints at every point the
    kernel may run through
 • dump all system calls using
    SYSCALL_DEBUG
• How can we determine a deadlock point?
Determine a deadlock
       point
• Deadlocks occur on spinlocks
• The processor loops forever waiting for a condition
  to become true, but that condition never
  comes
• To know at least which lock primitive we are
  stuck on, we need to stop the endless loop
  by implementing a timeout counter and printing
  a debug message
Adding timeout
        counter into mutex
void mtx_enter(struct mutex *mtx)
...
        for (;;) {
                if (mtx->mtx_wantipl != IPL_NONE)
                        s = splraise(mtx->mtx_wantipl);
                if (try_lock(mtx)) {
                        if (mtx->mtx_wantipl != IPL_NONE)
                                mtx->mtx_oldipl = s;
                        mtx->mtx_owner = curcpu();
                        return;
                }
                if (mtx->mtx_wantipl != IPL_NONE)
                        splx(s);
                if (++i > MTX_TIMEOUT)
                        panic("mtx deadlocked\n");
        }
                 But, this is not enough
Why it’s not enough
     CPU A                              CPU B

     Acquires Lock A                    Acquires Lock B
                                        (we want to know which
                                         function acquired it)

     Spins until Lock B is released     Spins until Lock A is released
     (we can break here)
Remember who
               acquired it
void mtx_enter(struct mutex *mtx)
...
        for (;;) {
                if (mtx->mtx_wantipl != IPL_NONE)
                        s = splraise(mtx->mtx_wantipl);
                if (try_lock(mtx)) {
                        if (mtx->mtx_wantipl != IPL_NONE)
                                mtx->mtx_oldipl = s;
                        mtx->mtx_owner = curcpu();
                        mtx->mtx_ra =
                            __builtin_return_address(0);
                        return;
                }
                if (mtx->mtx_wantipl != IPL_NONE)
                        splx(s);
                if (++i > MTX_TIMEOUT)
                        panic("mtx deadlocked ra:%p\n",
                                mtx->mtx_ra);
        }
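The same two ideas (bounded spinning, remembering the acquirer) as a userland sketch with C11 atomics. It is an illustration, not the kernel mutex: instead of __builtin_return_address(0) the caller passes a label, and the names dbg_lock, dbg_lock_enter, dbg_lock_leave and SPIN_LIMIT are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SPIN_LIMIT 1000000UL   /* arbitrary timeout, in loop iterations */

struct dbg_lock {
    atomic_flag held;
    const char *holder;        /* label of whoever acquired the lock */
};

/*
 * Try to take the lock, giving up after SPIN_LIMIT attempts.
 * Returns NULL on success, or the current holder's label on timeout.
 * The holder field is read without synchronization: it is a debug
 * hint, good enough to print in a "deadlocked" message.
 */
const char *dbg_lock_enter(struct dbg_lock *l, const char *who)
{
    assert(who != NULL);
    for (unsigned long i = 0; i < SPIN_LIMIT; i++) {
        if (!atomic_flag_test_and_set(&l->held)) {
            l->holder = who;   /* remember who acquired it */
            return NULL;
        }
    }
    return l->holder;          /* timed out: report the culprit */
}

void dbg_lock_leave(struct dbg_lock *l)
{
    l->holder = NULL;
    atomic_flag_clear(&l->held);
}
```

On timeout the caller can print both its own label and the returned one, which gives the pair of code paths involved in the deadlock.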
Interrupt blocks IPI
     CPU A                          CPU B
                                    Interrupt
     Fault
                                    Disable interrupt
     Acquire lock
                                    Wait until the lock is released
     TLB shootdown  --- IPI --->    Blocked
     Wait ACK
Interrupt blocks IPI
Solution: Enable IPI interrupt in interrupt handlers
     CPU A                          CPU B
                                    Interrupt
     Fault
                                    Disable interrupt
     Acquire lock                   Re-enable interrupt
                                    Wait until the lock is released
     TLB shootdown  --- IPI --->    Accept interrupt
     Wait ACK       <-- ACK ----    ACK for rendezvous
Fixing clock interrupt
                handler
uint32_t clock_int5(uint32_t mask, struct trap_frame *tf)
...
       if (tf->ipl < IPL_CLOCK) {
#ifdef MULTIPROCESSOR
                u_int32_t sr;

                sr = getsr();
                ENABLEIPI();
                __mp_lock(&kernel_lock);
#endif
                while (ci->ci_pendingticks) {
                        clk_count.ec_count++;
                        hardclock(tf);
                        ci->ci_pendingticks--;
                }
#ifdef MULTIPROCESSOR
                __mp_unlock(&kernel_lock);
                setsr(sr);
#endif
splhigh() blocks IPI
     CPU A                          CPU B
                                    splhigh()
     Fault
     Acquire lock
                                    Wait until the lock is released
     TLB shootdown  --- IPI --->    Blocked
     Wait ACK
splhigh() blocks IPI
Solution: define a new interrupt priority level named IPL_IPI,
higher than IPL_HIGH
     CPU A                          CPU B
                                    splhigh()
     Fault
     Acquire lock
                                    Wait until the lock is released
     TLB shootdown  --- IPI --->    Accept interrupt
     Wait ACK       <-- ACK ----    ACK for rendezvous
Adding new interrupt
            priority level
 /* before */
 #define          IPL_TTY           4         /* terminal */
 #define          IPL_VM            5         /* memory allocation */
 #define          IPL_CLOCK         6         /* clock */
 #define          IPL_HIGH          7         /* everything */
 #define          NIPLS             8         /* Number of levels */

 /* after */
 #define          IPL_TTY           4         /* terminal */
 #define          IPL_VM            5         /* memory allocation */
 #define          IPL_CLOCK         6         /* clock */
 #define          IPL_HIGH          7         /* everything */
 #define          IPL_IPI           8         /* ipi */
 #define          NIPLS             9         /* Number of levels */
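A toy model of why the extra level helps, assuming an interrupt is delivered only when its level is above the current one. The functions splraise_sketch, splx_sketch and intr_deliverable are hypothetical, and a real kernel keeps the current level per-processor and masks the hardware as well.

```c
#include <assert.h>

/* Level values taken from the slide; IPL_NONE = 0 is assumed */
enum { IPL_NONE = 0, IPL_CLOCK = 6, IPL_HIGH = 7, IPL_IPI = 8 };

static int cur_ipl = IPL_NONE;   /* per-processor in a real kernel */

/* Raise the level (never lower it) and return the previous one. */
int splraise_sketch(int ipl)
{
    int old = cur_ipl;

    if (ipl > cur_ipl)
        cur_ipl = ipl;
    return old;
}

/* Restore a level previously returned by splraise_sketch(). */
void splx_sketch(int old)
{
    cur_ipl = old;
}

/* An interrupt gets through only if it is above the current level. */
int intr_deliverable(int intr_ipl)
{
    assert(intr_ipl > IPL_NONE);
    return intr_ipl > cur_ipl;
}
```

With IPL_IPI above IPL_HIGH, a processor sitting at splhigh() still accepts the shootdown IPI, so the initiator's wait for the ACK can finish.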
Dynamic memory allocation
without using virtual address
•   To support N(>2) processors, the cpu_info structure and the
    bootstrap kernel stack for secondary processors should be
    allocated dynamically
•   But we can’t use virtual addresses for them
    •   the stack may be used before TLB initialization, which would make the
        processor fault
    •   MIPS has a software-managed TLB, so we need to maintain the TLB in software
        The TLB miss handler is the code that does it
        This handler refers to cpu_info, so a TLB-mapped cpu_info causes a TLB miss loop
•   To avoid these problems, we implemented a wrapper function that
    allocates memory dynamically, then gets its physical address and
    returns it as a direct-mapped (XKPHYS) address
The wrapper function
vaddr_t smp_malloc(size_t size)
...
       if (size < PAGE_SIZE) {
                va = (vaddr_t)malloc(size, M_DEVBUF, M_NOWAIT);
                if (va == NULL)
                        return NULL;
                error = pmap_extract(pmap_kernel(), va, &pa);
                if (error == FALSE)
                        return NULL;
       } else {
                TAILQ_INIT(&mlist);
                error = uvm_pglistalloc(size, 0, -1L, 0, 0,
                    &mlist, 1, UVM_PLA_NOWAIT);
                if (error)
                        return NULL;
                m = TAILQ_FIRST(&mlist);
                pa = VM_PAGE_TO_PHYS(m);
       }

       return PHYS_TO_XKPHYS(pa, CCA_CACHED);
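A sketch of the address arithmetic behind the final PHYS_TO_XKPHYS() step, assuming the usual MIPS64 XKPHYS layout: top two bits 10, the cache coherency attribute (CCA) in bits 61..59, and the physical address below. The _sketch names are hypothetical, and the CCA value to use depends on the processor.

```c
#include <assert.h>
#include <stdint.h>

#define XKPHYS_BASE_SKETCH 0x8000000000000000ULL  /* VA bits 63:62 = 10 */
#define CCA_SHIFT_SKETCH   59                     /* CCA lives in 61:59 */

/* Build an unmapped (TLB-free) address for a physical address. */
uint64_t phys_to_xkphys_sketch(uint64_t pa, unsigned cca)
{
    assert(cca < 8);                          /* CCA is a 3-bit field */
    assert((pa >> CCA_SHIFT_SKETCH) == 0);    /* pa must fit below it */
    return XKPHYS_BASE_SKETCH | ((uint64_t)cca << CCA_SHIFT_SKETCH) | pa;
}

/* Recover the physical address from an XKPHYS address. */
uint64_t xkphys_to_phys_sketch(uint64_t va)
{
    return va & ((1ULL << CCA_SHIFT_SKETCH) - 1);
}
```

Accesses through such an address never go through the TLB, which is why cpu_info and the bootstrap stack can be reached before the TLB is set up and from inside the TLB miss handler.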
Reduce frequency of
      TLB shootdown
•   There’s a condition under which we can skip TLB shootdown on invalidate/
    update:
                      using the pagetable    not using the pagetable
      kernel mode            need                    need
      user mode              need                 don’t need

•   In user mode, if the shootee processor isn’t using the pagetable,
    we don’t need a shootdown; just changing the ASID assignment is enough
•   In the reference pmap implementation, a TLB shootdown is
    performed unconditionally, even when it isn’t really needed
•   We added the condition to reduce its frequency
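The table above, written as a small function that picks which processors need an IPI. This is a sketch of the rule only: the name shootdown_mask and its parameters are hypothetical, and the ASID reset for skipped processors is left out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return a bitmask of processors that need a shootdown IPI.
 * kernel_mapping:    the page belongs to the kernel pagetable
 * uses_pagetable[i]: processor i is currently using this pagetable
 */
unsigned shootdown_mask(bool kernel_mapping, const bool *uses_pagetable,
    unsigned ncpu)
{
    assert(ncpu <= 32);            /* mask is an unsigned int */
    unsigned mask = 0;

    for (unsigned i = 0; i < ncpu; i++) {
        /* kernel mode: always needed; user mode: only if in use */
        if (kernel_mapping || uses_pagetable[i])
            mask |= 1u << i;
    }
    return mask;
}
```

An empty mask means no IPI at all, which is the case the unconditional version pays for needlessly.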
void pmap_invalidate_page(pmap_t pmap, vm_offset_t va)
...
   arg.pmap = pmap;
   arg.va = va;
   smp_rendezvous(0, pmap_invalidate_page_action, 0,
    (void *)&arg);

void pmap_invalidate_page_action(void *arg)
...
   pmap_t pmap = ((struct pmap_invalidate_page_arg *)arg)->pmap;
   vm_offset_t va = ((struct pmap_invalidate_page_arg *)arg)->va;
   if (is_kernel_pmap(pmap)) {
      pmap_TLB_invalidate_kernel(va);
      return;
   }
   if (pmap->pm_asid[PCPU_GET(cpuid)].gen
       != PCPU_GET(asid_generation))
      return;
   else if (!(pmap->pm_active & PCPU_GET(cpumask))) {
      pmap->pm_asid[PCPU_GET(cpuid)].gen = 0;
      return;
   }
   va = pmap_va_asid(pmap, (va & ~PGOFSET));
   mips_TBIS(va);
CPU_INFO_FOREACH(cii, ci)
        if (cpuset_isset(&cpus_running, ci)) {
                unsigned int i = ci->ci_cpuid;
                unsigned int m = 1 << i;
                if (pmap->pm_asid[i].pma_asidgen !=
                    pmap_asid_info[i].pma_asidgen)
                        continue;
                else if (ci->ci_curpmap != pmap) {
                        pmap->pm_asid[i].pma_asidgen = 0;
                        continue;
                }
                cpumask |= m;
        }

if (cpumask == 1 << cpuid) {
        u_long asid;

        asid = pmap->pm_asid[cpuid].pma_asid << VMTLB_PID_SHIFT;
        tlb_flush_addr(va | asid);
} else if (cpumask) {
        struct pmap_invalidate_page_arg arg;
        arg.pmap = pmap;
        arg.va = va;

        smp_rendezvous_cpus(cpumask, pmap_invalidate_user_page_action,
                &arg);
}
Future work

• Commit machine ddbcpu<#>
• New port for Cavium OCTEON,
  if it’s acceptable for the OpenBSD project
• Maybe SMP support for SGI Origin 350
• Also interested in the MI part of SMP

Kelly C. RugglesKelly C. Ruggles
Kelly C. Ruggles
 
Red stripe 026a 16 lpb
Red stripe 026a 16 lpbRed stripe 026a 16 lpb
Red stripe 026a 16 lpb
 
Tastes Great, More Satisfying!
Tastes Great, More Satisfying!Tastes Great, More Satisfying!
Tastes Great, More Satisfying!
 
Christening
ChristeningChristening
Christening
 
Where god wants_me
Where god wants_meWhere god wants_me
Where god wants_me
 
La gestion del_riesgo_de_desastre_aplica
La gestion del_riesgo_de_desastre_aplicaLa gestion del_riesgo_de_desastre_aplica
La gestion del_riesgo_de_desastre_aplica
 
Maya research program field school photo album
Maya research program field school photo albumMaya research program field school photo album
Maya research program field school photo album
 
Design innovation: 10 ways to improve the learner experience
Design innovation: 10 ways to improve the learner experienceDesign innovation: 10 ways to improve the learner experience
Design innovation: 10 ways to improve the learner experience
 

Similaire à SMP implementation for OpenBSD/sgi

SMP Implementation for OpenBSD/sgi [Japanese Edition]
SMP Implementation for OpenBSD/sgi [Japanese Edition]SMP Implementation for OpenBSD/sgi [Japanese Edition]
SMP Implementation for OpenBSD/sgi [Japanese Edition]Takuya ASADA
 
Ice Age melting down: Intel features considered usefull!
Ice Age melting down: Intel features considered usefull!Ice Age melting down: Intel features considered usefull!
Ice Age melting down: Intel features considered usefull!Peter Hlavaty
 
JVM Memory Model - Yoav Abrahami, Wix
JVM Memory Model - Yoav Abrahami, WixJVM Memory Model - Yoav Abrahami, Wix
JVM Memory Model - Yoav Abrahami, WixCodemotion Tel Aviv
 
Solnik secure enclaveprocessor-pacsec
Solnik secure enclaveprocessor-pacsecSolnik secure enclaveprocessor-pacsec
Solnik secure enclaveprocessor-pacsecPacSecJP
 
You didnt see it’s coming? "Dawn of hardened Windows Kernel"
You didnt see it’s coming? "Dawn of hardened Windows Kernel" You didnt see it’s coming? "Dawn of hardened Windows Kernel"
You didnt see it’s coming? "Dawn of hardened Windows Kernel" Peter Hlavaty
 
CSW2017Richard Johnson_harnessing intel processor trace on windows for vulner...
CSW2017Richard Johnson_harnessing intel processor trace on windows for vulner...CSW2017Richard Johnson_harnessing intel processor trace on windows for vulner...
CSW2017Richard Johnson_harnessing intel processor trace on windows for vulner...CanSecWest
 
Stealing from Thieves: Breaking IonCUBE VM to RE Exploit Kits
Stealing from Thieves: Breaking IonCUBE VM to RE Exploit KitsStealing from Thieves: Breaking IonCUBE VM to RE Exploit Kits
Stealing from Thieves: Breaking IonCUBE VM to RE Exploit KitsМохачёк Сахер
 
Summer training embedded system and its scope
Summer training  embedded system and its scopeSummer training  embedded system and its scope
Summer training embedded system and its scopeArshit Rai
 
Advanced SOHO Router Exploitation XCON
Advanced SOHO Router Exploitation XCONAdvanced SOHO Router Exploitation XCON
Advanced SOHO Router Exploitation XCONLyon Yang
 
Bh ad-12-stealing-from-thieves-saher-slides
Bh ad-12-stealing-from-thieves-saher-slidesBh ad-12-stealing-from-thieves-saher-slides
Bh ad-12-stealing-from-thieves-saher-slidesMatt Kocubinski
 
BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...
BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...
BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...Alexandre Moneger
 
Challenges in GPU compilers
Challenges in GPU compilersChallenges in GPU compilers
Challenges in GPU compilersAnastasiaStulova
 
Practical IoT Exploitation (DEFCON23 IoTVillage) - Lyon Yang
Practical IoT Exploitation (DEFCON23 IoTVillage) - Lyon YangPractical IoT Exploitation (DEFCON23 IoTVillage) - Lyon Yang
Practical IoT Exploitation (DEFCON23 IoTVillage) - Lyon YangLyon Yang
 
Summer training embedded system and its scope
Summer training  embedded system and its scopeSummer training  embedded system and its scope
Summer training embedded system and its scopeArshit Rai
 
Building the Internet of Things with Thingsquare and Contiki - day 2 part 1
Building the Internet of Things with Thingsquare and Contiki - day 2 part 1Building the Internet of Things with Thingsquare and Contiki - day 2 part 1
Building the Internet of Things with Thingsquare and Contiki - day 2 part 1Adam Dunkels
 

Similaire à SMP implementation for OpenBSD/sgi (20)

SMP Implementation for OpenBSD/sgi [Japanese Edition]
SMP Implementation for OpenBSD/sgi [Japanese Edition]SMP Implementation for OpenBSD/sgi [Japanese Edition]
SMP Implementation for OpenBSD/sgi [Japanese Edition]
 
PILOT Session for Embedded Systems
PILOT Session for Embedded Systems PILOT Session for Embedded Systems
PILOT Session for Embedded Systems
 
Ice Age melting down: Intel features considered usefull!
Ice Age melting down: Intel features considered usefull!Ice Age melting down: Intel features considered usefull!
Ice Age melting down: Intel features considered usefull!
 
JVM Memory Model - Yoav Abrahami, Wix
JVM Memory Model - Yoav Abrahami, WixJVM Memory Model - Yoav Abrahami, Wix
JVM Memory Model - Yoav Abrahami, Wix
 
Solnik secure enclaveprocessor-pacsec
Solnik secure enclaveprocessor-pacsecSolnik secure enclaveprocessor-pacsec
Solnik secure enclaveprocessor-pacsec
 
You didnt see it’s coming? "Dawn of hardened Windows Kernel"
You didnt see it’s coming? "Dawn of hardened Windows Kernel" You didnt see it’s coming? "Dawn of hardened Windows Kernel"
You didnt see it’s coming? "Dawn of hardened Windows Kernel"
 
CSW2017Richard Johnson_harnessing intel processor trace on windows for vulner...
CSW2017Richard Johnson_harnessing intel processor trace on windows for vulner...CSW2017Richard Johnson_harnessing intel processor trace on windows for vulner...
CSW2017Richard Johnson_harnessing intel processor trace on windows for vulner...
 
Stealing from Thieves: Breaking IonCUBE VM to RE Exploit Kits
Stealing from Thieves: Breaking IonCUBE VM to RE Exploit KitsStealing from Thieves: Breaking IonCUBE VM to RE Exploit Kits
Stealing from Thieves: Breaking IonCUBE VM to RE Exploit Kits
 
Jvm memory model
Jvm memory modelJvm memory model
Jvm memory model
 
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network Stack
 
Summer training embedded system and its scope
Summer training  embedded system and its scopeSummer training  embedded system and its scope
Summer training embedded system and its scope
 
Advanced SOHO Router Exploitation XCON
Advanced SOHO Router Exploitation XCONAdvanced SOHO Router Exploitation XCON
Advanced SOHO Router Exploitation XCON
 
Bh ad-12-stealing-from-thieves-saher-slides
Bh ad-12-stealing-from-thieves-saher-slidesBh ad-12-stealing-from-thieves-saher-slides
Bh ad-12-stealing-from-thieves-saher-slides
 
BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...
BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...
BSides LV 2016 - Beyond the tip of the iceberg - fuzzing binary protocols for...
 
Programar para GPUs
Programar para GPUsProgramar para GPUs
Programar para GPUs
 
Challenges in GPU compilers
Challenges in GPU compilersChallenges in GPU compilers
Challenges in GPU compilers
 
Practical IoT Exploitation (DEFCON23 IoTVillage) - Lyon Yang
Practical IoT Exploitation (DEFCON23 IoTVillage) - Lyon YangPractical IoT Exploitation (DEFCON23 IoTVillage) - Lyon Yang
Practical IoT Exploitation (DEFCON23 IoTVillage) - Lyon Yang
 
Fedor Polyakov - Optimizing computer vision problems on mobile platforms
Fedor Polyakov - Optimizing computer vision problems on mobile platforms Fedor Polyakov - Optimizing computer vision problems on mobile platforms
Fedor Polyakov - Optimizing computer vision problems on mobile platforms
 
Summer training embedded system and its scope
Summer training  embedded system and its scopeSummer training  embedded system and its scope
Summer training embedded system and its scope
 
Building the Internet of Things with Thingsquare and Contiki - day 2 part 1
Building the Internet of Things with Thingsquare and Contiki - day 2 part 1Building the Internet of Things with Thingsquare and Contiki - day 2 part 1
Building the Internet of Things with Thingsquare and Contiki - day 2 part 1
 

Plus de Takuya ASADA

Seastar in 歌舞伎座.tech#8「C++初心者会」
Seastar in 歌舞伎座.tech#8「C++初心者会」Seastar in 歌舞伎座.tech#8「C++初心者会」
Seastar in 歌舞伎座.tech#8「C++初心者会」Takuya ASADA
 
Seastar:高スループットなサーバアプリケーションの為の新しいフレームワーク
Seastar:高スループットなサーバアプリケーションの為の新しいフレームワークSeastar:高スループットなサーバアプリケーションの為の新しいフレームワーク
Seastar:高スループットなサーバアプリケーションの為の新しいフレームワークTakuya ASADA
 
高スループットなサーバアプリケーションの為の新しいフレームワーク
「Seastar」
高スループットなサーバアプリケーションの為の新しいフレームワーク
「Seastar」高スループットなサーバアプリケーションの為の新しいフレームワーク
「Seastar」
高スループットなサーバアプリケーションの為の新しいフレームワーク
「Seastar」Takuya ASADA
 
ヤマノススメ〜秋山郷 de ハッカソン〜
ヤマノススメ〜秋山郷 de ハッカソン〜ヤマノススメ〜秋山郷 de ハッカソン〜
ヤマノススメ〜秋山郷 de ハッカソン〜Takuya ASADA
 
UEFI時代のブートローダ
UEFI時代のブートローダUEFI時代のブートローダ
UEFI時代のブートローダTakuya ASADA
 
OSvのご紹介 in 
Java 8 HotSpot meeting
OSvのご紹介 in 
Java 8 HotSpot meetingOSvのご紹介 in 
Java 8 HotSpot meeting
OSvのご紹介 in 
Java 8 HotSpot meetingTakuya ASADA
 
OSvパンフレット v3
OSvパンフレット v3OSvパンフレット v3
OSvパンフレット v3Takuya ASADA
 
OSvのご紹介 in OSC2014 Tokyo/Fall
OSvのご紹介 in OSC2014 Tokyo/FallOSvのご紹介 in OSC2014 Tokyo/Fall
OSvのご紹介 in OSC2014 Tokyo/FallTakuya ASADA
 
OSvの概要と実装
OSvの概要と実装OSvの概要と実装
OSvの概要と実装Takuya ASADA
 
Linux network stack
Linux network stackLinux network stack
Linux network stackTakuya ASADA
 
Ethernetの受信処理
Ethernetの受信処理Ethernetの受信処理
Ethernetの受信処理Takuya ASADA
 
Presentation on your terminal
Presentation on your terminalPresentation on your terminal
Presentation on your terminalTakuya ASADA
 
僕のIntel nucが起動しないわけがない
僕のIntel nucが起動しないわけがない僕のIntel nucが起動しないわけがない
僕のIntel nucが起動しないわけがないTakuya ASADA
 
Interrupt Affinityについて
Interrupt AffinityについてInterrupt Affinityについて
Interrupt AffinityについてTakuya ASADA
 
OSvパンフレット
OSvパンフレットOSvパンフレット
OSvパンフレットTakuya ASADA
 
BHyVeでOSvを起動したい
〜BIOSがなくてもこの先生きのこるには〜
BHyVeでOSvを起動したい
〜BIOSがなくてもこの先生きのこるには〜BHyVeでOSvを起動したい
〜BIOSがなくてもこの先生きのこるには〜
BHyVeでOSvを起動したい
〜BIOSがなくてもこの先生きのこるには〜Takuya ASADA
 
「ハイパーバイザの作り方」読書会#2
「ハイパーバイザの作り方」読書会#2「ハイパーバイザの作り方」読書会#2
「ハイパーバイザの作り方」読書会#2Takuya ASADA
 
「ハイパーバイザの作り方」読書会#1
「ハイパーバイザの作り方」読書会#1「ハイパーバイザの作り方」読書会#1
「ハイパーバイザの作り方」読書会#1Takuya ASADA
 
10GbE時代のネットワークI/O高速化
10GbE時代のネットワークI/O高速化10GbE時代のネットワークI/O高速化
10GbE時代のネットワークI/O高速化Takuya ASADA
 

Plus de Takuya ASADA (20)

Seastar in 歌舞伎座.tech#8「C++初心者会」
Seastar in 歌舞伎座.tech#8「C++初心者会」Seastar in 歌舞伎座.tech#8「C++初心者会」
Seastar in 歌舞伎座.tech#8「C++初心者会」
 
Seastar:高スループットなサーバアプリケーションの為の新しいフレームワーク
Seastar:高スループットなサーバアプリケーションの為の新しいフレームワークSeastar:高スループットなサーバアプリケーションの為の新しいフレームワーク
Seastar:高スループットなサーバアプリケーションの為の新しいフレームワーク
 
高スループットなサーバアプリケーションの為の新しいフレームワーク
「Seastar」
高スループットなサーバアプリケーションの為の新しいフレームワーク
「Seastar」高スループットなサーバアプリケーションの為の新しいフレームワーク
「Seastar」
高スループットなサーバアプリケーションの為の新しいフレームワーク
「Seastar」
 
ヤマノススメ〜秋山郷 de ハッカソン〜
ヤマノススメ〜秋山郷 de ハッカソン〜ヤマノススメ〜秋山郷 de ハッカソン〜
ヤマノススメ〜秋山郷 de ハッカソン〜
 
UEFI時代のブートローダ
UEFI時代のブートローダUEFI時代のブートローダ
UEFI時代のブートローダ
 
OSvのご紹介 in 
Java 8 HotSpot meeting
OSvのご紹介 in 
Java 8 HotSpot meetingOSvのご紹介 in 
Java 8 HotSpot meeting
OSvのご紹介 in 
Java 8 HotSpot meeting
 
OSvパンフレット v3
OSvパンフレット v3OSvパンフレット v3
OSvパンフレット v3
 
OSvのご紹介 in OSC2014 Tokyo/Fall
OSvのご紹介 in OSC2014 Tokyo/FallOSvのご紹介 in OSC2014 Tokyo/Fall
OSvのご紹介 in OSC2014 Tokyo/Fall
 
OSv噺
OSv噺OSv噺
OSv噺
 
OSvの概要と実装
OSvの概要と実装OSvの概要と実装
OSvの概要と実装
 
Linux network stack
Linux network stackLinux network stack
Linux network stack
 
Ethernetの受信処理
Ethernetの受信処理Ethernetの受信処理
Ethernetの受信処理
 
Presentation on your terminal
Presentation on your terminalPresentation on your terminal
Presentation on your terminal
 
僕のIntel nucが起動しないわけがない
僕のIntel nucが起動しないわけがない僕のIntel nucが起動しないわけがない
僕のIntel nucが起動しないわけがない
 
Interrupt Affinityについて
Interrupt AffinityについてInterrupt Affinityについて
Interrupt Affinityについて
 
OSvパンフレット
OSvパンフレットOSvパンフレット
OSvパンフレット
 
BHyVeでOSvを起動したい
〜BIOSがなくてもこの先生きのこるには〜
BHyVeでOSvを起動したい
〜BIOSがなくてもこの先生きのこるには〜BHyVeでOSvを起動したい
〜BIOSがなくてもこの先生きのこるには〜
BHyVeでOSvを起動したい
〜BIOSがなくてもこの先生きのこるには〜
 
「ハイパーバイザの作り方」読書会#2
「ハイパーバイザの作り方」読書会#2「ハイパーバイザの作り方」読書会#2
「ハイパーバイザの作り方」読書会#2
 
「ハイパーバイザの作り方」読書会#1
「ハイパーバイザの作り方」読書会#1「ハイパーバイザの作り方」読書会#1
「ハイパーバイザの作り方」読書会#1
 
10GbE時代のネットワークI/O高速化
10GbE時代のネットワークI/O高速化10GbE時代のネットワークI/O高速化
10GbE時代のネットワークI/O高速化
 

Dernier

Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 

Dernier (20)

Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 

SMP implementation for OpenBSD/sgi

  • 1. SMP Implementation for OpenBSD/sgi, Takuya ASADA <syuu@openbsd.org>
  • 2. Introduction
      • I was working to add SMP and 64-bit support to a BSD-based embedded OS; the target device was MIPS64
      • There was no complete *BSD/MIPS SMP implementation at that time, so I implemented it
      • The implementation was proprietary, but I wanted to contribute something to the BSD community
      • I decided to implement SMP from scratch, and looked for a suitable MIPS SMP machine
  • 4. Finding MIPS/SMP Machine (1)
      • Broadcom SiByte BCM1250 looks nice: 2 cores, 1GHz MIPS64, DDR DRAM, GbE, PCI, HT
      • The Cisco 3845 integrates this processor
      • How much is it? From $14,200! Totally unacceptable!!!
  • 6. Finding MIPS/SMP Machine (2)
      • Some antique SGI machines are available on eBay
      • These are basically very cheap
      • I realized the SGI Octane 2 has 2 cores and is already supported by OpenBSD
      • Just $33! Do you remember how much it was?
  • 7. This is my Octane2
      Processors  MIPS R12000 400MHz x2
      Memory      1GB SDRAM
      Graphics    3D Graphics Card
      Sound       Integrated Digital Audio
      Storage     35GB SCSI HDD
      Ethernet    100BASE-T
  • 12. Become an OpenBSD developer
      • I started working on the Octane2 in Apr 2009 and wrote about it on my blog
      • Miod Vallat discovered it and suggested that my code be merged into the OpenBSD main tree
      • I became an OpenBSD developer in Sep 2009, started merging the code, joined a hackathon, and worked with Miod and Joel Sing
  • 13. NOW IT WORKS!!!!
      • Merged into the OpenBSD main tree
      • You can try it now!
      • It seems stable: I ran “make build” again and again for over a day, and it kept working
  • 14. Demo
  • 15. Performance - kernel build benchmark [bar chart: build time in seconds (0 to 1,800) at -j1, -j2 and -j3 for the GENERIC-IP30 and GENERIC-IP30.MP kernels]: 4% overhead, 46% speed up
  • 16. Performance - OpenBSD/amd64 for comparison [bar chart: build time in seconds (0 to 300) at -j1, -j2 and -j3 for the GENERIC and GENERIC.MP kernels]: 2% overhead, 44% speed up
  • 17. What did we need to do for the SMP work? There was a lot of work...
      • Support multiple cpu_info and processor related macros
      • Secondary processor entry point
      • IPI: Inter-Processor Interrupt
      • Move per-processor data into cpu_info
      • Per-processor ASID management
      • Implement lock primitives
      • TLB shootdown
      • Acquiring giant lock
      • Lazy FPU handling
      • Implement atomic operations
      • Per-processor clock
      • Spin up secondary processors
  • 18. Describe it more simply
      We can classify the tasks into three kinds of problems:
      • Restructuring per-processor information
      • Implementing and using lock/atomic primitives
      • Hardware related problems
  • 19. Restructuring per-processor information
      • In the original sgi port, some processor-related information is allocated only once
      • Simple case: the information is stored in a global variable. Move it into “cpu_info”, the per-processor structure
      • Complex case: in pmap, we need to maintain some information per-processor * per-process
  • 20. Simple case - clock.c: original code
      // defined as global variables
      u_int32_t cpu_counter_last;
      u_int32_t cpu_counter_interval;
      u_int32_t pendingticks;

      uint32_t
      clock_int5(uint32_t mask, struct trap_frame *tf)
      ...
              clkdiff = cp0_get_count() - cpu_counter_last;
              while (clkdiff >= cpu_counter_interval) {
                      cpu_counter_last += cpu_counter_interval;
                      clkdiff = cp0_get_count() - cpu_counter_last;
                      pendingticks++;
  • 21. Simple case - clock.c: modified code
      uint32_t
      clock_int5(uint32_t mask, struct trap_frame *tf)
      ...
              struct cpu_info *ci = curcpu();
              clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
              while (clkdiff >= ci->ci_cpu_counter_interval) {
                      ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
                      clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
                      ci->ci_pendingticks++;
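The move from globals to a per-processor structure in slides 20 and 21 can be tried out in user space. The following is only a minimal sketch under stated assumptions: `cpu_info_table`, `current_cpu_id`, `MAXCPUS` and `clock_account()` are hypothetical names, the structure is trimmed down to the three fields shown on the slides, and the counter value is passed in as an argument instead of being read with `cp0_get_count()`.

```c
#include <assert.h>
#include <stdint.h>

#define MAXCPUS 2   /* assumption: two processors, as on the Octane2 */

/* Hypothetical, trimmed-down per-processor structure. */
struct cpu_info {
    uint32_t ci_cpu_counter_last;
    uint32_t ci_cpu_counter_interval;
    uint32_t ci_pendingticks;
};

struct cpu_info cpu_info_table[MAXCPUS];
int current_cpu_id;   /* stand-in for asking the hardware which CPU we are on */

struct cpu_info *curcpu(void)
{
    return &cpu_info_table[current_cpu_id];
}

/*
 * Account for elapsed counter ticks on the calling CPU only.
 * Returns the number of ticks now pending on that CPU.
 */
uint32_t clock_account(uint32_t count_now)
{
    struct cpu_info *ci = curcpu();
    uint32_t clkdiff;

    assert(ci->ci_cpu_counter_interval != 0);   /* otherwise the loop never ends */

    clkdiff = count_now - ci->ci_cpu_counter_last;
    while (clkdiff >= ci->ci_cpu_counter_interval) {
        ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
        clkdiff = count_now - ci->ci_cpu_counter_last;
        ci->ci_pendingticks++;
    }
    return ci->ci_pendingticks;
}
```

Because every field is reached through `curcpu()`, a tick counted on one processor never changes the state of the other.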
  • 22. Complex case: pmap
      • MIPS TLB entries are tagged with an 8-bit process ID called the ASID, which is used to improve performance
          • The MMU skips TLB entries that belong to other processes on lookup
          • We don’t need to flush the TLB on every context switch
      • We need to maintain the process-to-ASID assignment: because the ASID space is smaller than the PID space, we need to rotate it
      • The information must be kept across context switches
      • We maintain ASIDs individually per processor
      • What we need is: ASID information per-processor * per-process
  • 23. Complex case - pmap.c: original and modified code
      Original:
      uint
      pmap_alloc_tlbpid(struct proc *p)
      ...
              tlbpid_cnt = id + 1;
              pmap->pm_tlbpid = id;

      Modified:
      uint
      pmap_alloc_tlbpid(struct proc *p)
      ...
              tlbpid_cnt[cpuid] = id + 1;
              pmap->pm_tlbpid[cpuid] = id;
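To make "per-processor * per-process" concrete, here is a small user-space sketch of ASID allocation with rotation. It is not the OpenBSD code: the generation counters (`pm_gen`, `tlbpid_gen`), the `tlb_flushes` counter and the function name `pmap_alloc_asid()` are assumptions introduced for illustration, and reserving ASID 0 is likewise a choice made here. Only the 8-bit ASID width and the per-CPU arrays come from the slides.

```c
#include <assert.h>
#include <stdint.h>

#define MAXCPUS   2
#define NUM_ASIDS 256   /* 8-bit ASID, as stated on slide 22 */

struct pmap {
    uint32_t pm_tlbpid[MAXCPUS];   /* ASID of this address space on each CPU */
    uint32_t pm_gen[MAXCPUS];      /* hypothetical: generation the ASID belongs to */
};

uint32_t tlbpid_cnt[MAXCPUS] = { 1, 1 };   /* next free ASID per CPU (0 reserved here) */
uint32_t tlbpid_gen[MAXCPUS] = { 1, 1 };   /* bumped each time a CPU wraps around */
int tlb_flushes[MAXCPUS];                  /* counts where a real TLB flush would go */

/* Return the ASID of 'pm' on CPU 'cpuid', assigning a fresh one if needed. */
uint32_t pmap_alloc_asid(struct pmap *pm, int cpuid)
{
    assert(cpuid >= 0 && cpuid < MAXCPUS);

    /* The ASID handed out earlier is still valid on this CPU. */
    if (pm->pm_gen[cpuid] == tlbpid_gen[cpuid])
        return pm->pm_tlbpid[cpuid];

    /* Out of ASIDs on this CPU: rotate and invalidate all old assignments. */
    if (tlbpid_cnt[cpuid] >= NUM_ASIDS) {
        tlb_flushes[cpuid]++;   /* a kernel would flush this CPU's TLB here */
        tlbpid_gen[cpuid]++;
        tlbpid_cnt[cpuid] = 1;
    }

    pm->pm_tlbpid[cpuid] = tlbpid_cnt[cpuid]++;
    pm->pm_gen[cpuid] = tlbpid_gen[cpuid];
    return pm->pm_tlbpid[cpuid];
}
```

The same address space can hold different ASIDs on different processors, and a wrap-around on one processor does not disturb the assignments on the other.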
  • 24. Implement and use lock/atomic primitives
      • Needed to implement
          • lock primitives: mutex, mp_lock
          • atomic primitives: CAS, 64-bit add, etc.
      • Acquiring giant lock prior to entering the kernel context
          • hardware interrupts
          • software interrupts
          • trap()
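As a rough illustration of how a lock and a 64-bit add can both be built on compare-and-swap, here is a portable sketch using C11 `<stdatomic.h>`. This is an assumption-laden stand-in, not the kernel's implementation: the names `struct spinlock`, `spin_trylock()`, `spin_lock()`, `spin_unlock()` and `atomic_add_64()` are hypothetical, and a MIPS kernel would typically write these primitives in assembly rather than rely on the compiler's atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct spinlock {
    atomic_int locked;   /* 0 = free, 1 = held */
};

/* One CAS attempt: succeed only if the lock was free. */
bool spin_trylock(struct spinlock *l)
{
    int expected = 0;

    return atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/* Spin until the CAS succeeds. */
void spin_lock(struct spinlock *l)
{
    while (!spin_trylock(l))
        ;
}

void spin_unlock(struct spinlock *l)
{
    assert(atomic_load(&l->locked) == 1);   /* releasing a free lock is a bug */
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* 64-bit add built from a CAS retry loop; returns the new value. */
uint64_t atomic_add_64(_Atomic uint64_t *p, uint64_t v)
{
    uint64_t old = atomic_load_explicit(p, memory_order_relaxed);

    while (!atomic_compare_exchange_weak_explicit(p, &old, old + v,
        memory_order_acq_rel, memory_order_relaxed))
        ;   /* a failed CAS reloads 'old'; try again with the fresh value */
    return old + v;
}
```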
  • 26. Acquiring giant lock on clock interrupt handler
      uint32_t
      clock_int5(uint32_t mask, struct trap_frame *tf)
      ...
              if (tf->ipl < IPL_CLOCK) {
      #ifdef MULTIPROCESSOR
                      __mp_lock(&kernel_lock);
      #endif
                      while (ci->ci_pendingticks) {
                              clk_count.ec_count++;
                              hardclock(tf);
                              ci->ci_pendingticks--;
                      }
      #ifdef MULTIPROCESSOR
                      __mp_unlock(&kernel_lock);
      #endif

      Actually, it causes a bug... described later
  • 27. Hardware related problems
      • Spinning up the secondary processor
      • Keeping TLBs consistent in software
      • Cache coherency
  • 28. Spin up secondary processor
      • We need to launch the secondary processor
          • Access a hardware register on the controller device to power on the processor
      • The secondary processor needs its own bootstrap code
          • It has to be similar to the primary processor’s
          • But it doesn’t need kernel, memory, and device initialization
          • Because the primary processor already did it
  • 29. Keeping TLB consistency by software
      • The MIPS TLB has no hardware mechanism to stay consistent across processors, so we need to do it in software, just like on other typical architectures
      • Invalidate/update the other processor’s TLB using TLB shootdown
      • It is implemented with IPIs (Inter-Processor Interrupts)
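The shootdown idea on slide 29 can be modelled in plain C without any hardware. The sketch below is a single-threaded software model under loud assumptions: `struct cpu_model`, `IPI_TLB_SHOOTDOWN`, `tlb_shootdown()` and `ipi_handler()` are hypothetical names, the "TLB" is a four-entry array, and "sending an IPI" is reduced to setting a pending bit that the target processes later. A real implementation would also have to handle several outstanding requests and wait for the target to acknowledge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAXCPUS  2
#define TLB_SIZE 4                  /* tiny software model, not a real TLB size */
#define IPI_TLB_SHOOTDOWN 0x1u      /* hypothetical IPI reason bit */

struct tlb_entry {
    uintptr_t va;
    bool valid;
};

struct cpu_model {
    struct tlb_entry tlb[TLB_SIZE];
    uint32_t ipi_pending;           /* reasons posted by other CPUs */
    uintptr_t shootdown_va;         /* argument of the pending shootdown */
};

struct cpu_model cpus[MAXCPUS];

/* Drop every entry for 'va' from one CPU's own TLB. */
void tlb_invalidate_local(struct cpu_model *c, uintptr_t va)
{
    for (int i = 0; i < TLB_SIZE; i++)
        if (c->tlb[i].valid && c->tlb[i].va == va)
            c->tlb[i].valid = false;
}

/* Initiator: invalidate locally, then post a request to every other CPU. */
void tlb_shootdown(int self, uintptr_t va)
{
    assert(self >= 0 && self < MAXCPUS);
    tlb_invalidate_local(&cpus[self], va);
    for (int i = 0; i < MAXCPUS; i++) {
        if (i == self)
            continue;
        cpus[i].shootdown_va = va;
        cpus[i].ipi_pending |= IPI_TLB_SHOOTDOWN;   /* kernel: raise the IPI here */
    }
}

/* Target: what the IPI interrupt handler would do on the receiving CPU. */
void ipi_handler(int self)
{
    struct cpu_model *c;

    assert(self >= 0 && self < MAXCPUS);
    c = &cpus[self];
    if (c->ipi_pending & IPI_TLB_SHOOTDOWN) {
        c->ipi_pending &= ~IPI_TLB_SHOOTDOWN;
        tlb_invalidate_local(c, c->shootdown_va);
    }
}
```

The stale entry on the second processor stays valid until its handler runs, which is exactly the window a real kernel has to close by waiting for the acknowledgement.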
  • 30. Cache coherency
      • MIPS R10000/R12000 processors have full cache coherency
      • We don’t have to care about it on the Octane
      • But some other processors do not have full cache coherency
  • 31. Ideas for implementing SMP
      • We faced a number of issues while implementing SMP
      • Fighting against deadlock
      • Dynamic memory allocation without using virtual addresses
      • Reducing the frequency of TLB shootdowns
  • 32. Fighting against deadlock
      • It’s hard to find the cause of deadlocks because both processors run concurrently
      • They are timing bugs: the conditions depend on timing
      • We need to be able to determine what happened on both processors at that time
      • There are tools for debugging this
  • 33. JTAG ICE
      • Very useful for debugging
      • We can get any data for debugging at the desired moment, even after the kernel hangs
      • I used it when I was implementing SMP for the embedded OS
      • Not for the Octane: there’s no way to connect one
  • 34. ddb
      • The OpenBSD kernel has an in-kernel debugger, named ddb
      • We can get data similar to what a JTAG ICE provides, but the kernel needs to be alive, because ddb is part of the kernel
      • Missing feature: we hadn’t implemented “machine ddbcpu<#>”, the processor switching function of ddb. Without it, we can only debug the processor that invoked ddb on a breakpoint
      • Not always useful
  • 35. printf()
      • The most popular kernel debugging tool
      • Just write printf(message) in your code. Easy to use ;)
      • Unfortunately, it has some problems
          • printf() wastes a lot of cycles and changes the timing between processors. We may miss a timing bug because of it
          • Some points in the code are printf()-unsafe and cause a kernel hang
      • We use it anyway
  • 36. Divide printf output for two serial port • There’s two serial port and two processors • If we have lots of debug print, it’s hard to understand which lines belongs to what processor • I implemented dirty hack code which output strings directly to secondary serial port, named combprintf() • Rewrite debug print to primary processor outputs primary serial port, secondary processor outputs secondary serial port
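The idea of one output channel per processor can be sketched in user space as follows. This is not combprintf() itself, whose source the slides do not show: `cpu_printf()` and `port_contents()` are hypothetical names, and the two serial ports are modelled as two in-memory buffers so the routing is easy to inspect.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define NPORTS     2
#define PORT_BUFSZ 256

/* Stand-ins for the two serial ports: one text buffer per port. */
static char port_buf[NPORTS][PORT_BUFSZ];

/*
 * Format a debug message and append it to the "port" that belongs to the
 * calling CPU (cpuid 0 -> port 0, cpuid 1 -> port 1). Output that does not
 * fit in the buffer is truncated by vsnprintf().
 */
int cpu_printf(unsigned int cpuid, const char *fmt, ...)
{
    char *buf = port_buf[cpuid % NPORTS];
    size_t used = strlen(buf);
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf + used, PORT_BUFSZ - used, fmt, ap);
    va_end(ap);
    return n;
}

/* Everything written to one port so far. */
const char *port_contents(unsigned int port)
{
    return port_buf[port % NPORTS];
}
```

Because each processor only ever appends to its own buffer, the two message streams never interleave, which is the property the two-serial-port setup provides.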
  • 37. What do we need to print for debugging? • To know where the kernel running roughly • put debug print everywhere the point kernel may running through • dump all system call using SYSCALL_DEBUG • How can we determine a deadlock point?
  • 38. Determine a deadlock point • Deadlocks are occurring on spinlocks • It loops permanently until a condition become available, but that condition never comes up • At least to know which lock primitives we stuck on, we need to stop permanent loop by implementing timeout counter and print debug message
  • 40. Adding timeout counter into mutex
        void
        mtx_enter(struct mutex *mtx)
        ...
                for (;;) {
                        if (mtx->mtx_wantipl != IPL_NONE)
                                s = splraise(mtx->mtx_wantipl);
                        if (try_lock(mtx)) {
                                if (mtx->mtx_wantipl != IPL_NONE)
                                        mtx->mtx_oldipl = s;
                                mtx->mtx_owner = curcpu();
                                return;
                        }
                        if (mtx->mtx_wantipl != IPL_NONE)
                                splx(s);
                        if (++i > MTX_TIMEOUT)
                                panic("mtx deadlocked\n");
                }
      But this is not enough
  • 41-43. Why it’s not enough • [Diagram] CPU A acquires Lock A, then spins until Lock B is released; CPU B acquires Lock B, then spins until Lock A is released • With the timeout we can break in the spin loop • But we want to know which function acquired the lock
  • 44. Remember who acquired it
        void
        mtx_enter(struct mutex *mtx)
        ...
                for (;;) {
                        if (mtx->mtx_wantipl != IPL_NONE)
                                s = splraise(mtx->mtx_wantipl);
                        if (try_lock(mtx)) {
                                if (mtx->mtx_wantipl != IPL_NONE)
                                        mtx->mtx_oldipl = s;
                                mtx->mtx_owner = curcpu();
                                mtx->mtx_ra =
                                    __builtin_return_address(0);
                                return;
                        }
                        if (mtx->mtx_wantipl != IPL_NONE)
                                splx(s);
                        if (++i > MTX_TIMEOUT)
                                panic("mtx deadlocked ra:%p\n",
                                    mtx->mtx_ra);
                }
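The same two debugging aids, a spin timeout and a record of who took the lock, can be tried outside the kernel. The following is a user-space sketch using C11 atomics, not the kernel mutex: `struct dbg_lock`, `dbg_lock_enter()` and `SPIN_TIMEOUT` are hypothetical, and instead of calling panic() the function reports failure and hands back the holder's return address. `__builtin_return_address()` is a GCC/Clang builtin.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SPIN_TIMEOUT 1000000UL   /* hypothetical bound on spin iterations */

struct dbg_lock {
    atomic_flag     held;        /* the spinlock itself */
    _Atomic(void *) owner_ra;    /* return address of the current holder */
};

void dbg_lock_init(struct dbg_lock *l)
{
    atomic_flag_clear(&l->held);
    atomic_store(&l->owner_ra, NULL);
}

/*
 * Try to take the lock, giving up after SPIN_TIMEOUT attempts.
 * On success, remember which caller acquired it.
 * On timeout, return false and report the holder's return address through
 * holder_ra (if non-NULL), so the caller can print where the lock was taken.
 */
bool dbg_lock_enter(struct dbg_lock *l, void **holder_ra)
{
    for (unsigned long i = 0; i < SPIN_TIMEOUT; i++) {
        if (!atomic_flag_test_and_set_explicit(&l->held,
            memory_order_acquire)) {
            atomic_store(&l->owner_ra, __builtin_return_address(0));
            return true;
        }
    }
    if (holder_ra != NULL)
        *holder_ra = atomic_load(&l->owner_ra);
    return false;
}

void dbg_lock_leave(struct dbg_lock *l)
{
    atomic_store(&l->owner_ra, NULL);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Resolving the reported address to a function name (with a symbol table or a debugger) shows which code path is holding the lock the stuck processor is waiting for.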
  • 45. Interrupt blocks IPI • [Diagram] CPU B takes a fault, acquires the lock, starts a TLB shootdown, sends an IPI to CPU A and waits for the ACK • CPU A is in an interrupt handler with interrupts disabled, waiting until the lock is released, so the IPI is blocked • Neither processor can make progress
  • 46-49. Interrupt blocks IPI • [Diagram] CPU A re-enables interrupts before waiting for the lock, accepts the IPI and sends the ACK for the rendezvous • Solution: enable the IPI interrupt in interrupt handlers
  • 50. Fixing clock interrupt handler
        uint32_t
        clock_int5(uint32_t mask, struct trap_frame *tf)
        ...
                if (tf->ipl < IPL_CLOCK) {
        #ifdef MULTIPROCESSOR
                        u_int32_t sr;
                        sr = getsr();
                        ENABLEIPI();
                        __mp_lock(&kernel_lock);
        #endif
                        while (ci->ci_pendingticks) {
                                clk_count.ec_count++;
                                hardclock(tf);
                                ci->ci_pendingticks--;
                        }
        #ifdef MULTIPROCESSOR
                        __mp_unlock(&kernel_lock);
                        setsr(sr);
        #endif
  • 51. splhigh() blocks IPI • [Diagram] CPU B takes a fault, acquires the lock, starts a TLB shootdown, sends an IPI to CPU A and waits for the ACK • CPU A has called splhigh() and is waiting until the lock is released, so the IPI is blocked
  • 52-55. splhigh() blocks IPI • [Diagram] CPU A accepts the IPI even at splhigh() and sends the ACK for the rendezvous • Solution: defined a new interrupt priority level named IPL_IPI, higher than IPL_HIGH
  • 56. Adding new interrupt priority level
      Before:
        #define IPL_TTY         4       /* terminal */
        #define IPL_VM          5       /* memory allocation */
        #define IPL_CLOCK       6       /* clock */
        #define IPL_HIGH        7       /* everything */
        #define NIPLS           8       /* Number of levels */
      After:
        #define IPL_TTY         4       /* terminal */
        #define IPL_VM          5       /* memory allocation */
        #define IPL_CLOCK       6       /* clock */
        #define IPL_HIGH        7       /* everything */
        #define IPL_IPI         8       /* ipi */
        #define NIPLS           9       /* Number of levels */
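Why a level above IPL_HIGH fixes the splhigh() case can be seen with a simplified masking rule. The sketch below assumes the usual spl model, in which an interrupt is held off while the processor's current level is at or above the interrupt's level; the level values are the ones from the slide, while `IPL_NONE = 0` and the helper `intr_masked()` are assumptions made for illustration.

```c
#include <stdbool.h>

/* Levels as listed on the slide; IPL_NONE = 0 is an assumption. */
enum ipl {
    IPL_NONE  = 0,
    IPL_TTY   = 4,
    IPL_VM    = 5,
    IPL_CLOCK = 6,
    IPL_HIGH  = 7,
    IPL_IPI   = 8    /* new: above IPL_HIGH */
};

/*
 * Simplified masking rule (hypothetical helper): an interrupt of level
 * intr_ipl is blocked while the CPU runs at cur_ipl >= intr_ipl.
 */
bool intr_masked(int cur_ipl, int intr_ipl)
{
    return intr_ipl <= cur_ipl;
}
```

Under this rule a processor at IPL_HIGH still blocks the clock interrupt but no longer blocks an IPI, so a shootdown request can be acknowledged even while that processor spins on a lock after splhigh().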
  • 57. Dynamic memory allocation without using virtual address • To support N(>2) processors, the cpu_info structure and the bootstrap kernel stack for secondary processors should be allocated dynamically • But we can’t use virtual addresses for them • The stack may be used before TLB initialization, which would cause a processor fault • MIPS has a software-managed TLB, which is maintained by the TLB miss handler; this handler refers to cpu_info, so a virtual address there causes a TLB miss loop • To avoid these problems, we implemented a wrapper function which allocates memory dynamically, then gets the physical address and returns it
  • 58. The wrapper function
        vaddr_t
        smp_malloc(size_t size)
        ...
                if (size < PAGE_SIZE) {
                        va = (vaddr_t)malloc(size, M_DEVBUF, M_NOWAIT);
                        if (va == NULL)
                                return NULL;
                        error = pmap_extract(pmap_kernel(), va, &pa);
                        if (error == FALSE)
                                return NULL;
                } else {
                        TAILQ_INIT(&mlist);
                        error = uvm_pglistalloc(size, 0, -1L, 0, 0,
                            &mlist, 1, UVM_PLA_NOWAIT);
                        if (error)
                                return NULL;
                        m = TAILQ_FIRST(&mlist);
                        pa = VM_PAGE_TO_PHYS(m);
                }
                return PHYS_TO_XKPHYS(pa, CCA_CACHED);
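The final conversion step relies on the MIPS64 XKPHYS segment, a window onto physical memory that is not translated through the TLB. An XKPHYS address has its top two bits set to binary 10, a 3-bit cache coherency attribute (CCA) in bits 61:59, and the physical address in the low bits. The sketch below shows that arithmetic only; the function names are hypothetical, it is not the kernel's PHYS_TO_XKPHYS macro, and the CCA value used in the usage note (3) is an example, since the slides do not give the numeric value of CCA_CACHED.

```c
#include <stdint.h>

#define XKPHYS_BASE_SKETCH 0x8000000000000000ULL  /* address bits 63:62 = 10 */
#define XKPHYS_CCA_SHIFT   59                     /* CCA lives in bits 61:59 */

/* Build an unmapped XKPHYS address from a physical address and a CCA. */
uint64_t phys_to_xkphys_sketch(uint64_t pa, unsigned int cca)
{
    return XKPHYS_BASE_SKETCH |
        ((uint64_t)(cca & 0x7) << XKPHYS_CCA_SHIFT) | pa;
}

/* Strip the segment and CCA bits again to recover the physical address. */
uint64_t xkphys_to_phys_sketch(uint64_t va)
{
    return va & ((1ULL << XKPHYS_CCA_SHIFT) - 1);
}
```

For example, physical address 0x1000 with CCA 3 becomes 0x9800000000001000. An address of this form can be dereferenced without any TLB entry, which is what the cpu_info structure and the bootstrap stack require.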
  • 59. Reduce frequency of TLB shootdown • There’s a condition under which we can skip the TLB shootdown on invalidate/update:
                      using the pagetable   not using
        kernel mode   need                  need
        user mode     need                  don’t need
      • In user mode, if the shootee processor isn’t using the pagetable, we don’t need a shootdown; just changing the ASID assignment is enough • In the reference pmap implementation, a TLB shootdown is performed unconditionally, even when it isn’t really needed • We added the condition to reduce its frequency
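The table on slide 59 amounts to a two-input decision made for each remote processor. A minimal sketch of that decision, with hypothetical names (`shootdown_action()`, `enum shoot_action`) that are not taken from the pmap code:

```c
#include <stdbool.h>

/* What the initiating CPU should do about one remote ("shootee") CPU. */
enum shoot_action {
    SHOOT_SEND_IPI,    /* remote TLB may hold the entry: send a shootdown IPI */
    SHOOT_RESET_ASID   /* no IPI: make the pmap take a fresh ASID there later */
};

/*
 * kernel_pmap:            the mapping belongs to the kernel page table
 * target_using_pagetable: the remote CPU is currently running on this pmap
 */
enum shoot_action
shootdown_action(bool kernel_pmap, bool target_using_pagetable)
{
    if (kernel_pmap || target_using_pagetable)
        return SHOOT_SEND_IPI;
    return SHOOT_RESET_ASID;
}
```

Only the user-mode, not-in-use case avoids the IPI. There, dropping the ASID assignment means the stale entry can no longer match once the pmap runs on that processor again.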
  • 60.
        void
        pmap_invalidate_page(pmap_t pmap, vm_offset_t va)
        ...
                arg.pmap = pmap;
                arg.va = va;
                smp_rendezvous(0, pmap_invalidate_page_action, 0, (void *)&arg);
        void
        pmap_invalidate_page_action(void *arg)
        ...
                pmap_t pmap = ((struct pmap_invalidate_page_arg *)arg)->pmap;
                vm_offset_t va = ((struct pmap_invalidate_page_arg *)arg)->va;
                if (is_kernel_pmap(pmap)) {
                        pmap_TLB_invalidate_kernel(va);
                        return;
                }
                if (pmap->pm_asid[PCPU_GET(cpuid)].gen != PCPU_GET(asid_generation))
                        return;
                else if (!(pmap->pm_active & PCPU_GET(cpumask))) {
                        pmap->pm_asid[PCPU_GET(cpuid)].gen = 0;
                        return;
                }
                va = pmap_va_asid(pmap, (va & ~PGOFSET));
                mips_TBIS(va);
  • 61.
        CPU_INFO_FOREACH(cii, ci)
                if (cpuset_isset(&cpus_running, ci)) {
                        unsigned int i = ci->ci_cpuid;
                        unsigned int m = 1 << i;
                        if (pmap->pm_asid[i].pma_asidgen !=
                            pmap_asid_info[i].pma_asidgen)
                                continue;
                        else if (ci->ci_curpmap != pmap) {
                                pmap->pm_asid[i].pma_asidgen = 0;
                                continue;
                        }
                        cpumask |= m;
                }
        if (cpumask == 1 << cpuid) {
                u_long asid;
                asid = pmap->pm_asid[cpuid].pma_asid << VMTLB_PID_SHIFT;
                tlb_flush_addr(va | asid);
        } else if (cpumask) {
                struct pmap_invalidate_page_arg arg;
                arg.pmap = pmap;
                arg.va = va;
                smp_rendezvous_cpus(cpumask, pmap_invalidate_user_page_action, &arg);
        }
  • 62. Future work • Commit machine ddbcpu<#> • New port for Cavium OCTEON, if it’s acceptable to the OpenBSD project • Maybe SMP support for SGI Origin 350 • Also interested in the MI part of SMP

Editor's notes