SlideShare une entreprise Scribd logo
1  sur  65
Télécharger pour lire hors ligne
* Based on kernel 5.11 (x86_64) – QEMU
* 2-socket CPUs (2 cores/socket)
* 16GB memory
* Kernel parameter: nokaslr norandmaps
* KASAN: disabled
* Userspace: ASLR is disabled
* Legacy BIOS
Linux Synchronization Mechanism: Semaphore &
Mutex
Adrian Huang | Feb, 2023
Agenda
• Semaphore
✓producer-consumer problem
✓Implementation in Linux kernel
• Mutex (introduced in v2.6.16)
✓Enforce serialization on shared memory systems
✓Implementation in Linux kernel
✓Mutex lock
➢Fast path, midpath, slow path
✓Mutex unlock
➢Fast path and slow path
➢Mutex ownership (with a lab)
◼ Re-visit this concept: Only the lock owner has the permission to unlock the mutex
✓Q & A
Semaphore: producer-consumer problem
task 0
Semaphore
wakeup/signal
#0
#1
#2
#3
Share resource
counter = 4
wait_list
wait/sleep
P, down()
V, up()
task 1
task N
.
.
task task task
• Sleeping lock
• Used in process context *ONLY*
• Cannot hold a spin lock while acquiring a semaphore
• Mainly use in producer-consumer scenario
• The lock holder does not require to unlock the lock. (non-ownership concept)
✓ Something like notification
down
__down
raw_spin_lock_irqsave
sem->count > 0
semaphore
lock
count
wait_list semaphore_waiter
list
task
struct raw_spinlock or raw_spinlock_t
raw_lock
sem->count--
raw_spin_unlock_irqrestore
up
Semaphore Implementation in Linux Kernel
__down_common
Y
N
Semaphore Implementation in Linux Kernel
Semaphore Implementation in Linux Kernel
[Only for interruptible and wakekill task] Check if the sleeping task gets a signal
Semaphore Implementation in Linux Kernel
1 Protect sem->count data
2
Reschedule: Need to unlock spinlock
down
__down
raw_spin_lock_irqsave
sem->count > 0
sem->count--
raw_spin_unlock_irqrestore
__down_common
Y
N
Semaphore Implementation in Linux Kernel
Protect sem->count data
up
__up
raw_spin_lock_irqsave
wait_list empty?
sem->count++
raw_spin_unlock_irqrestore
Y
N
semaphore
lock
count
wait_list semaphore_waiter
list
task
struct raw_spinlock or raw_spinlock_t
raw_lock
up
Agenda
• Semaphore
✓producer-consumer problem
✓Implementation in Linux kernel
• Mutex (introduced in v2.6.16)
✓Enforce serialization on shared memory systems
✓Implementation in Linux kernel
✓Call path
➢Fast path, midpath, slow path
Mutex: Enforce serialization on shared memory systems
task 0
mutex_unlock()
mutex_lock()
task 1
task N
.
.
task task task
mutex
owner
wait_lock
osq
wait_list
Critical Section
task
rlock
struct spinlock or spinlock_t
atomic_t tail;
optimistic_spin_queue
Lock owner
Protect accessing members
of mutex struct
[midpath] spinning: busy-waiting
[slow path] waiting tasks: sleep
(non-busy waiting)
Mutex Implementation in Linux
• Mutex implementation paths
✓Fastpath: Uncontended case by using cmpxchg(): CAS (Compare and Swap)
✓Midpath (optimistic spinning) - The priority of the lock owner is the highest one
➢Spin for mutex lock acquisition when the lock owner is running.
➢The lock owner is likely to release the lock soon.
➢Leverage cancelable MCS lock (OSQ - Optimistic Spin Queue: MCS-like lock): v3.15
✓Slowpath: The task is added to the waiting queue and sleeps until woken up by the
unlock path
• Mutex is a hybrid type (spinning & sleeping): Busy-waiting for a few cycles
instead of immediately sleeping
• Ownership: Only the lock owner can release the lock
• kernel/locking/{mutex.c, osq_lock.c}
• Reference: Generic Mutex Subsystem
mutex_lock
__mutex_trylock_fast
atomic_long_try_cmpxchg_acquire
__mutex_lock_slowpath
Fail
__mutex_lock
__mutex_lock_common
preempt_disable
__mutex_trylock() ||
mutex_optimistic_spin()
preempt_enable
return
add the task to wait_list
Success: fast path
Yes: midpath
No: slow path
mutex_lock(): Call path
mutex_lock
__mutex_trylock_fast
atomic_long_try_cmpxchg_acquire
__mutex_lock_slowpath
Fail
__mutex_lock
__mutex_lock_common
preempt_disable
__mutex_trylock() ||
mutex_optimistic_spin()
preempt_enable
return
add the task to wait_list
Success: fast path
Yes: midpath
No: slow path
mutex_lock(): Fast path
__mutex_trylock
__mutex_trylock_or_owner
mutex_can_spin_on_owner
The lock might be unlocked by another core
mutex_optimistic_spin
osq_lock
__mutex_trylock_or_owner
mutex_spin_on_owner
cpu_relax
osq_unlock
for (;;)
break if getting the lock
There’s an owner: break if the lock
is released or owner goes to sleep
Return true if the following conditions are met
• The spinning task is not preempted: need_resched()
• The lock owner:
✓ Not preempted : checked by vcpu_is_preempted()
✓ Not sleep: checked by owner->on_cpu
• Spinner is spinning on the current lock owner!
mutex_spin_on_owner() returns true → keep looping for
acquiring the lock
• Lock release: one of spinning tasks can get the lock
mutex_spin_on_owner() returns false → break ‘for’ loop
• The spinning task is preempted
• The lock owner is preempted
• The lock owner sleeps
mutex_lock(): midpath
mutex_can_spin_on_owner
mutex_spin_on_owner
__mutex_trylock
__mutex_trylock_or_owner
mutex_can_spin_on_owner
The lock might be unlocked by another core
mutex_optimistic_spin
osq_lock
__mutex_trylock_or_owner
mutex_spin_on_owner
cpu_relax
osq_unlock
for (;;)
break if getting the lock
There’s an owner: break if the lock
is released or owner goes to sleep
mutex_lock(): midpath
Second or later osq_lock() is spinned in this function.
First osq_lock() gets osq lock and spins in this loop.
Notify other osq spinners to get an osq lock.
owner
owner
mutex_unlock()
mutex_lock()
Critical Section
core 0
mutex_unlock()
mutex_lock()
Critical Section
core 1 core 2 core 3
spinning
mutex_unlock()
mutex_lock()
Critical Section
spinning
mutex_unlock()
mutex_lock()
Critical Section
spinning
midpath: [Case #1: ideal] without preemption or sleep (both lock
owner and spinner)
owner
owner
One of spinning tasks can get the lock after the owner releases the lock:
Spinning tasks do not need to be moved to wait list
mutex_unlock()
mutex_lock()
Critical Section
core 0
mutex_unlock()
mutex_lock()
Critical Section
core 1 core 2 core 3
spinning
mutex_unlock()
mutex_lock()
Critical Section
spinning
mutex_unlock()
mutex_lock()
Critical Section
spinning
midpath – [Case #1: ideal] lock release without preemption or
sleep
When to exit the spinning?
1. The lock owner releases the lock
2. The lock owner goes to sleep or is preempted: spinning tasks go to slow path
✓ Check task->on_cpu
✓ Functions: prepare_task(), finish_task()…
3. The spinning task is preempted: the spinning task goes to slow path
✓ need_resched()
owner
owner
owner
owner
Three cases for “cannot spin on
mutex owner”
• The lock owner is preempted
• The spinning task is preempted
• The lock owner sleeps
owner
owner
mutex_unlock()
mutex_lock()
Critical Section
core 0
mutex_unlock()
mutex_lock()
Critical Section
core 1 core 2 core 3
spinning (midpath)
mutex_unlock()
mutex_lock()
Critical Section
spinning (midpath)
mutex_unlock()
mutex_lock()
Critical Section
non-busy wait
(slow path: move
this task to wait
list)
midpath: [Case #2] Mutex lock owner is preempted
owner
owner
Critical Section
preempt
reschedule
wakeup
non-busy wait
(slow path: move
this task to wait
list)
1
2
3
6 4
5
7
8
mutex_unlock()
mutex_lock()
Critical Section
core 0
mutex_unlock()
mutex_lock()
core 1 core 2
spinning (midpath)
mutex_unlock()
mutex_lock()
Critical Section
spinning (midpath)
midpath: [Case #3] Spinner (osq lock owner) is preempted
owner
preempt
1
non-busy wait
(slow path)
Reschedule back
4
3 owner
Critical Section
__mutex_unlock_slowpath ->
wake_up_q
5
core 3
mutex_unlock()
mutex_lock()
Critical Section
spinning (midpath)
6 owner
Reschedule:
schedule_preempt_disabled()
2
Three cases for “cannot spin on
mutex owner”
• The lock owner is preempted
• The spinning task is preempted
• The lock owner sleeps
mutex_unlock()
mutex_lock()
Critical Section
core 0
mutex_unlock()
mutex_lock()
core 1 core 2
spinning (midpath)
mutex_unlock()
mutex_lock()
Critical Section
spinning (midpath)
midpath: [Case #3] Spinner (osq lock owner) is preempted
owner
preempt
1
Reschedule:
schedule_preempt_disabled()
2
non-busy wait
(slow path)
Reschedule back
4
3 owner
Critical Section
__mutex_unlock_slowpath ->
wake_up_q
5
core 3
mutex_unlock()
mutex_lock()
Critical Section
spinning (midpath)
6 owner
Who sets TIF_NEED_RESCHED? → set_tsk_need_resched()
1. Call path
✓ timer_interrupt → tick_handle_periodic → tick_periodic →
update_process_times → scheduler_tick → curr->sched_class-
>task_tick → task_tick_fair → entity_tick -> check_preempt_tick
-> resched_curr -> set_tsk_need_resched
✓ HW interrupt (not timer HW) → wake up a higher priority task
2. Users:
✓ check_preempt_tick(), check_preempt_wakeup(),
wake_up_process()….and so on.
Who sets TIF_NEED_RESCHED? full call path
Who sets TIF_NEED_RESCHED?
Who sets TIF_NEED_RESCHED? → set_tsk_need_resched()
1. Call path
✓ timer_interrupt → tick_handle_periodic → tick_periodic
→ update_process_times → scheduler_tick → curr-
>sched_class->task_tick → task_tick_fair → entity_tick ->
check_preempt_tick -> resched_curr ->
set_tsk_need_resched
✓ HW interrupt (not timer HW) → wake up a higher priority
task
2. Users:
✓ check_preempt_tick(), check_preempt_wakeup(),
wake_up_process()….and so on.
Set TIF_NEED_RESCHED: current task will be rescheduled later
PREEMPT_NEED_RESCHED bit = 0 → Need to reschedule (check comments in this header)
Who sets TIF_NEED_RESCHED?
• Set TIF_NEED_RESCHED flag if the delta is greater than
ideal_runtime
✓ The running task will be scheduled out.
Who sets TIF_NEED_RESCHED?
Who sets TIF_NEED_RESCHED? → set_tsk_need_resched()
1. Call path
✓ timer_interrupt → tick_handle_periodic → tick_periodic →
update_process_times → scheduler_tick → curr->sched_class-
>task_tick → task_tick_fair → entity_tick -> check_preempt_tick
-> resched_curr -> set_tsk_need_resched
✓ HW interrupt (not timer HW) → wake up a higher priority task
2. Users:
✓ check_preempt_tick(), check_preempt_wakeup(),
wake_up_process()….and so on.
Who sets TIF_NEED_RESCHED?
Three cases for “cannot spin on
mutex owner”
• The lock owner is preempted
• The spinning task is preempted
• The lock owner sleeps
[Case #4] Locker owner sleeps (reschedule): A test kernel module
The action of sleep is identical to preemption and “wait for IO”: reschedule
Create 4 kernel threads
Source code (github): test-modules/mutex/mutex.c
mutex_unlock()
mutex_lock()
core 0 core 1 core 2
owner
1
mutex_optimistic_spin() ->
mutex_can_spin_on_owner() returns fail
Owner
6
core 3
[Case #4] Locker owner sleeps (reschedule): other tasks cannot spin
kthread_0 kthread_1 kthread_2 kthread_3
Critical Section
msleep(): reschedule
task->cpu_on = 1 0
mutex_unlock()
mutex_lock()
Critical Section
msleep(): reschedule
task->cpu_on = 1 0
non-busy wait
(slow path): lock owner’s
task->cpu_on = 0
mutex_unlock()
mutex_lock()
Critical Section
msleep(): reschedule
task->cpu_on = 1 0
non-busy wait
(slow path): lock owner’s
task->cpu_on = 0
mutex_unlock()
mutex_lock()
Critical Section
msleep(): reschedule
task->cpu_on = 1 0
non-busy wait
(slow path): lock owner’s
task->cpu_on = 0
2
3
4
mutex_optimistic_spin() ->
mutex_can_spin_on_owner() returns fail
5
mutex_optimistic_spin() ->
mutex_can_spin_on_owner()
returns fail
Owner
7
Owner
8
__mutex_trylock() ||
mutex_optimistic_spin()
preempt_enable
return
add the task to wait_list
Yes: midpath
No: slow path
Call path
[Case #4] Locker owner sleeps (reschedule): gdb
watchpoint: task->on_cpu → who changes this?
[Case #4] Locker owner sleeps (reschedule): Who changes task->on_cpu?
task->on_cpu is set 0 during context switch
mutex_unlock()
mutex_lock()
core 0 core 1
owner
1
mutex_optimistic_spin() ->
mutex_can_spin_on_owner() returns fail
Owner
6
[Case #4] Locker owner sleeps (reschedule): gdb: other tasks cannot spin
kthread_0 kthread_1
Critical Section
msleep(): reschedule
task->cpu_on = 1 0
mutex_unlock()
mutex_lock()
Critical Section
msleep(): reschedule
task->cpu_on = 1 0
non-busy wait
(slow path): lock owner’s
task->cpu_on = 0
2
3
__mutex_trylock() ||
mutex_optimistic_spin()
preempt_enable
return
add the task to wait_list
Yes: midpath
No: slow path
Call path
mutex_unlock()
mutex_lock()
core 0 core 1
owner
1
mutex_optimistic_spin() ->
mutex_can_spin_on_owner() returns fail
Owner
6
[Case #4] Locker owner sleeps (reschedule): gdb: other tasks cannot spin
kthread_0 kthread_1
Critical Section
msleep(): reschedule
task->cpu_on = 1 0
mutex_unlock()
mutex_lock()
Critical Section
msleep(): reschedule
task->cpu_on = 1 0
non-busy wait
(slow path): lock owner’s
task->cpu_on = 0
2
3
__mutex_trylock() ||
mutex_optimistic_spin()
preempt_enable
return
add the task to wait_list
Yes: midpath
No: slow path
Call path
retval = 0 → cannot spin this owner
owner->on_cpu = 0
mutex_unlock()
mutex_unlock
__mutex_unlock_fast
mutex_unlock(): Call path
return
mutex
owner
wait_lock
osq
wait_list
task
• task_struct pointers aligns to at least L1_CACHE_BYTES
• 3 LSB bits are used for non-empty waiter list
✓ W (MUTEX_FLAG_WAITERS)
◼ Non-empty waiter list. Issue a wakeup when unlocking
✓ H (MUTEX_FLAG_HANDOFF)
◼ Unlock needs to hand the lock to the top-waiter
◼ Use by ww_mutex because ww_mutex’s waiter list is not FIFO order.
✓ P (MUTEX_FLAG_PICKUP)
◼ Handoff has been done and we're waiting for pickup
◼ Use by ww_mutex because ww_mutex’s waiter list is not FIFO order.
locker->owner = 0
__mutex_unlock_slowpath
Have waiters: One of 3-bit LSB
of lock->owner is not cleared.
spin_lock(&lock->wait_lock)
spin_unlock(&lock->wait_lock)
Get a waiter from lock->wait_list
wake_up_q
wake_up_process
owner
task virtual addr W
P H
0
1
2
63
lock->owner
__mutex_handoff Called if MUTEX_FLAG_HANDOFF is set
atomic_long_cmpxchg_release(
&lock->owner, owner,
__owner_flags(owner))
The woken task will update lock->owner
Set 3-bit LSB of lock->owner → Clear the
original task struct address
* ww_mutex (Wound/Wait Mutex): Deadlock-proof mutex
[Unlock task = lock->owner]
No waiter: 3-bit LSB of lock->owner are cleared
mutex_unlock
__mutex_unlock_fast
mutex_unlock(): fast path
return
locker->owner = 0
[fast path] A spinner will take the lock
[Unlock task = lock->owner]
No waiter: 3-bit LSB of lock->owner are cleared
mutex_unlock
__mutex_unlock_fast
mutex_unlock(): slow path
return
locker->owner = 0
__mutex_unlock_slowpath
Have waiters: One of 3-bit LSB
of lock->owner is not cleared.
spin_lock(&lock->wait_lock)
spin_unlock(&lock->wait_lock)
Get a waiter from lock->wait_list
wake_up_q
wake_up_process
__mutex_handoff Called if MUTEX_FLAG_HANDOFF is set
owner
task virtual addr 1
0 0
0
1
2
63
atomic_long_cmpxchg_release(
&lock->owner, owner,
__owner_flags(owner)
owner
0 1
0 0
0
1
2
63
The woken task will update lock->owner
mutex
owner
wait_lock
osq
wait_list
task
• task_struct pointers aligns to at least L1_CACHE_BYTES
• 3 LSB bits are used for non-empty waiter list
✓ W (MUTEX_FLAG_WAITERS)
◼ Non-empty waiter list. Issue a wakeup when unlocking
✓ H (MUTEX_FLAG_HANDOFF)
◼ Unlock needs to hand the lock to the top-waiter
◼ Use by ww_mutex because ww_mutex’s waiter list is not FIFO order.
✓ P (MUTEX_FLAG_PICKUP)
◼ Handoff has been done and we're waiting for pickup
◼ Use by ww_mutex because ww_mutex’s waiter list is not FIFO order.
owner
task virtual addr W
P H
0
1
2
63
lock>-owner
* ww_mutex (Wound/Wait Mutex): Deadlock-proof mutex
[Unlock task = lock->owner]
No waiter: 3-bit LSB of lock->owner are cleared
mutex_unlock
__mutex_unlock_fast
[Unlock task = lock->owner]
No waiter: 3-bit LSB of lock->owner are cleared
return
locker->owner = 0
__mutex_unlock_slowpath
Have waiters: One of 3-bit LSB
of lock-owner is not cleared.
spin_lock(&lock->wait_lock)
spin_unlock(&lock->wait_lock)
Get a waiter from lock->wait_list
wake_up_q
wake_up_process
__mutex_handoff Called if MUTEX_FLAG_HANDOFF is set
owner
task virtual addr 1
0 0
0
1
2
63
atomic_long_cmpxchg_release(
&lock->owner, owner,
__owner_flags(owner)
owner
0 1
0 0
0
1
2
63
The woken task will update lock->owner
mutex_unlock(): slow path
Resume here: kthread_0 wakes up kthread_1
Woken task
1 Context switch
2
Update lock->owner
gdb watchpoint: lock->owner
Update lock->owner
* Bit 0 is still set (MUTEX_FLAG_WAITERS): The upcoming
mutex_unlock() will wake up the waiter instead of spinner.
* Bit 1 (MUTEX_FLAG_HANDOFF) is cleared from
__mutex_trylock->__mutex_trylock_or_owner.
Update lock->owner
When/who clears 3-bit LSB of lock->owner?
Woken task: When/who to clear 3-bit LSB of lock-owner?
Clear 3-bit LSB of lock->owner if no waiters
mutex_unlock
__mutex_unlock_fast
mutex_unlock(): Mutex ownership
return
locker->owner = 0
__mutex_unlock_slowpath
Have waiters: One of 3-bit LSB
of lock->owner is not cleared.
spin_lock(&lock->wait_lock)
spin_unlock(&lock->wait_lock)
Get a waiter from lock->wait_list
wake_up_q
wake_up_process
__mutex_handoff Called if MUTEX_FLAG_HANDOFF is set
atomic_long_cmpxchg_release(
&lock->owner, owner,
__owner_flags(owner))
The woken task will update lock->owner
Set 3-bit LSB of lock->owner → Clear the
original task struct address
[Unlock task = lock->owner]
No waiter: 3-bit LSB of lock->owner are cleared
• [Fastpath] Check ownership of a mutex
• [Slowpath] Does not check ownership of a mutex
mutex_unlock(): [lab] Behavior observation when lock owner !=
unlocker’s task (Mutex ownership)
mutex_unlock()
mutex_lock()
core 0 core 1
kthread_0 kthread_1
Critical Section
sleep 3 seconds mutex_unlock()
sleep 1 second
lock_thread unlock_thread
Source code: test-modules/mutex-unlock-by-another-task/mutex.c
This scenario is created on purpose for
demonstration. It won’t happen in real case.
Note
mutex_unlock()
mutex_lock()
core 0 core 1
kthread_0 kthread_1
Critical Section
sleep 3 seconds mutex_unlock()
sleep 1 second
lock_thread unlock_thread
Can another task
unlock the mutex?
mutex_unlock(): [lab] Behavior observation when lock owner !=
unlocker’s task (Mutex ownership)
This scenario is created on purpose for
demonstration. It won’t happen in real case.
Note
__mutex_unlock_slowpath
spin_lock(&lock->wait_lock)
spin_unlock(&lock->wait_lock)
Get a waiter from lock->wait_list
wake_up_q
wake_up_process
__mutex_handoff
atomic_long_cmpxchg_release(
&lock->owner, owner,
__owner_flags(owner))
mutex_unlock(): slow path does not check unlocker’s ownership
[DEBUG_MUTEXES] Print a warning message if unlocker’s task != lock owner’s task
breakpoint
breakpoint
1
1
Lock owner
2
Unlocker’s task != lock owner
3
mutex_unlock(): [lab] Behavior when lock owner != unlocker’s task
mutex_unlock()
mutex_lock()
core 0 core 1
kthread_0 kthread_1
Critical Section
sleep 3 seconds mutex_unlock()
sleep 1 second
lock_thread unlock_thread
We’re here
breakpoint
1
breakpoint
1
lock owner = old
2
lock->owner is set 0 by atomic_long_cmpxchg_release()
3
mutex_unlock(): [lab] Behavior if lock owner != unlocker’s task
mutex_unlock()
mutex_lock()
core 0 core 1
kthread_0 kthread_1
Critical Section
sleep 3 seconds mutex_unlock()
sleep 1 second
lock_thread unlock_thread
We’re here
mutex_unlock(): [lab] Behavior observation when lock owner !=
unlocker’s task (Mutex ownership)
mutex_unlock()
mutex_lock()
core 0 core 1
kthread_0 kthread_1
Critical Section
sleep 3 seconds
mutex_unlock()
sleep 1 second
lock_thread unlock_thread
1
2
3
This unlocks the
locked mutex
mutex
owner = 0
wait_lock = 0
osq = 0
wait_list
Breakpoint stops at second mutex_unlock() in kthread_0
mutex_unlock(): [lab] Behavior observation when lock owner !=
unlocker’s task (Mutex ownership)
mutex_unlock()
mutex_lock()
core 0 core 1
kthread_0 kthread_1
Critical Section
sleep 3 seconds mutex_unlock()
sleep 1 second
lock_thread unlock_thread
1
2
3
This unlocks the locked mutex
What’s the behavior?
breakpoint
1
mutex_unlock(): [lab] Behavior when lock owner != unlocker’s task
mutex_unlock()
mutex_lock()
core 0 core 1
kthread_0 kthread_1
Critical Section
sleep 3 seconds mutex_unlock()
sleep 1 second
lock_thread unlock_thread
breakpoint
1
No lock owner
2
We’re here
breakpoint
1
mutex_unlock(): [lab] Behavior if lock owner != unlocker’s task
mutex_unlock()
mutex_lock()
core 0 core 1
kthread_0 kthread_1
Critical Section
sleep 3 seconds mutex_unlock()
sleep 1 second
lock_thread unlock_thread
breakpoint
1
lock owner = old
2
lock->owner is set 0 by atomic_long_cmpxchg_release()
3
4
We’re here
mutex_unlock(): [lab] Behavior observation when lock owner !=
unlocker’s task (Mutex ownership)
mutex_unlock()
mutex_lock()
core 0 core 1
kthread_0 kthread_1
Critical Section
sleep 3 seconds mutex_unlock()
sleep 1 second
lock_thread unlock_thread
Takeaways
1. [Fastpath] Linux kernel checks mutex’s ownership
2. [Slowpath] Linux kernel does not check mutex’s ownership when unlocking a mutex
✓ Developers must take care of mutex_lock/mutex_unlock pair
✓ Slowpath prints a warning message if mutex debug option is enabled.
✓ Different from the concept: Only the lock owner has the permission to unlock the mutex
mutex_unlock(): [lab] Behavior observation when lock owner !=
unlocker’s task (Mutex ownership)
Why doesn’t slowpath check ownership?
1. Ownership checking is only for developers and not enforced by Linux kernel.
✓ Developers need to take care of it.
Quotes
1. From Generic Mutex Subsystem
✓ Mutex Semantics
• Only one task can hold the mutex at a time.
• Only the owner can unlock the mutex.
• …
✓ These semantics are fully enforced when CONFIG_DEBUG_MUTEXES is enabled
Think about…
mutex_unlock()
mutex_lock()
core 0 core 1
Critical Section
sleep 3 seconds
mutex_unlock()
sleep 1 second
mutex_unlock()
mutex_lock()
core 2
Critical Section
1 Will it acquire the mutex lock
successfully?
2
Will this unlock the mutex
lock acquired by core 2?
Q&A #1: Semaphore can be used to synchronize with user-space
Consumer
buffer
Producer
up()
down()
[Semaphore] Producer/Consumer Concept
* Screenshot captured from: Chapter 10, Linux Kernel Development, 3rd Edition, Robert Love
• Consumer waits if buffer is empty
• Producer waits if buffer is full
• Only one process can manipulate the buffer at a time
(mutual exclusion)
Principle
Q&A #1: Semaphore can be used to synchronize with user-space
Consumer
buffer
Producer
up(), V
down(), P
[Semaphore] Producer/Consumer Concept
Code reference: CS 537 Notes, Section #6: Semaphores and Producer/Consumer Problem
Q&A #1: Semaphore can be used to synchronize with user-space
Consumer
buffer
Producer
up(), V
down(), P
[Semaphore] Producer/Consumer Concept
Data structure synchronization
Notify consumer
Wait if buffer is full Wait if buffer is empty
Notify producer
Code reference: CS 537 Notes, Section #6: Semaphores and Producer/Consumer Problem
Q&A #1: Semaphore can be used to synchronize with user-space
Consumer
buffer
Producer
up(), V
down(), P
[Semaphore] Producer/Consumer Concept
User Space
Kernel Space
. . .
buffer
Consumer Consumer
Producer
system
call
system
call
Possible Scenario
Note
1. up()/down() invocations are done in kernel.
Q&A #2: Mutex isn’t suitable for synchronizations
between kernel and user-space
* Screenshot captured from: Chapter 10, Linux Kernel Development, 3rd Edition, Robert Love
Explanation
1. [Mutex] ownership!
✓ Whoever locked a mutex must unlock it
Reference
• Generic Mutex Subsystem
• Wound/Wait Deadlock-Proof Mutex Design
• Mutexes and Semaphores Demystified
• MCS locks and qspinlocks
• Linux中的mutex机制[一] - 加锁和osq lock
Backup
__mutex_trylock
signal_pending_state
spin_unlock(&lock->wait_lock)
spin_lock(&lock->wait_lock)
for (;;)
goto ‘acquired’ lable if getting the lock
Interrupted by signal: break
mutex_lock(): slowpath
__mutex_add_waiter(lock, &waiter, &lock-
>wait_list)
waiter.task = current
set_current_state(state)
schedule_preempt_disabled
__mutex_set_flag(lock,
MUTEX_FLAG_HANDOFF)
__mutex_trylock() ||
(first && mutex_optimistic_spin())
spin_lock(&lock->wait_lock)
N Y: break
spin_lock(&lock->wait_lock)
acquired
__set_current_state(TASK_RUNNING)
mutex_remove_waiter
list_empty(&lock->wait_list)
__mutex_clear_flag(lock,
MUTEX_FLAGS)
spin_lock(&lock->wait_lock)
preempt_enable

Contenu connexe

Tendances

Physical Memory Models.pdf
Physical Memory Models.pdfPhysical Memory Models.pdf
Physical Memory Models.pdfAdrian Huang
 
Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Adrian Huang
 
Memory Management with Page Folios
Memory Management with Page FoliosMemory Management with Page Folios
Memory Management with Page FoliosAdrian Huang
 
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...Adrian Huang
 
Slab Allocator in Linux Kernel
Slab Allocator in Linux KernelSlab Allocator in Linux Kernel
Slab Allocator in Linux KernelAdrian Huang
 
Linux Kernel - Virtual File System
Linux Kernel - Virtual File SystemLinux Kernel - Virtual File System
Linux Kernel - Virtual File SystemAdrian Huang
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtAnne Nicolas
 
Memory Compaction in Linux Kernel.pdf
Memory Compaction in Linux Kernel.pdfMemory Compaction in Linux Kernel.pdf
Memory Compaction in Linux Kernel.pdfAdrian Huang
 
Linux Initialization Process (2)
Linux Initialization Process (2)Linux Initialization Process (2)
Linux Initialization Process (2)shimosawa
 
Linux MMAP & Ioremap introduction
Linux MMAP & Ioremap introductionLinux MMAP & Ioremap introduction
Linux MMAP & Ioremap introductionGene Chang
 
Linux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKBLinux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKBshimosawa
 
Linux kernel memory allocators
Linux kernel memory allocatorsLinux kernel memory allocators
Linux kernel memory allocatorsHao-Ran Liu
 
U-Boot Porting on New Hardware
U-Boot Porting on New HardwareU-Boot Porting on New Hardware
U-Boot Porting on New HardwareRuggedBoardGroup
 
How to use KASAN to debug memory corruption in OpenStack environment- (2)
How to use KASAN to debug memory corruption in OpenStack environment- (2)How to use KASAN to debug memory corruption in OpenStack environment- (2)
How to use KASAN to debug memory corruption in OpenStack environment- (2)Gavin Guo
 
Jagan Teki - U-boot from scratch
Jagan Teki - U-boot from scratchJagan Teki - U-boot from scratch
Jagan Teki - U-boot from scratchlinuxlab_conf
 
ACPI Debugging from Linux Kernel
ACPI Debugging from Linux KernelACPI Debugging from Linux Kernel
ACPI Debugging from Linux KernelSUSE Labs Taipei
 
Linux Initialization Process (1)
Linux Initialization Process (1)Linux Initialization Process (1)
Linux Initialization Process (1)shimosawa
 
Linux Memory Management
Linux Memory ManagementLinux Memory Management
Linux Memory ManagementNi Zo-Ma
 

Tendances (20)

Physical Memory Models.pdf
Physical Memory Models.pdfPhysical Memory Models.pdf
Physical Memory Models.pdf
 
Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...
 
Memory Management with Page Folios
Memory Management with Page FoliosMemory Management with Page Folios
Memory Management with Page Folios
 
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
qemu + gdb: The efficient way to understand/debug Linux kernel code/data stru...
 
Slab Allocator in Linux Kernel
Slab Allocator in Linux KernelSlab Allocator in Linux Kernel
Slab Allocator in Linux Kernel
 
Linux Kernel - Virtual File System
Linux Kernel - Virtual File SystemLinux Kernel - Virtual File System
Linux Kernel - Virtual File System
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
 
Memory Compaction in Linux Kernel.pdf
Memory Compaction in Linux Kernel.pdfMemory Compaction in Linux Kernel.pdf
Memory Compaction in Linux Kernel.pdf
 
Linux Initialization Process (2)
Linux Initialization Process (2)Linux Initialization Process (2)
Linux Initialization Process (2)
 
Linux MMAP & Ioremap introduction
Linux MMAP & Ioremap introductionLinux MMAP & Ioremap introduction
Linux MMAP & Ioremap introduction
 
Linux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKBLinux Kernel Booting Process (2) - For NLKB
Linux Kernel Booting Process (2) - For NLKB
 
Linux kernel memory allocators
Linux kernel memory allocatorsLinux kernel memory allocators
Linux kernel memory allocators
 
U-Boot Porting on New Hardware
U-Boot Porting on New HardwareU-Boot Porting on New Hardware
U-Boot Porting on New Hardware
 
How to use KASAN to debug memory corruption in OpenStack environment- (2)
How to use KASAN to debug memory corruption in OpenStack environment- (2)How to use KASAN to debug memory corruption in OpenStack environment- (2)
How to use KASAN to debug memory corruption in OpenStack environment- (2)
 
Jagan Teki - U-boot from scratch
Jagan Teki - U-boot from scratchJagan Teki - U-boot from scratch
Jagan Teki - U-boot from scratch
 
ACPI Debugging from Linux Kernel
ACPI Debugging from Linux KernelACPI Debugging from Linux Kernel
ACPI Debugging from Linux Kernel
 
Linux Initialization Process (1)
Linux Initialization Process (1)Linux Initialization Process (1)
Linux Initialization Process (1)
 
Linux Memory Management
Linux Memory ManagementLinux Memory Management
Linux Memory Management
 
Linux device drivers
Linux device driversLinux device drivers
Linux device drivers
 
Kernel crashdump
Kernel crashdumpKernel crashdump
Kernel crashdump
 

Similaire à semaphore & mutex.pdf

Dead Lock Analysis of spin_lock() in Linux Kernel (english)
Dead Lock Analysis of spin_lock() in Linux Kernel (english)Dead Lock Analysis of spin_lock() in Linux Kernel (english)
Dead Lock Analysis of spin_lock() in Linux Kernel (english)Sneeker Yeh
 
An Introduction to Locks in Go
An Introduction to Locks in GoAn Introduction to Locks in Go
An Introduction to Locks in GoYu-Shuan Hsieh
 
Synchronization problem with threads
Synchronization problem with threadsSynchronization problem with threads
Synchronization problem with threadsSyed Zaid Irshad
 
Training Slides: Basics 105: Backup, Recovery and Provisioning Within Tungste...
Training Slides: Basics 105: Backup, Recovery and Provisioning Within Tungste...Training Slides: Basics 105: Backup, Recovery and Provisioning Within Tungste...
Training Slides: Basics 105: Backup, Recovery and Provisioning Within Tungste...Continuent
 
Locks (Concurrency)
Locks (Concurrency)Locks (Concurrency)
Locks (Concurrency)Sri Prasanna
 
Linux synchronization tools
Linux synchronization toolsLinux synchronization tools
Linux synchronization toolsmukul bhardwaj
 
Much Ado About Blocking: Wait/Wakke in the Linux Kernel
Much Ado About Blocking: Wait/Wakke in the Linux KernelMuch Ado About Blocking: Wait/Wakke in the Linux Kernel
Much Ado About Blocking: Wait/Wakke in the Linux KernelDavidlohr Bueso
 
Training Slides: 203 - Backup & Recovery
Training Slides: 203 - Backup & RecoveryTraining Slides: 203 - Backup & Recovery
Training Slides: 203 - Backup & RecoveryContinuent
 
Synchronization linux
Synchronization linuxSynchronization linux
Synchronization linuxSusant Sahani
 
Locked down openSUSE Tumbleweed kernel
Locked down openSUSE Tumbleweed kernelLocked down openSUSE Tumbleweed kernel
Locked down openSUSE Tumbleweed kernelSUSE Labs Taipei
 
Userspace adaptive spinlocks with rseq
Userspace adaptive spinlocks with rseqUserspace adaptive spinlocks with rseq
Userspace adaptive spinlocks with rseqIgalia
 
Linux kernel development_ch9-10_20120410
Linux kernel development_ch9-10_20120410Linux kernel development_ch9-10_20120410
Linux kernel development_ch9-10_20120410huangachou
 
Linux kernel development chapter 10
Linux kernel development chapter 10Linux kernel development chapter 10
Linux kernel development chapter 10huangachou
 
AOS Lab 6: Scheduling
AOS Lab 6: SchedulingAOS Lab 6: Scheduling
AOS Lab 6: SchedulingZubair Nabi
 
Wait for your fortune without Blocking!
Wait for your fortune without Blocking!Wait for your fortune without Blocking!
Wait for your fortune without Blocking!Roman Elizarov
 
Java util concurrent
Java util concurrentJava util concurrent
Java util concurrentRoger Xia
 
AOS Lab 4: If you liked it, then you should have put a “lock” on it
AOS Lab 4: If you liked it, then you should have put a “lock” on itAOS Lab 4: If you liked it, then you should have put a “lock” on it
AOS Lab 4: If you liked it, then you should have put a “lock” on itZubair Nabi
 
Concurrency 2010
Concurrency 2010Concurrency 2010
Concurrency 2010敬倫 林
 
How to debug ocfs2 hang problem
How to debug ocfs2 hang problemHow to debug ocfs2 hang problem
How to debug ocfs2 hang problemGang He
 

Similaire à semaphore & mutex.pdf (20)

Dead Lock Analysis of spin_lock() in Linux Kernel (english)
Dead Lock Analysis of spin_lock() in Linux Kernel (english)Dead Lock Analysis of spin_lock() in Linux Kernel (english)
Dead Lock Analysis of spin_lock() in Linux Kernel (english)
 
An Introduction to Locks in Go
An Introduction to Locks in GoAn Introduction to Locks in Go
An Introduction to Locks in Go
 
Kernel
KernelKernel
Kernel
 
Synchronization problem with threads
Synchronization problem with threadsSynchronization problem with threads
Synchronization problem with threads
 
Training Slides: Basics 105: Backup, Recovery and Provisioning Within Tungste...
Training Slides: Basics 105: Backup, Recovery and Provisioning Within Tungste...Training Slides: Basics 105: Backup, Recovery and Provisioning Within Tungste...
Training Slides: Basics 105: Backup, Recovery and Provisioning Within Tungste...
 
Locks (Concurrency)
Locks (Concurrency)Locks (Concurrency)
Locks (Concurrency)
 
Linux synchronization tools
Linux synchronization toolsLinux synchronization tools
Linux synchronization tools
 
Much Ado About Blocking: Wait/Wakke in the Linux Kernel
Much Ado About Blocking: Wait/Wakke in the Linux KernelMuch Ado About Blocking: Wait/Wakke in the Linux Kernel
Much Ado About Blocking: Wait/Wakke in the Linux Kernel
 
Training Slides: 203 - Backup & Recovery
Training Slides: 203 - Backup & RecoveryTraining Slides: 203 - Backup & Recovery
Training Slides: 203 - Backup & Recovery
 
Synchronization linux
Synchronization linuxSynchronization linux
Synchronization linux
 
Locked down openSUSE Tumbleweed kernel
Locked down openSUSE Tumbleweed kernelLocked down openSUSE Tumbleweed kernel
Locked down openSUSE Tumbleweed kernel
 
Userspace adaptive spinlocks with rseq
Userspace adaptive spinlocks with rseqUserspace adaptive spinlocks with rseq
Userspace adaptive spinlocks with rseq
 
Linux kernel development_ch9-10_20120410
Linux kernel development_ch9-10_20120410Linux kernel development_ch9-10_20120410
Linux kernel development_ch9-10_20120410
 
Linux kernel development chapter 10
Linux kernel development chapter 10Linux kernel development chapter 10
Linux kernel development chapter 10
 
AOS Lab 6: Scheduling
AOS Lab 6: SchedulingAOS Lab 6: Scheduling
AOS Lab 6: Scheduling
 
Wait for your fortune without Blocking!
Wait for your fortune without Blocking!Wait for your fortune without Blocking!
Wait for your fortune without Blocking!
 
Java util concurrent
Java util concurrentJava util concurrent
Java util concurrent
 
AOS Lab 4: If you liked it, then you should have put a “lock” on it
AOS Lab 4: If you liked it, then you should have put a “lock” on itAOS Lab 4: If you liked it, then you should have put a “lock” on it
AOS Lab 4: If you liked it, then you should have put a “lock” on it
 
Concurrency 2010
Concurrency 2010Concurrency 2010
Concurrency 2010
 
How to debug ocfs2 hang problem
How to debug ocfs2 hang problemHow to debug ocfs2 hang problem
How to debug ocfs2 hang problem
 

Dernier

Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyFrank van der Linden
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 

Dernier (20)

Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The Ugly
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 

semaphore & mutex.pdf

  • 1. * Based on kernel 5.11 (x86_64) – QEMU * 2-socket CPUs (2 cores/socket) * 16GB memory * Kernel parameter: nokaslr norandmaps * KASAN: disabled * Userspace: ASLR is disabled * Legacy BIOS Linux Synchronization Mechanism: Semaphore & Mutex Adrian Huang | Feb, 2023
  • 2. Agenda • Semaphore ✓producer-consumer problem ✓Implementation in Linux kernel • Mutex (introduced in v2.6.16) ✓Enforce serialization on shared memory systems ✓Implementation in Linux kernel ✓Mutex lock ➢Fast path, midpath, slow path ✓Mutex unlock ➢Fast path and slow path ➢Mutex ownership (with a lab) ◼ Re-visit this concept: Only the lock owner has the permission to unlock the mutex ✓Q & A
  • 3. Semaphore: producer-consumer problem task 0 Semaphore wakeup/signal #0 #1 #2 #3 Share resource counter = 4 wait_list wait/sleep P, down() V, up() task 1 task N . . task task task • Sleeping lock • Used in process context *ONLY* • Cannot hold a spin lock while acquiring a semaphore • Mainly use in producer-consumer scenario • The lock holder does not require to unlock the lock. (non-ownership concept) ✓ Something like notification
  • 4. down __down raw_spin_lock_irqsave sem->count > 0 semaphore lock count wait_list semaphore_waiter list task struct raw_spinlock or raw_spinlock_t raw_lock sem->count-- raw_spin_unlock_irqrestore up Semaphore Implementation in Linux Kernel __down_common Y N
  • 6. Semaphore Implementation in Linux Kernel [Only for interruptible and wakekill task] Check if the sleeping task gets a signal
  • 7. Semaphore Implementation in Linux Kernel 1 Protect sem->count data 2 Reschedule: Need to unlock spinlock down __down raw_spin_lock_irqsave sem->count > 0 sem->count-- raw_spin_unlock_irqrestore __down_common Y N
  • 8. Semaphore Implementation in Linux Kernel Protect sem->count data up __up raw_spin_lock_irqsave wait_list empty? sem->count++ raw_spin_unlock_irqrestore Y N semaphore lock count wait_list semaphore_waiter list task struct raw_spinlock or raw_spinlock_t raw_lock up
  • 9. Agenda • Semaphore ✓producer-consumer problem ✓Implementation in Linux kernel • Mutex (introduced in v2.6.16) ✓Enforce serialization on shared memory systems ✓Implementation in Linux kernel ✓Call path ➢Fast path, midpath, slow path
  • 10. Mutex: Enforce serialization on shared memory systems task 0 mutex_unlock() mutex_lock() task 1 task N . . task task task mutex owner wait_lock osq wait_list Critical Section task rlock struct spinlock or spinlock_t atomic_t tail; optimistic_spin_queue Lock owner Protect accessing members of mutex struct [midpath] spinning: busy-waiting [slow path] waiting tasks: sleep (non-busy waiting)
  • 11. Mutex Implementation in Linux • Mutex implementation paths ✓Fastpath: Uncontended case by using cmpxchg(): CAS (Compare and Swap) ✓Midpath (optimistic spinning) - The priority of the lock owner is the highest one ➢Spin for mutex lock acquisition when the lock owner is running. ➢The lock owner is likely to release the lock soon. ➢Leverage cancelable MCS lock (OSQ - Optimistic Spin Queue: MCS-like lock): v3.15 ✓Slowpath: The task is added to the waiting queue and sleeps until woken up by the unlock path • Mutex is a hybrid type (spinning & sleeping): Busy-waiting for a few cycles instead of immediately sleeping • Ownership: Only the lock owner can release the lock • kernel/locking/{mutex.c, osq_lock.c} • Reference: Generic Mutex Subsystem
  • 14. __mutex_trylock __mutex_trylock_or_owner mutex_can_spin_on_owner The lock might be unlocked by another core mutex_optimistic_spin osq_lock __mutex_trylock_or_owner mutex_spin_on_owner cpu_relax osq_unlock for (;;) break if getting the lock There’s an owner: break if the lock is released or owner goes to sleep Return true if the following conditions are met • The spinning task is not preempted: need_resched() • The lock owner: ✓ Not preempted : checked by vcpu_is_preempted() ✓ Not sleep: checked by owner->on_cpu • Spinner is spinning on the current lock owner! mutex_spin_on_owner() returns true → keep looping for acquiring the lock • Lock release: one of spinning tasks can get the lock mutex_spin_on_owner() returns false → break ‘for’ loop • The spinning task is preempted • The lock owner is preempted • The lock owner sleeps mutex_lock(): midpath mutex_can_spin_on_owner mutex_spin_on_owner
  • 15. __mutex_trylock __mutex_trylock_or_owner mutex_can_spin_on_owner The lock might be unlocked by another core mutex_optimistic_spin osq_lock __mutex_trylock_or_owner mutex_spin_on_owner cpu_relax osq_unlock for (;;) break if getting the lock There’s an owner: break if the lock is released or owner goes to sleep mutex_lock(): midpath Second or later osq_lock() is spinned in this function. First osq_lock() gets osq lock and spins in this loop. Notify other osq spinners to get an osq lock.
  • 16. owner owner mutex_unlock() mutex_lock() Critical Section core 0 mutex_unlock() mutex_lock() Critical Section core 1 core 2 core 3 spinning mutex_unlock() mutex_lock() Critical Section spinning mutex_unlock() mutex_lock() Critical Section spinning midpath: [Case #1: ideal] without preemption or sleep (both lock owner and spinner) owner owner One of spinning tasks can get the lock after the owner releases the lock: Spinning tasks do not need to be moved to wait list
  • 17. mutex_unlock() mutex_lock() Critical Section core 0 mutex_unlock() mutex_lock() Critical Section core 1 core 2 core 3 spinning mutex_unlock() mutex_lock() Critical Section spinning mutex_unlock() mutex_lock() Critical Section spinning midpath – [Case #1: ideal] lock release without preemption or sleep When to exit the spinning? 1. The lock owner releases the lock 2. The lock owner goes to sleep or is preempted: spinning tasks go to slow path ✓ Check task->on_cpu ✓ Functions: prepare_task(), finish_task()… 3. The spinning task is preempted: the spinning task goes to slow path ✓ need_resched() owner owner owner owner
  • 18. Three cases for “cannot spin on mutex owner” • The lock owner is preempted • The spinning task is preempted • The lock owner sleeps
  • 19. owner owner mutex_unlock() mutex_lock() Critical Section core 0 mutex_unlock() mutex_lock() Critical Section core 1 core 2 core 3 spinning (midpath) mutex_unlock() mutex_lock() Critical Section spinning (midpath) mutex_unlock() mutex_lock() Critical Section non-busy wait (slow path: move this task to wait list) midpath: [Case #2] Mutex lock owner is preempted owner owner Critical Section preempt reschedule wakeup non-busy wait (slow path: move this task to wait list) 1 2 3 6 4 5 7 8
  • 20. mutex_unlock() mutex_lock() Critical Section core 0 mutex_unlock() mutex_lock() core 1 core 2 spinning (midpath) mutex_unlock() mutex_lock() Critical Section spinning (midpath) midpath: [Case #3] Spinner (osq lock owner) is preempted owner preempt 1 non-busy wait (slow path) Reschedule back 4 3 owner Critical Section __mutex_unlock_slowpath -> wake_up_q 5 core 3 mutex_unlock() mutex_lock() Critical Section spinning (midpath) 6 owner Reschedule: schedule_preempt_disabled() 2
  • 21. Three cases for “cannot spin on mutex owner” • The lock owner is preempted • The spinning task is preempted • The lock owner sleeps
  • 22. mutex_unlock() mutex_lock() Critical Section core 0 mutex_unlock() mutex_lock() core 1 core 2 spinning (midpath) mutex_unlock() mutex_lock() Critical Section spinning (midpath) midpath: [Case #3] Spinner (osq lock owner) is preempted owner preempt 1 Reschedule: schedule_preempt_disabled() 2 non-busy wait (slow path) Reschedule back 4 3 owner Critical Section __mutex_unlock_slowpath -> wake_up_q 5 core 3 mutex_unlock() mutex_lock() Critical Section spinning (midpath) 6 owner Who sets TIF_NEED_RESCHED? → set_tsk_need_resched() 1. Call path ✓ timer_interrupt → tick_handle_periodic → tick_periodic → update_process_times → scheduler_tick → curr->sched_class- >task_tick → task_tick_fair → entity_tick -> check_preempt_tick -> resched_curr -> set_tsk_need_resched ✓ HW interrupt (not timer HW) → wake up a higher priority task 2. Users: ✓ check_preempt_tick(), check_preempt_wakeup(), wake_up_process()….and so on.
  • 23. Who sets TIF_NEED_RESCHED? full call path
  • 24. Who sets TIF_NEED_RESCHED? Who sets TIF_NEED_RESCHED? → set_tsk_need_resched() 1. Call path ✓ timer_interrupt → tick_handle_periodic → tick_periodic → update_process_times → scheduler_tick → curr- >sched_class->task_tick → task_tick_fair → entity_tick -> check_preempt_tick -> resched_curr -> set_tsk_need_resched ✓ HW interrupt (not timer HW) → wake up a higher priority task 2. Users: ✓ check_preempt_tick(), check_preempt_wakeup(), wake_up_process()….and so on.
  • 25. Set TIF_NEED_RESCHED: current task will be rescheduled later PREEMPT_NEED_RESCHED bit = 0 → Need to reschedule (check comments in this header) Who sets TIF_NEED_RESCHED?
  • 26. • Set TIF_NEED_RESCHED flag if the delta is greater than ideal_runtime ✓ The running task will be scheduled out. Who sets TIF_NEED_RESCHED?
  • 27. Who sets TIF_NEED_RESCHED? → set_tsk_need_resched() 1. Call path ✓ timer_interrupt → tick_handle_periodic → tick_periodic → update_process_times → scheduler_tick → curr->sched_class- >task_tick → task_tick_fair → entity_tick -> check_preempt_tick -> resched_curr -> set_tsk_need_resched ✓ HW interrupt (not timer HW) → wake up a higher priority task 2. Users: ✓ check_preempt_tick(), check_preempt_wakeup(), wake_up_process()….and so on. Who sets TIF_NEED_RESCHED?
  • 28. Three cases for “cannot spin on mutex owner” • The lock owner is preempted • The spinning task is preempted • The lock owner sleeps
  • 29. [Case #4] Locker owner sleeps (reschedule): A test kernel module The action of sleep is identical to preemption and “wait for IO”: reschedule Create 4 kernel threads Source code (github): test-modules/mutex/mutex.c
  • 30. mutex_unlock() mutex_lock() core 0 core 1 core 2 owner 1 mutex_optimistic_spin() -> mutex_can_spin_on_owner() returns fail Owner 6 core 3 [Case #4] Locker owner sleeps (reschedule): other tasks cannot spin kthread_0 kthread_1 kthread_2 kthread_3 Critical Section msleep(): reschedule task->cpu_on = 1 0 mutex_unlock() mutex_lock() Critical Section msleep(): reschedule task->cpu_on = 1 0 non-busy wait (slow path): lock owner’s task->cpu_on = 0 mutex_unlock() mutex_lock() Critical Section msleep(): reschedule task->cpu_on = 1 0 non-busy wait (slow path): lock owner’s task->cpu_on = 0 mutex_unlock() mutex_lock() Critical Section msleep(): reschedule task->cpu_on = 1 0 non-busy wait (slow path): lock owner’s task->cpu_on = 0 2 3 4 mutex_optimistic_spin() -> mutex_can_spin_on_owner() returns fail 5 mutex_optimistic_spin() -> mutex_can_spin_on_owner() returns fail Owner 7 Owner 8 __mutex_trylock() || mutex_optimistic_spin() preempt_enable return add the task to wait_list Yes: midpath No: slow path Call path
  • 31. [Case #4] Locker owner sleeps (reschedule): gdb watchpoint: task->on_cpu → who changes this?
  • 32. [Case #4] Locker owner sleeps (reschedule): Who changes task->on_cpu? task->on_cpu is set 0 during context switch
  • 33. mutex_unlock() mutex_lock() core 0 core 1 owner 1 mutex_optimistic_spin() -> mutex_can_spin_on_owner() returns fail Owner 6 [Case #4] Locker owner sleeps (reschedule): gdb: other tasks cannot spin kthread_0 kthread_1 Critical Section msleep(): reschedule task->cpu_on = 1 0 mutex_unlock() mutex_lock() Critical Section msleep(): reschedule task->cpu_on = 1 0 non-busy wait (slow path): lock owner’s task->cpu_on = 0 2 3 __mutex_trylock() || mutex_optimistic_spin() preempt_enable return add the task to wait_list Yes: midpath No: slow path Call path
  • 34. mutex_unlock() mutex_lock() core 0 core 1 owner 1 mutex_optimistic_spin() -> mutex_can_spin_on_owner() returns fail Owner 6 [Case #4] Locker owner sleeps (reschedule): gdb: other tasks cannot spin kthread_0 kthread_1 Critical Section msleep(): reschedule task->cpu_on = 1 0 mutex_unlock() mutex_lock() Critical Section msleep(): reschedule task->cpu_on = 1 0 non-busy wait (slow path): lock owner’s task->cpu_on = 0 2 3 __mutex_trylock() || mutex_optimistic_spin() preempt_enable return add the task to wait_list Yes: midpath No: slow path Call path retval = 0 → cannot spin this owner owner->on_cpu = 0
  • 36. mutex_unlock __mutex_unlock_fast mutex_unlock(): Call path return mutex owner wait_lock osq wait_list task • task_struct pointers aligns to at least L1_CACHE_BYTES • 3 LSB bits are used for non-empty waiter list ✓ W (MUTEX_FLAG_WAITERS) ◼ Non-empty waiter list. Issue a wakeup when unlocking ✓ H (MUTEX_FLAG_HANDOFF) ◼ Unlock needs to hand the lock to the top-waiter ◼ Use by ww_mutex because ww_mutex’s waiter list is not FIFO order. ✓ P (MUTEX_FLAG_PICKUP) ◼ Handoff has been done and we're waiting for pickup ◼ Use by ww_mutex because ww_mutex’s waiter list is not FIFO order. locker->owner = 0 __mutex_unlock_slowpath Have waiters: One of 3-bit LSB of lock->owner is not cleared. spin_lock(&lock->wait_lock) spin_unlock(&lock->wait_lock) Get a waiter from lock->wait_list wake_up_q wake_up_process owner task virtual addr W P H 0 1 2 63 lock->owner __mutex_handoff Called if MUTEX_FLAG_HANDOFF is set atomic_long_cmpxchg_release( &lock->owner, owner, __owner_flags(owner)) The woken task will update lock->owner Set 3-bit LSB of lock->owner → Clear the original task struct address * ww_mutex (Wound/Wait Mutex): Deadlock-proof mutex [Unlock task = lock->owner] No waiter: 3-bit LSB of lock->owner are cleared
  • 37. mutex_unlock __mutex_unlock_fast mutex_unlock(): fast path return locker->owner = 0 [fast path] A spinner will take the lock [Unlock task = lock->owner] No waiter: 3-bit LSB of lock->owner are cleared
  • 38. mutex_unlock __mutex_unlock_fast mutex_unlock(): slow path return locker->owner = 0 __mutex_unlock_slowpath Have waiters: One of 3-bit LSB of lock->owner is not cleared. spin_lock(&lock->wait_lock) spin_unlock(&lock->wait_lock) Get a waiter from lock->wait_list wake_up_q wake_up_process __mutex_handoff Called if MUTEX_FLAG_HANDOFF is set owner task virtual addr 1 0 0 0 1 2 63 atomic_long_cmpxchg_release( &lock->owner, owner, __owner_flags(owner) owner 0 1 0 0 0 1 2 63 The woken task will update lock->owner mutex owner wait_lock osq wait_list task • task_struct pointers aligns to at least L1_CACHE_BYTES • 3 LSB bits are used for non-empty waiter list ✓ W (MUTEX_FLAG_WAITERS) ◼ Non-empty waiter list. Issue a wakeup when unlocking ✓ H (MUTEX_FLAG_HANDOFF) ◼ Unlock needs to hand the lock to the top-waiter ◼ Use by ww_mutex because ww_mutex’s waiter list is not FIFO order. ✓ P (MUTEX_FLAG_PICKUP) ◼ Handoff has been done and we're waiting for pickup ◼ Use by ww_mutex because ww_mutex’s waiter list is not FIFO order. owner task virtual addr W P H 0 1 2 63 lock>-owner * ww_mutex (Wound/Wait Mutex): Deadlock-proof mutex [Unlock task = lock->owner] No waiter: 3-bit LSB of lock->owner are cleared
  • 39. mutex_unlock __mutex_unlock_fast [Unlock task = lock->owner] No waiter: 3-bit LSB of lock->owner are cleared return locker->owner = 0 __mutex_unlock_slowpath Have waiters: One of 3-bit LSB of lock-owner is not cleared. spin_lock(&lock->wait_lock) spin_unlock(&lock->wait_lock) Get a waiter from lock->wait_list wake_up_q wake_up_process __mutex_handoff Called if MUTEX_FLAG_HANDOFF is set owner task virtual addr 1 0 0 0 1 2 63 atomic_long_cmpxchg_release( &lock->owner, owner, __owner_flags(owner) owner 0 1 0 0 0 1 2 63 The woken task will update lock->owner mutex_unlock(): slow path
  • 40. Resume here: kthread_0 wakes up kthread_1 Woken task 1 Context switch 2
  • 43. * Bit 0 is still set (MUTEX_FLAG_WAITERS): The upcoming mutex_unlock() will wake up the waiter instead of spinner. * Bit 1 (MUTEX_FLAG_HANDOFF) is cleared from __mutex_trylock->__mutex_trylock_or_owner. Update lock->owner When/who clears 3-bit LSB of lock->owner?
  • 44. Woken task: When/who to clear 3-bit LSB of lock-owner? Clear 3-bit LSB of lock->owner if no waiters
  • 45. mutex_unlock __mutex_unlock_fast mutex_unlock(): Mutex ownership return locker->owner = 0 __mutex_unlock_slowpath Have waiters: One of 3-bit LSB of lock->owner is not cleared. spin_lock(&lock->wait_lock) spin_unlock(&lock->wait_lock) Get a waiter from lock->wait_list wake_up_q wake_up_process __mutex_handoff Called if MUTEX_FLAG_HANDOFF is set atomic_long_cmpxchg_release( &lock->owner, owner, __owner_flags(owner)) The woken task will update lock->owner Set 3-bit LSB of lock->owner → Clear the original task struct address [Unlock task = lock->owner] No waiter: 3-bit LSB of lock->owner are cleared • [Fastpath] Check ownership of a mutex • [Slowpath] Does not check ownership of a mutex
  • 46. mutex_unlock(): [lab] Behavior observation when lock owner != unlocker’s task (Mutex ownership) mutex_unlock() mutex_lock() core 0 core 1 kthread_0 kthread_1 Critical Section sleep 3 seconds mutex_unlock() sleep 1 second lock_thread unlock_thread Source code: test-modules/mutex-unlock-by-another-task/mutex.c This scenario is created on purpose for demonstration. It won’t happen in real case. Note
  • 47. mutex_unlock() mutex_lock() core 0 core 1 kthread_0 kthread_1 Critical Section sleep 3 seconds mutex_unlock() sleep 1 second lock_thread unlock_thread Can another task unlock the mutex? mutex_unlock(): [lab] Behavior observation when lock owner != unlocker’s task (Mutex ownership) This scenario is created on purpose for demonstration. It won’t happen in real case. Note
  • 48. __mutex_unlock_slowpath spin_lock(&lock->wait_lock) spin_unlock(&lock->wait_lock) Get a waiter from lock->wait_list wake_up_q wake_up_process __mutex_handoff atomic_long_cmpxchg_release( &lock->owner, owner, __owner_flags(owner)) mutex_unlock(): slow path does not check unlocker’s ownership [DEBUG_MUTEXES] Print a warning message if unlocker’s task != lock owner’s task
  • 49. breakpoint breakpoint 1 1 Lock owner 2 Unlocker’s task != lock owner 3 mutex_unlock(): [lab] Behavior when lock owner != unlocker’s task mutex_unlock() mutex_lock() core 0 core 1 kthread_0 kthread_1 Critical Section sleep 3 seconds mutex_unlock() sleep 1 second lock_thread unlock_thread We’re here
  • 50. breakpoint 1 breakpoint 1 lock owner = old 2 lock->owner is set 0 by atomic_long_cmpxchg_release() 3 mutex_unlock(): [lab] Behavior if lock owner != unlocker’s task mutex_unlock() mutex_lock() core 0 core 1 kthread_0 kthread_1 Critical Section sleep 3 seconds mutex_unlock() sleep 1 second lock_thread unlock_thread We’re here
  • 51. mutex_unlock(): [lab] Behavior observation when lock owner != unlocker’s task (Mutex ownership) mutex_unlock() mutex_lock() core 0 core 1 kthread_0 kthread_1 Critical Section sleep 3 seconds mutex_unlock() sleep 1 second lock_thread unlock_thread 1 2 3 This unlocks the locked mutex mutex owner = 0 wait_lock = 0 osq = 0 wait_list Breakpoint stops at second mutex_unlock() in kthread_0
  • 52. mutex_unlock(): [lab] Behavior observation when lock owner != unlocker’s task (Mutex ownership) mutex_unlock() mutex_lock() core 0 core 1 kthread_0 kthread_1 Critical Section sleep 3 seconds mutex_unlock() sleep 1 second lock_thread unlock_thread 1 2 3 This unlocks the locked mutex What’s the behavior?
  • 53. breakpoint 1 mutex_unlock(): [lab] Behavior when lock owner != unlocker’s task mutex_unlock() mutex_lock() core 0 core 1 kthread_0 kthread_1 Critical Section sleep 3 seconds mutex_unlock() sleep 1 second lock_thread unlock_thread breakpoint 1 No lock owner 2 We’re here
  • 54. breakpoint 1 mutex_unlock(): [lab] Behavior if lock owner != unlocker’s task mutex_unlock() mutex_lock() core 0 core 1 kthread_0 kthread_1 Critical Section sleep 3 seconds mutex_unlock() sleep 1 second lock_thread unlock_thread breakpoint 1 lock owner = old 2 lock->owner is set 0 by atomic_long_cmpxchg_release() 3 4 We’re here
  • 55. mutex_unlock(): [lab] Behavior observation when lock owner != unlocker’s task (Mutex ownership) mutex_unlock() mutex_lock() core 0 core 1 kthread_0 kthread_1 Critical Section sleep 3 seconds mutex_unlock() sleep 1 second lock_thread unlock_thread Takeaways 1. [Fastpath] Linux kernel checks mutex’s ownership 2. [Slowpath] Linux kernel does not check mutex’s ownership when unlocking a mutex ✓ Developers must take care of mutex_lock/mutex_unlock pair ✓ Slowpath prints a warning message if mutex debug option is enabled. ✓ Different from the concept: Only the lock owner has the permission to unlock the mutex
  • 56. mutex_unlock(): [lab] Behavior observation when lock owner != unlocker’s task (Mutex ownership) Why doesn’t slowpath check ownership? 1. Ownership checking is only for developers and not enforced by Linux kernel. ✓ Developers need to take care of it. Quotes 1. From Generic Mutex Subsystem ✓ Mutex Semantics • Only one task can hold the mutex at a time. • Only the owner can unlock the mutex. • … ✓ These semantics are fully enforced when CONFIG_DEBUG_MUTEXES is enabled
  • 57. Think about… mutex_unlock() mutex_lock() core 0 core 1 Critical Section sleep 3 seconds mutex_unlock() sleep 1 second mutex_unlock() mutex_lock() core 2 Critical Section 1 Will it acquire the mutex lock successfully? 2 Will this unlock the mutex lock acquired by core 2?
  • 58. Q&A #1: Semaphore can be used to synchronize with user-space Consumer buffer Producer up() down() [Semaphore] Producer/Consumer Concept * Screenshot captured from: Chapter 10, Linux Kernel Development, 3rd Edition, Robert Love • Consumer waits if buffer is empty • Producer waits if buffer is full • Only one process can manipulate the buffer at a time (mutual exclusion) Principle
  • 59. Q&A #1: Semaphore can be used to synchronize with user-space Consumer buffer Producer up(), V down(), P [Semaphore] Producer/Consumer Concept Code reference: CS 537 Notes, Section #6: Semaphores and Producer/Consumer Problem
  • 60. Q&A #1: Semaphore can be used to synchronize with user-space Consumer buffer Producer up(), V down(), P [Semaphore] Producer/Consumer Concept Data structure synchronization Notify consumer Wait if buffer is full Wait if buffer is empty Notify producer Code reference: CS 537 Notes, Section #6: Semaphores and Producer/Consumer Problem
  • 61. Q&A #1: Semaphore can be used to synchronize with user-space Consumer buffer Producer up(), V down(), P [Semaphore] Producer/Consumer Concept User Space Kernel Space . . . buffer Consumer Consumer Producer system call system call Possible Scenario Note 1. up()/down() invocations are done in kernel.
  • 62. Q&A #2: Mutex isn’t suitable for synchronizations between kernel and user-space * Screenshot captured from: Chapter 10, Linux Kernel Development, 3rd Edition, Robert Love Explanation 1. [Mutex] ownership! ✓ Whoever locked a mutex must unlock it
  • 63. Reference • Generic Mutex Subsystem • Wound/Wait Deadlock-Proof Mutex Design • Mutexes and Semaphores Demystified • MCS locks and qspinlocks • Linux中的mutex机制[一] - 加锁和osq lock
  • 65. __mutex_trylock signal_pending_state spin_unlock(&lock->wait_lock) spin_lock(&lock->wait_lock) for (;;) goto ‘acquired’ lable if getting the lock Interrupted by signal: break mutex_lock(): slowpath __mutex_add_waiter(lock, &waiter, &lock- >wait_list) waiter.task = current set_current_state(state) schedule_preempt_disabled __mutex_set_flag(lock, MUTEX_FLAG_HANDOFF) __mutex_trylock() || (first && mutex_optimistic_spin()) spin_lock(&lock->wait_lock) N Y: break spin_lock(&lock->wait_lock) acquired __set_current_state(TASK_RUNNING) mutex_remove_waiter list_empty(&lock->wait_list) __mutex_clear_flag(lock, MUTEX_FLAGS) spin_lock(&lock->wait_lock) preempt_enable