Analyzing the Performance Effects of Meltdown + Spectre on Apache Spark Workloads with Chris Stevens

Meltdown, Spectre and
Apache Spark™
Performance
Chris Stevens
June 5, 2018
1

Databricks Performance on AWS
2
3-5% performance degradation

Overview
Goal: Understand the 3-5% degradation
Steps:
- TPC-DS Benchmarks
- System Analysis
- Breakdown the exploits and patches
- Meltdown
- Spectre V1 - Bounds Check Bypass
- Specter V2 - Branch Target Injection
3

TPC-DS: q1-v2.4
4
WITH customer_total_return AS
(SELECT sr_customer_sk AS ctr_customer_sk,
sr_store_sk AS ctr_store_sk,
sum(sr_return_amt) AS ctr_total_return
FROM store_returns, date_dim
WHERE sr_returned_date_sk = d_date_sk
AND d_year = 2000
GROUP BY sr_customer_sk, sr_store_sk)
SELECT c_customer_id
FROM customer_total_return ctr1, store, customer
WHERE ctr1.ctr_total_return >
(SELECT avg(ctr_total_return)*1.2
FROM customer_total_return ctr2
WHERE ctr1.ctr_store_sk = ctr2.ctr_store_sk)
AND s_store_sk = ctr1.ctr_store_sk
AND s_state = 'TN'
AND ctr1.ctr_customer_sk = c_customer_sk
ORDER BY c_customer_id LIMIT 100

q1-v2.4 CPU Utilization
7mpstat -P ALL

TPC-DS: q1-v2.4
8strace -f -c -p PID

Exploits Background
Out-of-Order Execution + Side Channel Attacks
9

In-order Execution
x = a + b
y = c + d
z = x + y
Add
A + B
Store
X
Decode
Add
Retire
Add
C + B
Store
Y
Decode
Add
Retire
Add
X + Y
Store
Z
Decode
Add
Retire
Time
10

Out-of-order Execution
x = a + b
y = c + d
z = x + y
Add
A + B
Store
X
Decode
Add
Retire
Add
C + D
Store
Y
Decode
Add
Retire
Add
X+Y
Store
Z
Decode
Add
Retire
Time
11

Side-Channel Attacks
FLUSH +RELOAD
0 0xABAB
1 0xCDCD
2 0xEFEF
3 0x1212
0 INVALID
1 INVALID
2 INVALID
3 INVALID
0 INVALID
1 INVALID
2 0xEFEF
3 INVALID
Speculative execution
fills cache entry 2
0 0xABAB
1 0xCDCD
2 0xEFEF
3 0x1212
ATTACK
clflush instruction
invalidates the cache
Reload measures the time
to read each cache entry
12

Side-Channel Attacks
https://meltdownattack.com/meltdown.pdf
13

Meltdown
kernelByte = *kernelAddr;
probeArray[kernelByte * 4096];
15

Meltdown
Get value of
kernelAddr
Get value of
probeArray
Read 1 Byte
at kernelAddr
Retire
Multiply
kernel byte
by 4096
Read
probeArray
at offset
Time
Decode
Multiply
Decode
Array Access
Decode
Memory
Read
Retire
Retire
Permission
Check Here
Side Channel
Attack 16

Side-Channel Attack
FLUSH +RELOAD
0*4096 INVALID
1*4096 INVALID
2*4096 INVALID
3*4096 INVALID
Speculative execution
reads probeArray at
kernelByte * 4096
ATTACK
Flush each page in
probeArray
Reload measures the time
to read each page in
probeArray
0*4096 INVALID
1*4096 INVALID
2*4096 0xEFEF
3*4096 INVALID
0*4096 0xABAB
1*4096 0xCDCD
2*4096 0xEFEF
3*4096 0x1212
Sees that page 2
was the fastest =>
kernelByte = 0x2
17

Kernel Page-Table Isolation
Process Page Tables without KPTI
Combined
Page Tables
Kernel
Virtual Memory
User
Virtual Memory
Sensitive Data
Meltdown
Code
Process Page Tables with KPTI
Kernel
Page Tables
Kernel
Virtual Memory
User
Virtual Memory
Sensitive Data
Meltdown
Code
Meltdown
Code
User Mode
Page Tables
Same Physical Memory 18

Meltdown
Get value of
kernelAddr
Get value of
probeArray
Read 1 Byte
at kernelAddr
Retire
Multiply
kernel byte
by 4096
Read
probeArray
at offset
Time
Decode
Multiply
Decode
Array Access
Decode
Memory
Read
Retire
Retire
19

Kernel Page-Table Isolation
Process Page Tables without KPTI
Combined
Page Tables
Kernel
Virtual Memory
User
Virtual Memory
Sensitive Data
Meltdown
Code
Process Page Tables with KPTI
Kernel
Page Tables
Kernel
Virtual Memory
User
Virtual Memory
Sensitive Data
Meltdown
Code
Meltdown
Code
User Mode
Page Tables
Same Physical Memory 20

TLB before KPTI
Virtual
Address
Physical
Address
- -
- -
while True:
print *UserAddress
sys_clock_gettime
Virtual
Address
Physical
Address
UserAddress Page 1
- -
Virtual
Address
Physical
Address
UserAddress Page 1
KernelTime Page 2
print *UserAddress print *UserAddresssys_clock_gettime
TLB MISS TLB MISS TLB HIT
21

TLB with KPTI
Virtual
Address
Physical
Address
- -
- -
while True:
print *UserAddress
sys_clock_gettime
Virtual
Address
Physical
Address
- -
- -
Virtual
Address
Physical
Address
- -
- -
TLB MISS TLB MISS TLB MISS
22

Meltdown TLB Misses
23perf stat -e dtlb-load-misses

TLB with KPTI and PCID
Virtual
Address
PCID Physical
Address
- - -
- - -
while True:
print *UserAddress
sys_clock_gettime
TLB MISS TLB MISS TLB HIT
Virtual
Address
PCID Physical
Address
UserAddress 1 Page 1
- - -
Virtual
Address
PCID Physical
Address
UserAddress 1 Page 1
KernelTime 0 Page 2

Spectre V1
Bounds Check Bypass
26

Bounds Check Bypass
if (index < size) {
val = array[index];
probeArray[val];
}
27

Bounds Check Bypass
28
if (index < size) {
val = array[index];
probeArray[val];
}
Read size Is (index < size)? Retire
Read array
at index
Time
Was
(index < size)?
Decode If
Is
(index < size>)?
Yes
Read
probeArray
Retire
Retire
Is
(index < size>)?
Yes
Yes
Side Channel Attack

Bounds Check Bypass
29
if (index < size) {
val = array[index];
probeArray[val];
} ATTACKTRAIN
Was (index < size)?
-
Was (index < size)?
yes
Code
size = 10
index = 1000
Code
size = 10
index = 5

Observable Speculation Barrier
30
- Protects against Spectre V1 - Bounds Check Bypass
- ~14 in the Ubuntu kernel and drivers
- Stops speculative array access with the LFENCE barrier
if (index < size) {
val = array[index];
probeArray[val];
}
if (index < size) {
osb();
val = array[index];
probeArray[val];
}
Before After

Observable Speculation Barrier
31
if (index < size) {
osb()
val = array[index];
probeArray[val];
}
Read size Is (index < size)? Retire
LFENCE
Time
Was
(index < size)?
Decode If
Is
(index < size>)?
Yes
Retire
Yes
Read array
at index

Spectre V2
Branch Target Injection
32

33
def call_func(func, arg):
func(arg)

34
func(arg)
Read func
from memory
Call func Retire
Call previous
func
Time
What was func
last time?
Decode call
func ==
previous func
Retire
func ==
previous func
No
Yes

35
func(arg)
ATTACKTRAIN
What was func?
-
What was func?
attack_widget
call_func(attack_widget, 1000)
innocent, 10
call_func(innocent, 10)
attack_widget, 1000

Intel Microcode Updates
36
- Protect against Spectre V2 - Branch Target Injection
- Indirect Branch Restricted Speculation (IBRS)
- Stops attacks from code running at lower privilege levels
- Stops attacks from code running on the sibling hyperthread (STIBP)
- Indirect Branch Prediction Barrier (IBPB)
- Stops attacks from code running at the same privilege level
- Inserted at User-to-User and Guest-to-Guest transitions

IBPB and IBRS Patches
37
func(arg)
ATTACKTRAIN
What was func?
-
What was func?
-
call_func(attack_widget, 1000)
innocent, 10
call_func(innocent, 10)
attack_widget, 1000
IBPB, IBRS

38
func(arg)
Read func
from memory
Call func Retire
Cache Miss
Time
What was func
last time?
Decode call
func ==
previous func
No

Spectre V2 Branch Misses
39perf stat -e branch-misses

Retpoline
- Protects indirect branches/calls from speculative execution
- Uses an infinite loop to do it
41
jmp rax call set_up_target
speculative_loop:
pause
lfence
jmp speculative_loop
set_up_target:
mov rax, [rsp]
ret
Before After Return Stack Buffer
.
.
.
speculative_loop
Stack
.
.
.
speculative_loop

Retpoline
42
speculative_loop:
pause
lfence
set_up_target:
mov rax, [rsp]
ret
.
.
.
speculative_loop
Stack
.
.
.
rax

Retpoline
43
speculative_loop:
pause
lfence
set_up_target:
mov rax, [rsp]
ret
.
.
.
speculative_loop
Stack
.
.
.
rax

Retpoline
44
speculative_loop:
pause
lfence
set_up_target:
mov rax, [rsp]
ret
.
.
.
speculative_loop
Stack
.
.
.
rax

Retpoline
45
speculative_loop:
pause
lfence
set_up_target:
mov rax, [rsp]
ret
.
.
.
speculative_loop
Stack
.
.
.
rax

Retpoline (Improved)
- Check against a known target and direct branch
46
jmp rax cmp rax, known_target
jne retpoline
jmp known_target
retpoline:
call set_up_target
speculative_loop:
pause
set_up_target:
mov rax, [rsp]
ret
Before After
Direct Branch (fast)

Retpoline CPU Utilization
48mpstat -P ALL

Retpoline System Calls
49strace -f -c -p PID

A Tale of Two Queries
50
q1-v2.4 ss_max-v2.4
SELECT
count(*) AS total,
count(ss_sold_date_sk) AS not_null_total,
count(DISTINCT ss_sold_date_sk) AS unique_days,
max(ss_sold_date_sk) AS max_ss_sold_date_sk,
max(ss_sold_time_sk) AS max_ss_sold_time_sk,
max(ss_item_sk) AS max_ss_item_sk,
max(ss_customer_sk) AS max_ss_customer_sk,
max(ss_cdemo_sk) AS max_ss_cdemo_sk,
max(ss_hdemo_sk) AS max_ss_hdemo_sk,
max(ss_addr_sk) AS max_ss_addr_sk,
max(ss_store_sk) AS max_ss_store_sk,
max(ss_promo_sk) AS max_ss_promo_sk
FROM store_sales
WITH customer_total_return AS
(SELECT sr_customer_sk AS ctr_customer_sk,
sr_store_sk AS ctr_store_sk,
sum(sr_return_amt) AS ctr_total_return
FROM store_returns, date_dim
WHERE sr_returned_date_sk = d_date_sk
AND d_year = 2000
GROUP BY sr_customer_sk, sr_store_sk)
SELECT c_customer_id
FROM customer_total_return ctr1, store, customer
WHERE ctr1.ctr_total_return >
(SELECT avg(ctr_total_return)*1.2
FROM customer_total_return ctr2
WHERE ctr1.ctr_store_sk = ctr2.ctr_store_sk)
AND s_store_sk = ctr1.ctr_store_sk
AND s_state = 'TN'
AND ctr1.ctr_customer_sk = c_customer_sk
ORDER BY c_customer_id LIMIT 100

q1-v2.4 vs ss_max-v2.4
51perf stat -e raw_syscalls:sys_enter -a -I 1000

Thank You
chris.stevens@databricks.com
https://www.linkedin.com/in/chriscstevens/
52

array_index_nospec()
54
- Protects against Spectre V1 - Bounds Check Bypass
- ~50 in the Linux kernel and drivers (not in Ubuntu)
- Stops speculative array access by clamping the index
if (index < size) {
val = array[index];
probeArray[val];
}
if (index < size) {
index = array_index_nospec(index, size);
val = array[index];
probeArray[val];
}
Before After

Speculative Store Bypass
55
- CVE-2018-3639
- “Variant 4”
- Public Date: May 21, 2018
- Details:
- https://access.redhat.com/security/vulnerabilities/ssbd
- Ubuntu Command Line Parameter:
- spec_store_bypass_disable=on

TPC-DS: q1-v2.4
57
Baselines Pre-Skylake 2% Skylake+ 11%

TPC-DS: ss_max-v2.4
58
Baselines Pre-Skylake 0% Skylake+ 1%

ss_max-v2.4 Runtime Box Plots
60

Single Node TPC-DS Setup
• Intel Core i7-4820k @ 3.70GHz (Ivy Bridge - ca. 2012)
• 8 CPUs
• 8GB of RAM
• Disk: Western Digital Blue
– WDC10EZEX-08M2NA0
– 1TB
– 7200 rpm
– 146MB/s sequential read (http://hdd.userbenchmark.com/)
• Ubuntu 16.04.4 LTS (Xenial Xerus)
– 64-bit server image
– 4.4.0-116-generic Linux kernel
61

Repositories
- https://github.com/apache/spark.git
- https://github.com/databricks/spark-sql-perf.git
- https://github.com/databricks/tpcds-kit
- https://github.com/speed47/spectre-meltdown-checker
- https://github.com/brendangregg/pmc-cloud-tools
- https://github.com/brendangregg/perf-tools
62

Analyzing the Performance Effects of Meltdown + Spectre on Apache Spark Workloads with Chris Stevens

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

Similaire à Analyzing the Performance Effects of Meltdown + Spectre on Apache Spark Workloads with Chris Stevens

Similaire à Analyzing the Performance Effects of Meltdown + Spectre on Apache Spark Workloads with Chris Stevens (20)

Plus de Databricks

Plus de Databricks (20)

Dernier

Dernier (20)

Analyzing the Performance Effects of Meltdown + Spectre on Apache Spark Workloads with Chris Stevens