Differential Approximation and Sprinting for Multi-Priority Big Data Engines

Robert Birke (1), Isabelly Rocha (2), Juan Perez (3), Valerio Schiavoni (2), Pascal Felber (2), Lydia Y. Chen (4)
(1) ABB Research, Switzerland
(2) University of Neuchâtel, Switzerland
(3) Universidad del Rosario, Colombia
(4) TU Delft, The Netherlands
December 13th, 2019

Isabelly Rocha | Differential Approximation and Sprinting for Multi-Priority Big Data Engines | Middleware 2019
Big Data Analytics Frameworks
Big Data Analytics Applications
Context

Different requirements:
• Latency
• Accuracy
Solution:
• Priority scheduling
Context

Scheduling Queue
Priority 2
Priority 1
Big Data Analytics Framework
Priority Scheduling: Preemptive

Scheduling Queue
Priority 2
Priority 1
evict job!
waste

[Figure: machine time distribution across successful execution, failed execution, and priority eviction; the chart's slices are labeled 20 %, 35 %, and 45 %]
• Consequences:
• Repetitive eviction of low priority jobs
• Significant latency degradation
• Large amount of resources wasted on subsequent evictions
Source: Demystifying Casualties of Evictions in Big Data Priority Scheduling.
ACM SIGMETRICS Performance Evaluation Review 42, 4 (2015), 12–21.

Scheduling Queue
Priority 2
Priority 1
Priority Scheduling: Non-Preemptive

Scheduling Queue
accuracy loss
extra waiting
Priority 2
Priority 1

DiAS
Deflator
• Dispatching
• Monitoring
Dropper
• Defining ratios
Scheduler
Sprinter
• Frequency scaling
Our Approach: DiAS
• Spark as big data processing engine
• Augmented Spark with the capability to drop tasks
• Prototype implemented in Golang
• Implemented a workload generator for evaluation purposes

Deflator
• Decide how much data to process based on
• Arrival time
• Size of inputs
• Current system load
• Latency model
• Given that certain inputs are dropped
• Define how the entire latency distribution will change

• Distributions:
• Phase-type job processing times to model the concurrent task times of jobs
• States:
• Matrix-analytic methods to track states
• Number of jobs in queue/engine
• Number of tasks in queue/engine
• The job distribution across priorities
• Solver:
• MMAP[K]/PH[K]/1 priority queue from Horvath [EJOR’15]
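
The solver named above is too involved for a slide-sized example, but the effect it captures can be illustrated with a much simpler stand-in. The sketch below is not the MMAP[K]/PH[K]/1 model: it uses the classical mean-waiting-time formula for an M/G/1 non-preemptive priority queue and assumes, purely for illustration, exponential service times and that dropping a fraction θ of a job's tasks shortens its service time by the same fraction. All function and parameter names are hypothetical.

```python
def nonpreemptive_priority_waits(arrival_rates, mean_service, second_moment_service):
    """Mean queueing delay per class in an M/G/1 non-preemptive priority queue.

    Index 0 is the highest priority. This is the classical textbook formula,
    used here as a simplified stand-in for the solver named on the slide.
    """
    rho = [lam * s for lam, s in zip(arrival_rates, mean_service)]
    if sum(rho) >= 1.0:
        raise ValueError("total load must be below 1 for a stable queue")
    # Mean residual work found in service by an arriving job.
    w0 = sum(lam * m2 for lam, m2 in zip(arrival_rates, second_moment_service)) / 2.0
    waits = []
    sigma_prev = 0.0  # cumulative load of strictly higher-priority classes
    for r in rho:
        sigma = sigma_prev + r
        waits.append(w0 / ((1.0 - sigma_prev) * (1.0 - sigma)))
        sigma_prev = sigma
    return waits


def response_times_with_drop(arrival_rates, mean_service, drop_ratios):
    """Mean response time per class after applying per-class drop ratios.

    Assumption (not from the slides): service times are exponential and
    dropping a fraction theta of tasks scales service time by (1 - theta).
    """
    scaled = [s * (1.0 - theta) for s, theta in zip(mean_service, drop_ratios)]
    second = [2.0 * s * s for s in scaled]  # exponential: E[S^2] = 2 E[S]^2
    waits = nonpreemptive_priority_waits(arrival_rates, scaled, second)
    return [w + s for w, s in zip(waits, scaled)]
```

In this toy model, with two classes that each have arrival rate 0.2 and mean service time 2, dropping 20% of low-priority tasks lowers the mean response time of both classes, because high-priority jobs spend less time waiting behind low-priority work that is already in service.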

[Figure: mean absolute percent error [%] (0 to 70) versus dropping ratio Θm (0.1 to 0.8)]
• Trade-off between accuracy and latency
• Low dropping ratios -> low accuracy loss
• Dropping ratio: 50% -> MAE: 42%
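
To make the accuracy metric concrete, the following sketch computes a mean absolute percent error between an exact result and an approximate one. It assumes the error is averaged over output values whose exact value is nonzero; how the error in the figure was actually computed is not stated on the slide, and the function name is hypothetical.

```python
def mean_absolute_percent_error(exact, approx):
    """Mean of |exact - approx| / |exact| over all outputs, in percent.

    Outputs whose exact value is zero are skipped (an assumption made here
    to avoid dividing by zero).
    """
    if len(exact) != len(approx):
        raise ValueError("exact and approx must have the same length")
    terms = [abs(e - a) / abs(e) for e, a in zip(exact, approx) if e != 0]
    if not terms:
        raise ValueError("need at least one nonzero exact value")
    return 100.0 * sum(terms) / len(terms)
```

For instance, exact outputs of 100 and 200 approximated as 90 and 220 are each off by 10%, so the mean absolute percent error is 10%.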

[Figure: mean job response time (0 to 400) versus drop ratio (0 to 0.8); series: model - high, obs - high, model - low, obs - low]
• Validation of response time for 2-priority jobs
• Dropping ratio: 20% -> response time: 152 seconds

[Figure: original MR job and approximate MR job]

Dropper
• Implemented in Spark by modifying the function findMissingPartitions()
• Modification: return only [n(1 − θk)] partitions out of n
• Follows the specifications of the Deflator
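
The modification can be pictured with a short sketch. This is not Spark code and does not reproduce findMissingPartitions(); it only shows the counting rule, assuming that the brackets in the formula denote rounding up and that the kept partitions are chosen uniformly at random (the slide does not say how they are picked). The names select_partitions and drop_ratio are hypothetical.

```python
import math
import random


def select_partitions(partitions, drop_ratio, seed=None):
    """Return a subset of `partitions` after dropping a fraction `drop_ratio`.

    Keeps ceil(n * (1 - drop_ratio)) of the n partitions. Rounding up and
    uniform random selection are assumptions of this sketch.
    """
    if not 0.0 <= drop_ratio <= 1.0:
        raise ValueError("drop_ratio must be between 0 and 1")
    n = len(partitions)
    # Round before ceil so floating-point noise cannot add a spurious partition.
    keep = math.ceil(round(n * (1.0 - drop_ratio), 9))
    rng = random.Random(seed)
    kept_indices = sorted(rng.sample(range(n), keep))
    return [partitions[i] for i in kept_indices]
```

With 10 partitions, a drop ratio of 0.2 keeps 8 of them, and a drop ratio of 0.25 also keeps 8 because 7.5 is rounded up.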

Sprinter
• Performs CPU frequency scaling
• Handles a sprinting timer for each job
• Budget defined from different constraints
• Thermal
• Power
• Provisioning
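
The interplay between a per-job timer and a shared budget can be sketched as simple bookkeeping. The code below performs no real frequency scaling. It assumes that the timeout is how long a job runs at normal frequency before it starts sprinting, that the budget is expressed in seconds of sprinting, and that job runtimes are known up front; the class and function names are hypothetical.

```python
class SprintBudget:
    """Tracks how many seconds of sprinting are still available."""

    def __init__(self, budget_seconds):
        if budget_seconds < 0:
            raise ValueError("budget must be non-negative")
        self.remaining = float(budget_seconds)

    def grant(self, requested_seconds):
        """Grant as much of the request as the remaining budget allows."""
        granted = max(0.0, min(float(requested_seconds), self.remaining))
        self.remaining -= granted
        return granted


def sprint_seconds(job_runtime, sprint_timeout, budget):
    """Seconds a job sprints if sprinting starts once the timeout expires.

    Simplification: the runtime is not shortened by sprinting here; the
    function only accounts for how much budget the job would consume.
    """
    wanted = max(0.0, job_runtime - sprint_timeout)
    return budget.grant(wanted)
```

A job that finishes before its timeout never sprints and leaves the budget untouched, and once the budget is exhausted, later jobs run at normal frequency for their whole duration.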

Our Approach: Overview
• Objective
• Low waste
• Latency/accuracy requirements
• Multi-priority
• Challenges
• Many parameters
• Tail latency
• Priority
• Goal
• Define: task dropping ratio and sprinting timeout
• Given: priority class, tolerance to accuracy degradation, available sprinting budget
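
One way to read the accuracy side of this goal: for each priority class, pick the largest dropping ratio whose expected accuracy loss stays within that class's tolerance. The sketch below does this over a profiled error curve. The curve values in the usage lines are made up for illustration and are not measurements from the talk; the function name and data layout are hypothetical, and the sketch ignores latency targets and the sprinting budget.

```python
def max_drop_ratio(error_profile, tolerance_percent):
    """Largest drop ratio whose profiled error is within the tolerance.

    error_profile: list of (drop_ratio, error_percent) pairs.
    Returns 0.0 (no dropping) if no profiled ratio is acceptable.
    """
    feasible = [ratio for ratio, err in error_profile if err <= tolerance_percent]
    return max(feasible, default=0.0)


# Made-up profile and tolerances, for illustration only.
profile = [(0.1, 5.0), (0.2, 12.0), (0.4, 30.0)]
tolerances = {"high": 0.0, "low": 15.0}
ratios = {cls: max_drop_ratio(profile, tol) for cls, tol in tolerances.items()}
```

With these illustrative numbers, the high-priority class tolerates no error and keeps all of its tasks, while the low-priority class is assigned the largest profiled ratio that stays under its 15% tolerance.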

• Spark processing engine
• Version 2.1
• 10-worker cluster
• 2 CPU cores and 4 GB memory per worker
• Intel Xeon E3-1270 v6 CPU, 64 cores and 128 GB memory
• Key parameters
• Ratio between low- and high-priority jobs
• Average size
• Cluster load
• Workload
• Text analysis jobs
• Graph analysis jobs
Evaluation Setup
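
The key parameters listed above map naturally onto the inputs of a workload generator. The following is a minimal sketch of such a generator, not the one used in the evaluation: it assumes Poisson arrivals and exponentially distributed job sizes, and every name in it is hypothetical.

```python
import random


def generate_workload(num_jobs, high_fraction, mean_size, arrival_rate, seed=0):
    """Generate a list of jobs with arrival time, priority, and size.

    high_fraction: probability that a job is high priority.
    mean_size:     average job size (arbitrary units).
    arrival_rate:  jobs per time unit; together with mean_size this sets the load.
    Poisson arrivals and exponential sizes are assumptions of this sketch.
    """
    if not 0.0 <= high_fraction <= 1.0:
        raise ValueError("high_fraction must be between 0 and 1")
    if mean_size <= 0 or arrival_rate <= 0:
        raise ValueError("mean_size and arrival_rate must be positive")
    rng = random.Random(seed)
    now = 0.0
    jobs = []
    for job_id in range(num_jobs):
        now += rng.expovariate(arrival_rate)  # exponential inter-arrival gap
        priority = "high" if rng.random() < high_fraction else "low"
        size = rng.expovariate(1.0 / mean_size)
        jobs.append({"id": job_id, "arrival": now, "priority": priority, "size": size})
    return jobs
```

Fixing the seed makes runs repeatable, which helps when comparing scheduling policies on the same job trace.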

• Differential Approximation
• Two- and three-priority systems
• Sensitivity analysis
• Similar job size for all priorities
• Several high- to low-priority job ratios
• Several system loads
• Differential Approximation and Sprinting
• Latency gain
• Energy gain
Evaluation

Evaluation: Differential Approximation
• Tail and average latency of low-priority jobs decrease
• Average latency of high-priority jobs increases
[Figure: response time [s] (log scale, 1 to 1000) and difference [%] (-80 to 80) for high- and low-priority jobs under P (preemptive), NP (non-preemptive), and DA (differential approximation) with drop ratios 0/10 and 0/20]
x/y: drop ratio on high- (x) and low-priority (y) jobs

Evaluation: Differential Approximation and Sprinting
• Tail and average latency of low-priority jobs decrease even more (up to 90%)
• Average latency of high-priority jobs also decreases
[Figure: response time [s] (log scale, 1 to 10000) and difference [%] (-80 to 80) for high- and low-priority jobs under P and DiAS with drop ratios 0/10 and 0/20]

Evaluation: Differential Approximation and Sprinting
• Reduction of more than 20% in energy consumption
[Figure: energy [kJ] (log scale, 1 to 10000) and difference [%] (-80 to 80) under P and DiAS with drop ratios 0/10 and 0/20; series: Unlimited, Limited]

• Main goal: reduce resource waste
• Strategy:
• Drop job eviction
• Deflate low-priority jobs
• Sprint high-priority jobs
• Additional gains:
• Up to 90% latency reduction
• More than 20% energy savings
• Implemented in Golang on top of Spark
Summary and Takeaways
