Flink Forward Berlin 2017: Till Rohrmann - From Apache Flink 1.3 to 1.4

Till Rohrmann
till@data-artisans.com
@stsffap
From Apache Flink® 1.3 to 1.4

Original creators of Apache Flink®
Providers of dA Platform 2, including open source Apache Flink + dA Application Manager

Overview
Apache Flink 1.3 – Previously on Apache Flink
Apache Flink 1.4 – What’s happening now?
Apache Flink 1.5+ – Next on Apache Flink

Previously on Apache Flink
Apache Flink 1.3

Apache Flink 1.3 in Numbers
141 contributors (no deduplication)
1400 commits
>= 680 resolved JIRA issues
+261813 / -65646 LOC

Evolution of Flink’s API
Flink 1.0.0: State API (ValueState, ReducingState, ListState)
Flink 1.1.0: Session windows; late-arriving events
Flink 1.2.0: ProcessFunction (access to state, timers, events)
Flink 1.3.0: Side outputs; access to per-window state

Side Outputs
- Additional outputs for a stream
- Late events
- Corrupted input data
- More expressive APIs
- FLINK-4460
[Diagram: a ProcessFunction with a main output and a side output]
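The idea of routing corrupted input to a side output can be illustrated with a minimal plain-Java sketch. It does not use Flink; the names SideOutputSketch, Outputs, emitSide and the "corrupted" tag are hypothetical and exist only for this illustration. In Flink itself, a ProcessFunction emits to a side output via ctx.output(outputTag, value), and the side stream is read with getSideOutput(outputTag).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SideOutputSketch {

    /** Hypothetical collector: one main output plus any number of named side outputs. */
    static class Outputs {
        final List<Integer> main = new ArrayList<>();
        final Map<String, List<String>> side = new HashMap<>();

        void emitSide(String tag, String record) {
            side.computeIfAbsent(tag, k -> new ArrayList<String>()).add(record);
        }

        List<String> getSide(String tag) {
            return side.getOrDefault(tag, new ArrayList<String>());
        }
    }

    /** Parsed numbers go to the main output; records that fail to parse go to a side output. */
    static Outputs process(List<String> input) {
        Outputs out = new Outputs();
        for (String record : input) {
            try {
                out.main.add(Integer.parseInt(record.trim()));
            } catch (NumberFormatException e) {
                // Instead of failing the job or dropping the record, keep it on a separate stream.
                out.emitSide("corrupted", record);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Outputs out = process(Arrays.asList("1", "2", "oops", "4"));
        System.out.println("main: " + out.main);
        System.out.println("corrupted: " + out.getSide("corrupted"));
    }
}
```

The same pattern applies to late events: the main output carries the regular results while the side output keeps everything that would otherwise be lost.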

Evolution of Large State Handling
Flink 1.0.0: RocksDB for out-of-core state support
Flink 1.1.0: Fully async RocksDB snapshots
Flink 1.2.0: Rescalable keyed and non-partitioned state
Flink 1.3.0: Incremental checkpoints; fine-grained recovery

Full Checkpoints
[Diagram: state at @t1 = {A, B, C, D}, at @t2 = {A, F, C, D, E}, at @t3 = {G, H, C, D, I, E}. Checkpoints 1, 2 and 3 each store the complete state at that time.]

Incremental Checkpoints
[Diagram: state at @t1 = {A, B, C, D}, at @t2 = {A, F, C, D, E}, at @t3 = {G, H, C, D, I, E}. Checkpoint 1 stores A, B, C, D; Checkpoint 2 stores only E, F; Checkpoint 3 stores only G, H, I.]

Incremental Checkpoints
[Diagram: Checkpoints 1 to 4 referencing Chunks 1 to 4 (C1 to C4) held in shared storage]
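The difference between full and incremental checkpoints can be sketched in a few lines of plain Java. This is an illustration under simplifying assumptions, not Flink's implementation: state is modelled as a set of named chunks taken from the diagrams, and the class and method names (IncrementalCheckpointSketch, state, newChunks) are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class IncrementalCheckpointSketch {

    /** Builds a state snapshot from chunk names, preserving their order. */
    static Set<String> state(String... chunks) {
        return new LinkedHashSet<>(Arrays.asList(chunks));
    }

    /** Chunks that exist in the current state but were not part of the previous checkpoint. */
    static Set<String> newChunks(Set<String> previous, Set<String> current) {
        Set<String> delta = new LinkedHashSet<>(current);
        delta.removeAll(previous);
        return delta;
    }

    public static void main(String[] args) {
        // State at @t1, @t2 and @t3 as shown on the slides.
        List<Set<String>> states = Arrays.asList(
                state("A", "B", "C", "D"),
                state("A", "F", "C", "D", "E"),
                state("G", "H", "C", "D", "I", "E"));

        Set<String> previous = state();
        for (int i = 0; i < states.size(); i++) {
            Set<String> current = states.get(i);
            System.out.println("Checkpoint " + (i + 1)
                    + ": full = " + current
                    + ", incremental = " + newChunks(previous, current));
            previous = current;
        }
    }
}
```

A full checkpoint writes every chunk each time, while the incremental variant writes only the delta and keeps referencing unchanged chunks (such as C and D) that earlier checkpoints already stored.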

Incremental Checkpointing Contd.
Currently supported for RocksDB state backend
FLINK-5053
Faster and smaller checkpoints

        Full checkpoint    Incremental checkpoint
Size    60 GB              1 – 30 GB
Time    180 s              3 – 30 s

“A Look at Flink’s Internal Data Structures and Algorithms for Efficient Checkpointing” by Stefan Richter, Tomorrow @ 12:20 pm Maschinenhaus

Evolution of High Level APIs
Flink 1.0.0: CEP library added; Table API v1
Flink 1.1.0: Table API overhaul; integration with Apache Calcite
Flink 1.2.0: Tumbling, sliding and session group-windows for Table API
Flink 1.3.0: Rescalable CEP operators; retractions in Table API/SQL

Enriched CEP Language
Support for quantifiers (+, *, ?): FLINK-3318
Iterative conditions: FLINK-6197
Not operator: FLINK-3320
“Complex Event Processing With Flink: The State of FlinkCEP” by Kostas Kloudas, Today @ 2:30 pm Maschinenhaus

What’s Happening Now?
Apache Flink 1.4

Event Driven I/O
Rework of Flink’s network stack
Event driven network I/O
Use full available capacity
Near perfect latency behaviour
[Diagram: TCP, buffer with capacity left, flush]

Flow Control
- Flow control for TaskManager communication
- Single channel no longer stalls other multiplexed channels
- Fine-grained backpressure control
- Improves checkpoint alignments
“Building a Network Stack for Optimal Throughput / Low-Latency Trade-Offs” by Nico Kruber, Today @ 2:00 pm Palais Atelier
[Diagram: the receiver gives credit to Sender #1 and Sender #2; the senders send credited data]
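The "give credit, send credited data" exchange can be sketched as a small plain-Java simulation. This is a simplified model under assumptions, not Flink's network stack: one Sender and one Receiver object stand for a single logical channel, and all names (CreditBasedFlowControlSketch, grantCredit, sendTo) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class CreditBasedFlowControlSketch {

    /** Sender side of one logical channel: may only send while it holds credit. */
    static class Sender {
        private final Queue<String> backlog = new ArrayDeque<>();
        private int credit; // buffers the receiver has announced for this channel

        void enqueue(String buffer) { backlog.add(buffer); }
        void addCredit(int granted) { credit += granted; }
        int backlogSize() { return backlog.size(); }

        /** Sends as many queued buffers as the current credit allows; returns how many were sent. */
        int sendTo(Receiver receiver) {
            int sent = 0;
            while (credit > 0 && !backlog.isEmpty()) {
                receiver.receive(backlog.poll());
                credit--;
                sent++;
            }
            return sent;
        }
    }

    /** Receiver side of one logical channel with a fixed number of buffers. */
    static class Receiver {
        private final int capacity;
        private final Queue<String> buffers = new ArrayDeque<>();
        private int announced; // credit granted but not yet used by the sender

        Receiver(int capacity) { this.capacity = capacity; }

        /** Announces every buffer that is neither occupied nor already promised. */
        void grantCredit(Sender sender) {
            int free = capacity - buffers.size() - announced;
            if (free > 0) {
                announced += free;
                sender.addCredit(free);
            }
        }

        void receive(String buffer) {
            if (announced == 0) throw new IllegalStateException("data sent without credit");
            announced--;
            buffers.add(buffer);
        }

        String consume() { return buffers.poll(); }
    }

    public static void main(String[] args) {
        Sender slow = new Sender();
        Sender fast = new Sender();
        Receiver slowChannel = new Receiver(2); // its consumer never reads
        Receiver fastChannel = new Receiver(2);
        for (int i = 0; i < 4; i++) { slow.enqueue("s" + i); fast.enqueue("f" + i); }

        for (int round = 0; round < 3; round++) {
            slowChannel.grantCredit(slow);
            fastChannel.grantCredit(fast);
            slow.sendTo(slowChannel);
            fast.sendTo(fastChannel);
            while (fastChannel.consume() != null) { /* the fast consumer drains its buffers */ }
        }
        System.out.println("slow backlog: " + slow.backlogSize() + ", fast backlog: " + fast.backlogSize());
    }
}
```

Because credit is tracked per channel, the stalled channel holds back only its own sender, while the other channel keeps making progress over the shared connection.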

New Deployment Model
Rework of Flink’s distributed architecture
Ready for a multitude of deployment scenarios
Support for dynamic scaling
“Flink in Containerland” by Patrick Lucas, Tomorrow @ 3:20 pm Maschinenhaus

Producing Exactly Once with Kafka 0.11
Support for Kafka 0.11
First Kafka producer with exactly-once processing guarantees
“Hit Me, Baby, Just One Time – Building End-to-End Exactly Once Applications With Flink” by Piotr Nowojski, Today @ 3:20 pm Palais Atelier
[Diagram: consuming and producing, end-to-end exactly-once processing]

Operational Robustness
Drop Java 7
Support Scala 2.12
Avoid dependency hell
Child-first class loading
Relocation of dependencies
De-Hadoopification
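Child-first class loading inverts Java's default parent-first delegation, so classes bundled with the user's job take precedence over versions on the framework's classpath. The following is a generic sketch of that technique, not Flink's actual class loader; the class name ChildFirstClassLoaderSketch and the userJars parameter are assumptions made for illustration.

```java
import java.net.URL;
import java.net.URLClassLoader;

class ChildFirstClassLoaderSketch extends URLClassLoader {

    ChildFirstClassLoaderSketch(URL[] userJars, ClassLoader parent) {
        super(userJars, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                if (name.startsWith("java.")) {
                    // Core JDK classes must always come from the parent.
                    loaded = super.loadClass(name, false);
                } else {
                    try {
                        // Child first: look in the user's jars before asking the parent.
                        loaded = findClass(name);
                    } catch (ClassNotFoundException notInUserJars) {
                        // Fall back to the regular parent-first lookup.
                        loaded = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    public static void main(String[] args) throws Exception {
        ClassLoader parent = ChildFirstClassLoaderSketch.class.getClassLoader();
        // No user jars here, so every lookup falls through to the parent.
        try (ChildFirstClassLoaderSketch loader = new ChildFirstClassLoaderSketch(new URL[0], parent)) {
            Class<?> viaParent = loader.loadClass(ChildFirstClassLoaderSketch.class.getName());
            System.out.println("resolved by: " + viaParent.getClassLoader());
        }
    }
}
```

With real user jars passed in, a library version shipped in the job would win over a different version of the same library on the parent classpath, which is the conflict the slide calls dependency hell.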

Next on Apache Flink
Apache Flink 1.5+

Side Inputs
- Additional input for operator
- Join with static data set
- Feeding of externally trained ML model
- Window joins
- FLIP-17 design document: https://goo.gl/W4yMEu
[Diagram: a ProcessFunction with a main input and a side input]

State Management & Evolution
Eager state declaration
State type, serializer and name known at pre-flight time
FLIP-22 design document: https://goo.gl/trFiSi
Evolving existing state
Schema updates
Serializer upgrades
“Managing State in Apache Flink” by Tzu-Li Tai, Today @ 4:30 pm Kesselhaus

State Replication
Replicate state between TaskManagers
Faster recovery in case of failures
High throughput queryable state
[Diagram: two TaskManagers, with input, state, and a change log stream]

Programmatic Job Control
Improve the client to give better job control
Run concurrent jobs from the same program
Trigger savepoints programmatically
Better testing facilities

JobClient & ClusterClient
StreamExecutionEnvironment env = ...;
// define program
JobClient jobClient = env.execute();
CompletableFuture<Acknowledge> savepointFuture = jobClient.takeSavepoint(savepointPath);
// wait for the savepoint completion
savepointFuture.get();
CompletableFuture<JobExecutionResult> resultFuture = jobClient.getResultFuture();
// cancel the job
jobClient.cancelJob();
// get the execution result --> should be canceled
JobExecutionResult result = resultFuture.get();
// get list of all still running jobs on the cluster
ClusterClient clusterClient = jobClient.getClusterClient();
CompletableFuture<List<JobInfo>> jobInfosFuture = clusterClient.getJobInfos();
List<JobInfo> jobInfos = jobInfosFuture.get();

TL;DL
Apache Flink is one of the most innovative open source stream processing platforms
Stay tuned for what’s happening next
Visit the in-depth talks to learn more about Flink’s internals

Thank you!
@stsffap
@ApacheFlink
@dataArtisans

We are hiring!
data-artisans.com/careers
