When Streaming Needs Batch: Flink's Journey Towards a Unified Engine
Konstantin Knauf, @snntrable, Current 22
About Me
● Co-Founder @ Immerok
● Apache Flink Committer & Member of the PMC
● Formerly Head of Product & Field Engineer @ Ververica
Visit us at booth S14.
Steady State - Streaming Heaven
Backfilling following an outage
Backfilling after introducing a bug
State Bootstrapping
Catch Up
Bursty Streams
Agenda
Motivation ✓
Setting the Stage
Catching Up & Handling Bursty Streams
Backfilling
Bootstrapping
Takeaways
Setting the Stage
Some Terminology
Nature of Data (rows) vs. Nature of Processing (columns)
                 Stream Processing   Batch Processing
Bounded Data     ✓                   ✓
Unbounded Data   ✓
A Typical Streaming Job
● Apache Kafka* in/out
● Flink pipeline with multiple operators & shuffles
● incl. significant state
* or any other replayable queue like Apache Pulsar/AWS Kinesis/…
Catching Up & Bursty Streams
Scenario
There is a large backlog of data and you want to catch up to real time again. This happens, for example, when upstream producers send data in big chunks.
Wishlist
● process backlog quickly
● process backlog robustly with existing resources
Backlog -> Backpressure
When under backpressure, what do you want?
● Scale up
● Or catch up steadily
Scaling Up under Backpressure
Adaptive Scheduler and Reactive Mode (Flink 1.13)
[Figure: a Task Manager with two Task Slots; after scaling up, a second Task Manager with two Task Slots is added.]
● Job automatically scales up when provided with additional resources
● No additional Savepoint needed for rescaling*
* Caveat: This might still take quite some time during restore. Flink 1.16 brings some more improvements in that regard.
Robustness under Backpressure
Watermark Alignment (Flink 1.15)
● Two inputs feed a join: a high throughput topic and a low throughput topic.
● In steady state, both sources advance their watermarks roughly with processing time.
● While processing a backlog, the high throughput source advances its watermark slowly and the low throughput source advances its watermark very quickly.
● As a result, join state grows while the backlog is processed.
● Watermark alignment pauses sources whose watermark has run too far ahead of the others.
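The effect of alignment on watermark skew can be sketched with a small simulation. This is a toy model, not Flink code: the read budget, the event densities, and the `max_drift` parameter are assumptions made up for illustration, and the gap between the two watermarks stands in for the join state that has to be buffered.

```python
def simulate(max_drift=None, budget=10, horizon=100):
    """Return the largest watermark gap seen while two sources drain a backlog."""
    # Hypothetical densities: events per unit of event time in each topic.
    density = {"high": 10, "low": 1}
    watermark = {"high": 0.0, "low": 0.0}
    max_gap = 0.0
    while min(watermark.values()) < horizon:
        slowest = min(watermark.values())
        for name in watermark:
            if watermark[name] >= horizon:
                continue  # this source has caught up
            # Alignment: pause a source that is too far ahead of the slowest one.
            if max_drift is not None and watermark[name] > slowest + max_drift:
                continue
            # Each source reads the same number of records per step, so the
            # sparse topic moves much further in event time.
            step = budget / density[name]
            watermark[name] = min(horizon, watermark[name] + step)
        max_gap = max(max_gap, abs(watermark["high"] - watermark["low"]))
    return max_gap


gap_without_alignment = simulate()
gap_with_alignment = simulate(max_drift=5)
```

In this model the fast source is simply skipped for a step whenever it exceeds the allowed drift, which bounds the gap regardless of how long the backlog is.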
Robustness under Backpressure
Unaligned Checkpoints (Flink 1.11-1.13) & Buffer Debloating (Flink 1.14)
[Figure: checkpoint barrier n travelling through the input buffers of two operators; alignment begins when the first barrier arrives.]
Under backpressure and at scale, checkpoint alignment can take hours, leading to checkpoint timeouts and job failures.
● Buffer debloating dramatically reduces the amount of in-flight data
● Unaligned checkpoints allow barriers to overtake in-flight data
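A minimal sketch of the two ideas, assuming a toy model in which each input channel is a list of queued records with one barrier marker. All function names and the `target_seconds` parameter are hypothetical and are not Flink APIs.

```python
BARRIER = "barrier"


def records_before_snapshot(channels, unaligned=False):
    """Return (records processed before the snapshot, records stored in it)."""
    ahead = [ch[:ch.index(BARRIER)] for ch in channels]
    if unaligned:
        # The barrier overtakes the queued records; they are persisted
        # as part of the checkpoint instead of being processed first.
        return 0, [record for queue in ahead for record in queue]
    # Aligned: everything queued ahead of any barrier is processed first.
    return sum(len(queue) for queue in ahead), []


def alignment_seconds(in_flight_bytes, throughput_bytes_per_s):
    """Rough time a barrier needs to travel through buffered data."""
    return in_flight_bytes / throughput_bytes_per_s


def debloated_in_flight(throughput_bytes_per_s, target_seconds=1.0):
    """In-flight data sized so that it drains in about target_seconds."""
    return throughput_bytes_per_s * target_seconds


channels = [[1, 2, 3, BARRIER, 4], ["a", BARRIER, "b", "c"]]
aligned = records_before_snapshot(channels)
unaligned = records_before_snapshot(channels, unaligned=True)
```

The trade-off visible here is that unaligned checkpoints start immediately but store the overtaken records, while debloating keeps alignment short by keeping less data in flight.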
Scenario
There is a large backlog of data and you want to catch up to real time again. This happens, for example, when upstream producers send data in big chunks.
Wishlist
● process backlog quickly ✓ (Adaptive Scheduler)
● process backlog robustly with existing resources ✓ (Buffer Debloating, Watermark Alignment, Unaligned Checkpoints)
Backfilling
Scenario
You want to reprocess a fixed amount of (recent) historical data to correct a bug or outage.
Wishlist
● Code-reuse for backfilling
● Same semantics and complete & correct results
● Resource efficient
[1] https://www.youtube.com/watch?v=4qSlsYogALo&t=668s
Batch is a Special Case of Streaming
- The Apache Flink Community
DataStream API with Streaming Execution
All the elasticity and robustness improvements for processing under backpressure apply here, too.
Evaluation
Stream Execution Mode
Scenario
We want to reprocess a fixed amount of (recent) historical data to correct a bug or outage.
Wishlist
● Code-reuse for backfilling ✓
● Same semantics and complete & correct results ✓
● Resource efficient
Batch Execution Mode
Implementation Timeline
● Apache Flink 1.12
○ Initial Release
○ Unified Sink API (beta)
● Apache Flink 1.13
○ Support for Python DataStream API
● Apache Flink 1.14
○ Batch Execution Mode for mixed DataStream/Table API programs
○ Unified Sink API stable
○ Unified Source API stable
● Apache Flink 1.15
○ Most Sources/Sinks migrated to unified interfaces
○ Adaptive Batch Scheduler
DataStream API with Batch Execution
Batch Execution Performance
Why is Batch Processing faster?
It all boils down to completeness & latency.
Stream Processing     Batch Processing
Data is incomplete    Data is complete
Latency SLAs          No Latency SLAs
Batch Execution vs Stream Execution
Data Exchange Mode
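The talk's summary lists the data exchange mode as pipelined for stream processing and blocking for batch processing. A minimal sketch of the difference, assuming a toy two-stage pipeline built from a Python generator (the function name and log format are made up for illustration):

```python
def run(records, blocking):
    """Run a producer and a consumer stage and log the order of work."""
    log = []

    def upstream():
        for record in records:
            log.append(f"produce {record}")
            yield record

    stage = upstream()
    if blocking:
        # Blocking exchange: materialize the full intermediate result
        # before the downstream stage starts.
        stage = list(stage)
    for record in stage:
        # Pipelined exchange: each record is consumed as soon as it exists.
        log.append(f"consume {record}")
    return log


pipelined_log = run([1, 2], blocking=False)
blocking_log = run([1, 2], blocking=True)
```

With a blocking exchange the two stages never need to run at the same time, which is one reason batch execution can get by with fewer resources.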
Batch Execution vs Stream Execution
Fault Tolerance
● Stream Processing: Checkpointing (periodic snapshots to an object store such as S3, GCS, HDFS, …)
● Batch Processing: Backtracking (to intermediate results kept on local disk or in an external shuffle service)
Batch Execution vs Stream Execution
Processing Order and State
● Stream Processing: Keys are processed simultaneously.
● Batch Processing: Keys are processed one after another.
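Why the processing order matters for state can be sketched with a per-key count. This is a toy model with made-up function names: the streaming variant has to keep state for every key it has seen, while the batch variant sorts the complete input by key and only holds state for the current key.

```python
from itertools import groupby
from operator import itemgetter


def stream_count(events):
    """Count per key in arrival order; state for all keys stays live."""
    state = {}
    peak_keys_in_state = 0
    for key, _ in events:
        state[key] = state.get(key, 0) + 1
        peak_keys_in_state = max(peak_keys_in_state, len(state))
    return state, peak_keys_in_state


def batch_count(events):
    """Count per key after sorting; only one key's state is live at a time."""
    result = {}
    peak_keys_in_state = 0
    by_key = sorted(events, key=itemgetter(0))
    for key, group in groupby(by_key, key=itemgetter(0)):
        count = 0  # state for the current key only
        for _ in group:
            count += 1
        result[key] = count  # emit and forget before the next key starts
        peak_keys_in_state = max(peak_keys_in_state, 1)
    return result, peak_keys_in_state


events = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
stream_result, stream_peak = stream_count(events)
batch_result, batch_peak = batch_count(events)
```

Both variants produce the same counts; only the amount of state held at once differs.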
Batch Execution vs Stream Execution
Time
● Does Processing Time make sense when processing historical data?
○ Not really.
○ All processing time timers fire at the end of the input.
● Does historical data arrive out-of-order?
○ No, as it is complete we can sort it by timestamp if needed.
● Do watermarks make sense in batch processing?
○ No, we don’t need them. There is no trade-off between latency and completeness.
○ Watermark jumps from -∞ to +∞. All event time timers fire at the end of the input.
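The timer behaviour described above can be sketched as follows. This is a simplified model, not Flink's timer service: each event is a hypothetical `(timestamp, timer_timestamp)` pair, and in the streaming case the watermark is assumed to equal the highest timestamp seen so far.

```python
import math


def fired_timers(events, batch):
    """Return (timer, events seen when it fired) for every registered timer."""
    watermark = -math.inf
    pending, fired = [], []

    def advance(new_watermark, seen):
        nonlocal watermark
        watermark = new_watermark
        for timer in sorted(pending):
            if timer <= watermark:
                fired.append((timer, seen))
        pending[:] = [t for t in pending if t > watermark]

    # Complete input can be sorted by timestamp before processing.
    ordered = sorted(events) if batch else list(events)
    for seen, (timestamp, timer) in enumerate(ordered, start=1):
        pending.append(timer)
        if not batch:
            # Streaming: the watermark follows the data.
            advance(max(watermark, timestamp), seen)
    # End of input: the watermark jumps to +infinity and fires what is left.
    advance(math.inf, len(ordered))
    return fired


events = [(1, 2), (3, 4), (5, 6)]
streaming_fired = fired_timers(events, batch=False)
batch_fired = fired_timers(events, batch=True)
```

In the streaming run the first timer fires after two events have been seen; in the batch run every timer fires only once all three events have been processed.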
Batch Execution vs Stream Execution
Summary
                     Stream Processing                    Batch Processing
Data Exchange Mode   Pipelined                            Blocking
Fault Tolerance      Checkpointing                        Backtracking
Processing Order     All keys simultaneously              Keys one by one
Time                 ● Events processed out-of-order      ● Events processed by event time for each key
                     ● Event and Processing Time          ● Only Event Time
                     ● Watermarks                         ● No Watermarks
Evaluation
Batch Execution Mode
Scenario
We want to reprocess a fixed amount of (recent) historical data to correct a bug or outage.
Wishlist
● Code-reuse for backfilling ✓
● Same semantics and complete & correct results ✓
● Resource efficient ✓ (Potential Caveat: Resource Consumption? See Uber Talk.)
Bootstrapping
Scenario
We want to process historical data (weeks, months, years) to build up the application's state before switching the application to real-time data.
Wishlist
● Code-reuse for bootstrapping
● Different data source for bootstrapping
● Resource efficient
[1] https://www.youtube.com/watch?v=BTWntKy_MJs
[2] https://www.youtube.com/watch?v=JQyfXEQqKeg
[3] https://www.youtube.com/watch?v=JKndMiXphzw
Bootstrapping with the Hybrid Source
Hybrid Source automates switching of sources from historical data to real-time data within a single streaming Job.
● S3: bounded source, years retention
● Kafka: unbounded source, hours retention
● When the end of the bounded input is reached, the job switches to the unbounded source.
All the elasticity and robustness improvements for processing under backpressure apply here, too.
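The switching behaviour can be sketched with plain Python generators. This is only an illustration of the idea, not the Flink connector: the source functions and record names are hypothetical stand-ins for a bounded S3 read and an unbounded Kafka read.

```python
import itertools


def bounded_source(records):
    """Stand-in for historical data; ends when the input is exhausted."""
    yield from records


def unbounded_source(start_offset):
    """Stand-in for a real-time topic; never ends."""
    for offset in itertools.count(start_offset):
        yield f"live-{offset}"


def hybrid_source(historical, live):
    """Read the bounded source to its end, then switch to the unbounded one."""
    yield from historical
    yield from live


source = hybrid_source(bounded_source(["hist-0", "hist-1"]), unbounded_source(2))
first_four = list(itertools.islice(source, 4))
```

Downstream code sees a single stream, which is what allows the same job to be reused for bootstrapping and for real-time processing.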
Evaluation
Bootstrapping with Hybrid Source
Scenario
We want to process historical data (weeks, months, years) to build up the application's state before switching the application to real-time data.
Wishlist
● Code-reuse for bootstrapping ✓
● Different data source for bootstrapping ✓
● Resource efficient
Bootstrapping with Batch Execution
● Bootstrapping Job with Batch Execution: reads from a separate data source (S3), writes to a discarding sink (/dev/null), and produces a final Savepoint.
● Real-Time Job with Stream Execution: takes the final Savepoint as initial state.
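The hand-over between the two jobs can be sketched as below. This is a toy model only: a JSON file stands in for the savepoint, the per-key running sum stands in for application state, and both job functions are hypothetical names, not Flink APIs.

```python
import json
import os
import tempfile


def bootstrap_job(historical_events, savepoint_path):
    """Build state from historical data, discard output, write a final snapshot."""
    state = {}
    for key, amount in historical_events:
        state[key] = state.get(key, 0) + amount
    with open(savepoint_path, "w") as handle:
        json.dump(state, handle)  # stand-in for the final savepoint


def real_time_job(savepoint_path, live_events):
    """Start from the snapshot and continue with real-time data."""
    with open(savepoint_path) as handle:
        state = json.load(handle)
    outputs = []
    for key, amount in live_events:
        state[key] = state.get(key, 0) + amount
        outputs.append((key, state[key]))
    return outputs


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "savepoint.json")
    bootstrap_job([("a", 1), ("b", 2), ("a", 3)], path)
    results = real_time_job(path, [("a", 10), ("c", 1)])
```

The real-time job continues from the bootstrapped totals without ever reading the historical data itself.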
Pre-Release!
Demo
Final Savepoints for Batch Jobs
Next Steps
1. Still some limitations & open questions to address in prototype
2. Publish FLIP & discuss with the Community
3. We are optimistic about Flink 1.17.
Evaluation
Bootstrapping with Batch Execution
Scenario
We want to process historical data (weeks, months, years) to build up the application's state before switching the application to real-time data.
Wishlist
● Code-reuse for bootstrapping ✓
● Different data source for bootstrapping ✓
● Resource efficient ✓
Takeaways
● Just because you are streaming doesn’t mean you can always avoid processing lots of data at once.
● Batch processing techniques are usually more resource efficient for this.
● Apache Flink has done a lot recently to make sure those two processing modes work well together in real-world applications.
● Final Savepoints for Batch Jobs is the last mile for Batch Execution in DataStream API.
Thanks
Konstantin Knauf
@snntrable
konstantin@immerok.io
CDC Stream Processing with Apache Flink
Timo Walther, 2pm, Ballroom G

Beyond Tiered Storage: Serverless Kafka with No Local Disks
 

Dernier

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 

Dernier (20)

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 

When Streaming Needs Batch With Konstantin Knauf | Current 2022

  • 1. When Streaming Needs Batch - Flink's Journey Towards a Unified Engine Konstantin Knauf – @snntrable – Current 22
  • 2. About Me ● Co-Founder @ Immerok ● Apache Flink Committer & Member of the PMC ● Formerly Head of Product & Field Engineer @ Ververica Visit us at booth S14.
  • 3. Steady State - Streaming Heaven
  • 6. Backfilling or after introducing a bug
  • 10. Agenda ● Motivation CHECK ● Setting the Stage ● Catching Up & Handling Bursty Streams ● Backfilling ● Bootstrapping ● Takeaways
  • 12. Some Terminology (nature of data vs. nature of processing) ● Bounded Data: Stream Processing CHECK, Batch Processing CHECK ● Unbounded Data: Stream Processing CHECK
  • 13. A Typical Streaming Job Apache Kafka* in/out * or any other replayable queue like Apache Pulsar/AWS Kinesis/…
  • 14. A Typical Streaming Job Flink Pipeline with multiple operators & shuffles
  • 15. A Typical Streaming Job incl. significant state
  • 20. Catching Up & Bursty Streams
  • 21. Scenario: There is a large backlog of data and you want to catch up to real-time again. For example, this happens when upstream producers send data in big chunks. Wishlist ● process backlog quickly ● process backlog robustly with existing resources
  • 22. Backlog -> Backpressure When under backpressure, what do you want? ● Scale up ● Or catch up steadily
  • 24. Scaling Up under Backpressure Adaptive Scheduler and Reactive Mode (Flink 1.13) Task Manager Task Slot Task Slot
  • 25. Scaling Up under Backpressure Adaptive Scheduler and Reactive Mode (Flink 1.13) Task Manager Task Slot Task Slot Task Manager Task Slot Task Slot
  • 26. Scaling Up under Backpressure Adaptive Scheduler and Reactive Mode (Flink 1.13) [Diagram: two Task Managers, each with two Task Slots] ● Job automatically scales up when provided with additional resources ● No additional Savepoint needed for rescaling* * Caveat: This might still take quite some time during restore. Flink 1.16 brings some more improvements in that regard.
  • 27. Backlog -> Backpressure When under backpressure, what do you want? ● Scale up ● Or catch up steadily
  • 28. Robustness under Backpressure Watermark Alignment (Flink 1.15) high throughput topic low throughput topic
  • 29. Robustness under Backpressure Watermark Alignment (Flink 1.15) advances watermark ~ with processing time advances watermark ~ with processing time
  • 30. Robustness under Backpressure Watermark Alignment (Flink 1.15) advances watermark slowly advances watermark very quickly
  • 31. Robustness under Backpressure Watermark Alignment (Flink 1.15) join state grows during processing a backlog
  • 32. Robustness under Backpressure Watermark Alignment (Flink 1.15) advances watermark slowly advances watermark very quickly
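The watermark alignment slides above describe one source's watermark racing ahead of the other's while a backlog is processed, which makes join state grow. A minimal sketch of the idea in plain Python follows. It is a toy model, not Flink's implementation: the `Source` class, the `run` function and the `max_drift` parameter are hypothetical names chosen for illustration. A source is paused whenever its watermark is more than `max_drift` ahead of the slowest source in its group.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Toy source whose watermark advances by `step` per emitted record."""
    name: str
    step: int
    watermark: int = 0
    emitted: int = 0

    def emit(self) -> None:
        self.watermark += self.step
        self.emitted += 1


def run(sources, rounds, max_drift=None):
    """Poll every source once per round and return the largest watermark gap seen.

    If max_drift is set, a source is skipped (paused) while its watermark is
    more than max_drift ahead of the slowest source in the group.
    """
    largest_gap = 0
    for _ in range(rounds):
        slowest = min(s.watermark for s in sources)
        for s in sources:
            if max_drift is not None and s.watermark > slowest + max_drift:
                continue  # paused: too far ahead of the group
            s.emit()
        gap = max(s.watermark for s in sources) - min(s.watermark for s in sources)
        largest_gap = max(largest_gap, gap)
    return largest_gap
```

Without `max_drift` the gap between the two watermarks grows with every round; with it, the gap stays bounded by roughly `max_drift` plus one step of the fast source. In this toy model the gap stands in for the amount of data a downstream join would have to buffer.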
  • 33. Robustness under Backpressure Unaligned Checkpoints (Flink 1.11-1.13) & Buffer Debloating (Flink 1.14) [Figure: checkpoint barrier n travelling through the input buffers of two operators; one operator begins aligning, the other is aligning]
  • 34. Robustness under Backpressure Unaligned Checkpoints (Flink 1.11-1.13) & Buffer Debloating (Flink 1.14) Under backpressure & at scale, checkpoint alignment can take hours, leading to checkpoint timeouts and job failures. ● Buffer debloating dramatically reduces the amount of in-flight data ● Unaligned checkpoints allow barriers to overtake in-flight data [Figure: checkpoint barrier n travelling through the input buffers of two operators; one operator begins aligning, the other is aligning]
  • 35. Scenario: There is a large backlog of data and you want to catch up to real-time again. For example, this happens when upstream producers send data in big chunks. Wishlist ● process backlog quickly CHECK (Adaptive Scheduler) ● process backlog robustly with existing resources CHECK (Buffer Debloating, Watermark Alignment, Unaligned Checkpoints)
  • 37. Scenario: You want to reprocess a fixed amount of (recent) historical data to correct a bug or outage. Wishlist ● Code-reuse for backfilling ● Same semantics and complete & correct results ● Resource efficient [1] https://www.youtube.com/watch?v=4qSlsYogALo&t=668s
  • 38. "Batch is a Special Case of Streaming" - The Apache Flink Community
  • 41. DataStream API with Streaming Execution All the elasticity and robustness improvements for processing under backpressure apply here, too.
  • 42. Evaluation: Stream Execution Mode. Scenario: We want to reprocess a fixed amount of (recent) historical data to correct a bug or outage. Wishlist ● Code-reuse for backfilling CHECK ● Same semantics and complete & correct results CHECK ● Resource efficient
  • 43. Batch Execution Mode Implementation Timeline ● Apache Flink 1.12 ○ Initial Release ○ Unified Sink API (beta) ● Apache Flink 1.13 ○ Support for Python DataStream API ● Apache Flink 1.14 ○ Batch Execution Mode for mixed DataStream/Table API programs ○ Unified Sink API stable ○ Unified Source API stable ● Apache Flink 1.15 ○ Most Sources/Sinks migrated to unified interfaces ○ Adaptive Batch Scheduler
  • 48. Why is Batch Processing faster?
  • 49. It all boils down to completeness & latency. ● Stream Processing: Data is incomplete, Latency SLAs ● Batch Processing: Data is complete, No Latency SLAs
  • 50. Batch Execution vs Stream Execution
  • 51. Batch Execution vs Stream Execution Data Exchange Mode
  • 56. Batch Execution vs Stream Execution Fault Tolerance Stream Processing
  • 57. Batch Execution vs Stream Execution Fault Tolerance Object Store (S3, GCS, HDFS, …) Periodic Snapshots Stream Processing Checkpointing
  • 58. Batch Execution vs Stream Execution Fault Tolerance Object Store (S3, GCS, HDFS, …) Periodic Snapshots Stream Processing Checkpointing Batch Processing
  • 59. Batch Execution vs Stream Execution Fault Tolerance Object Store (S3, GCS, HDFS, …) Periodic Snapshots Stream Processing Checkpointing Batch Processing
  • 60. Batch Execution vs Stream Execution Fault Tolerance Object Store (S3, GCS, HDFS, …) Periodic Snapshots Stream Processing Checkpointing Batch Processing
  • 61. Batch Execution vs Stream Execution Fault Tolerance Object Store (S3, GCS, HDFS, …) Periodic Snapshots Stream Processing Checkpointing Batch Processing Backtracking Local Disk or External Shuffle Service
  • 62. Batch Execution vs Stream Execution Processing Order and State Backends 0 0 Stream Processing
  • 63. Batch Execution vs Stream Execution Processing Order and State 0 1 Stream Processing
  • 64. Batch Execution vs Stream Execution Processing Order and State 1 1 Stream Processing
  • 65. Batch Execution vs Stream Execution Processing Order and State 1 2 Stream Processing
  • 66. Batch Execution vs Stream Execution Processing Order and State 2 2 Stream Processing
  • 67. Batch Execution vs Stream Execution Processing Order and State Stream Processing Keys are processed simultaneously. 2 2
  • 68. Batch Execution vs Stream Execution Processing Order and State 2 2 Batch Processing Stream Processing Keys are processed simultaneously.
  • 69. Batch Execution vs Stream Execution Processing Order and State 2 5 Stream Processing Keys are processed simultaneously. Batch Processing
  • 70. Batch Execution vs Stream Execution Processing Order and State 2 2 5 Stream Processing Keys are processed simultaneously. Batch Processing
  • 71. Batch Execution vs Stream Execution Processing Order and State 2 2 6 5 Stream Processing Keys are processed simultaneously. Batch Processing
  • 72. Batch Execution vs Stream Execution Processing Order and State 2 2 5 6 Stream Processing Keys are processed simultaneously. Batch Processing
  • 73. Batch Execution vs Stream Execution Processing Order and State 2 2 5 6 Stream Processing Keys are processed simultaneously. Batch Processing Keys are processed one after another.
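The processing-order slides above contrast keys being processed simultaneously (stream) with keys being processed one after another (batch). The following plain-Python sketch illustrates why that matters for state size. It is an illustration only, not Flink code: `count_streaming` and `count_batch` are hypothetical helpers that count events per key and report how many keys had to be held in state at the same time.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def count_streaming(events):
    """Streaming style: events arrive interleaved, so state for every key
    seen so far has to be kept at the same time."""
    state = defaultdict(int)
    peak_keys_in_state = 0
    for key, _value in events:
        state[key] += 1
        peak_keys_in_state = max(peak_keys_in_state, len(state))
    return dict(state), peak_keys_in_state


def count_batch(events):
    """Batch style: the input is complete, so it can be sorted by key and
    each key can be finished before the next one starts."""
    results = {}
    peak_keys_in_state = 0
    by_key = sorted(events, key=itemgetter(0))
    for key, group in groupby(by_key, key=itemgetter(0)):
        current = 0                 # state for exactly one key
        for _ in group:
            current += 1
        results[key] = current      # emit the result, then drop the state
        peak_keys_in_state = max(peak_keys_in_state, 1)
    return results, peak_keys_in_state
```

For the input `[("a", 1), ("b", 1), ("a", 2), ("c", 1), ("b", 2)]` both functions return the same counts, but the streaming variant peaks at three keys in state while the batch variant never holds more than one.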
  • 74. Batch Execution vs Stream Execution Time ● Does Processing Time make sense when processing historical data? ○ Not really. ○ All processing time timers fire at the end of the input.
  • 75. Batch Execution vs Stream Execution Time ● Does Processing Time make sense when processing historical data? ○ Not really. ○ All processing time timers fire at the end of the input. ● Does historical data arrive out-of-order? ○ No, as it is complete we can sort it by timestamp if needed.
  • 76. Batch Execution vs Stream Execution Time ● Does Processing Time make sense when processing historical data? ○ Not really. ○ All processing time timers fire at the end of the input. ● Does historical data arrive out-of-order? ○ No, as it is complete we can sort it by timestamp if needed. ● Do watermarks make sense in batch processing? ○ No, we don’t need them. There is no trade off between latency and completeness. ○ Watermark jumps from -∞ to +∞. All event time timers fire at the end of the input.
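The statement above that the watermark jumps from -∞ to +∞ in batch execution, so all event time timers fire at the end of the input, can be sketched with a small toy model. The `fire_timers` function below is a hypothetical helper, not a Flink API: a timer fires as soon as a watermark greater than or equal to its timestamp has been observed.

```python
import math


def fire_timers(timers, watermarks):
    """Return (watermark, timer) pairs in firing order.

    A timer fires once a watermark >= its timestamp has been seen.
    """
    pending = sorted(timers)
    fired = []
    for wm in watermarks:
        while pending and pending[0] <= wm:
            fired.append((wm, pending.pop(0)))
    return fired


# Streaming: watermarks advance gradually, so timers fire along the way.
streaming = fire_timers([3, 12, 18], [5, 10, 15, 20])

# Batch: a single jump from -inf to +inf, so every timer fires at the end.
batch = fire_timers([3, 12, 18], [-math.inf, math.inf])
```

In the streaming call the timers fire at watermarks 5, 15 and 20 respectively; in the batch call all three fire together once the watermark reaches +∞.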
  • 77. Batch Execution vs Stream Execution Summary ● Data Exchange Mode: Pipelined (stream) vs. Blocking (batch) ● Fault Tolerance: Checkpointing (stream) vs. Backtracking (batch) ● Processing Order: All keys simultaneously (stream) vs. Keys one by one (batch) ● Time (stream): Events processed out-of-order; Event and Processing Time; Watermarks ● Time (batch): Events processed by event time for each key; Only Event Time; No Watermarks
  • 78. Evaluation: Batch Execution Mode. Scenario: We want to reprocess a fixed amount of (recent) historical data to correct a bug or outage. Wishlist ● Code-reuse for backfilling CHECK ● Same semantics and complete & correct results CHECK ● Resource efficient CHECK (Potential Caveat: Resource Consumption? See Uber Talk.)
  • 80. Scenario: We want to process historical data (weeks, months, years) to build up the application's state before switching the application to real-time data. Wishlist ● Code-reuse for bootstrapping ● Different data source for bootstrapping ● Resource efficient [1] https://www.youtube.com/watch?v=BTWntKy_MJs [2] https://www.youtube.com/watch?v=JQyfXEQqKeg [3] https://www.youtube.com/watch?v=JKndMiXphzw
  • 81. Bootstrapping with the Hybrid Source Hybrid Source automates switching of sources from historical data to real-time data within a single streaming Job. [Diagram: S3 (years retention) followed by Kafka (hours retention)]
  • 82. Bootstrapping with the Hybrid Source Hybrid Source automates switching of sources from historical data to real-time data within a single streaming Job. [Diagram: S3 (years retention, Bounded Source) followed by Kafka (hours retention, Unbounded Source)]
  • 83. Bootstrapping with the Hybrid Source Hybrid Source automates switching of sources from historical data to real-time data within a single streaming Job. All the elasticity and robustness improvements for processing under backpressure apply here, too. [Diagram: S3 (years retention) followed by Kafka (hours retention)]
  • 84. Bootstrapping with the Hybrid Source Hybrid Source automates switching of sources from historical data to real-time data within a single streaming Job. End of Input Reached [Diagram: S3 (years retention) followed by Kafka (hours retention)]
  • 85. Bootstrapping with the Hybrid Source Hybrid Source automates switching of sources from historical data to real-time data within a single streaming Job. [Diagram: S3 (years retention) followed by Kafka (hours retention)]
  • 86. Bootstrapping with the Hybrid Source
  • 87. Bootstrapping with the Hybrid Source
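The switching behaviour described on the Hybrid Source slides (read the bounded historical source to its end, then continue with the unbounded real-time source inside the same job) can be sketched with Python generators. This is a conceptual toy, not Flink's Hybrid Source: `bounded_source`, `unbounded_source` and `hybrid_source` are hypothetical names, and the records are made-up tuples.

```python
from itertools import chain, count, islice


def bounded_source(records):
    """Historical data (e.g. files with years of retention): ends after the last record."""
    yield from records


def unbounded_source(start_offset):
    """Real-time data (e.g. a topic with hours of retention): never ends."""
    for offset in count(start_offset):
        yield ("live", offset)


def hybrid_source(historical, live):
    """Read the bounded source until end of input, then switch to the unbounded one."""
    return chain(historical, live)


history = bounded_source([("hist", 0), ("hist", 1), ("hist", 2)])
stream = hybrid_source(history, unbounded_source(start_offset=3))

# Take only the first five records; the hybrid stream itself never ends.
first_five = list(islice(stream, 5))
```

Here `first_five` contains the three historical records followed by the first two live records. The consumer sees one continuous stream and does not need to know where the switch happened, which is the property the slides highlight.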
  • 88. Evaluation: Bootstrapping with Hybrid Source. Scenario: We want to process historical data (weeks, months, years) to build up the application's state before switching the application to real-time data. Wishlist ● Code-reuse for bootstrapping 🗸 ● Different data source for bootstrapping 🗸 ● Resource efficient
  • 89. Bootstrapping with Batch Execution /dev/null Bootstrapping Job With Batch Execution Separate Data Source Discarding Sink S3
  • 90. Bootstrapping with Batch Execution Savepoint /dev/null Bootstrapping Job With Batch Execution Separate Data Source Discarding Sink produces a final Savepoint S3
  • 91. Bootstrapping with Batch Execution Savepoint /dev/null Bootstrapping Job With Batch Execution Separate Data Source Discarding Sink Real-Time Job With Stream Execution produces a final Savepoint takes a final Savepoint as initial state S3 Pre-Release!
  • 92. Demo
  • 93. Demo
  • 94. Final Savepoints for Batch Jobs Next Steps 1. Still some limitations & open questions to address in prototype 2. Publish FLIP & discuss with the Community 3. We are optimistic about Flink 1.17.
  • 95. Evaluation: Bootstrapping with Batch Execution. Scenario: We want to process historical data (weeks, months, years) to build up the application's state before switching the application to real-time data. Wishlist ● Code-reuse for bootstrapping CHECK ● Different data source for bootstrapping CHECK ● Resource efficient CHECK
  • 97. Takeaways ● Just because you are streaming, doesn’t mean you can always avoid processing lots of data at once
  • 98. Takeaways ● Just because you are streaming, doesn’t mean you can always avoid processing lots of data at once ● Batch processing techniques are usually more resource efficient for this.
  • 99. Takeaways ● Just because you are streaming, doesn’t mean you can always avoid processing lots of data at once ● Batch processing techniques are usually more resource efficient for this. ● Apache Flink has done a lot recently to make sure those two processing modes work well together in real-world applications.
  • 100. Takeaways ● Just because you are streaming, doesn’t mean you can always avoid processing lots of data at once ● Batch processing techniques are usually more resource efficient for this. ● Apache Flink has done a lot recently to make sure those two processing modes work well together in real-world applications. ● Final Savepoints for Batch Jobs is the last mile for Batch Execution in DataStream API.
  • 101. Thanks Konstantin Knauf @snntrable konstantin@immerok.io CDC Stream Processing with Apache Flink Timo Walther, 2pm, Ballroom G