SlideShare une entreprise Scribd logo
1  sur  34
1
Timo Walther
Apache Flink PMC
@twalthr
Flink Forward @ San Francisco - April 11th, 2017
Table & SQL API
unified APIs for batch and stream processing
Motivation
2
DataStream API is great…
3
 Very expressive stream processing
• Transform data, update state, define windows, aggregate, etc.
 Highly customizable windowing logic
• Assigners, Triggers, Evictors, Lateness
 Asynchronous I/O
• Improve communication to external systems
 Low-level Operations
• ProcessFunction gives access to timestamps and timers
… but it is not for Everyone!
4
 Writing DataStream programs is not always easy
• Stream processing technology spreads rapidly
• New streaming concepts (time, state, windows, ...)
 Requires knowledge & skill
• Continous applications have special requirements
• Programming experience (Java / Scala)
 Users want to focus on their business logic
Why not a Relational API?
5
 Relational API is declarative
• User says what is needed, system decides how to compute it
 Queries can be effectively optimized
• Less black-boxes, well-researched field
 Queries are efficiently executed
• Let Flink handle state, time, and common mistakes
 ”Everybody” knows and uses SQL!
Goals
 Easy, declarative, and concise relational API
 Tool for a wide range of use cases
 Relational API as a unifying layer
• Queries on batch tables terminate and produce a finite result
• Queries on streaming tables run continuously and produce result
stream
 Same syntax & semantics for both queries
6
Table API & SQL
7
Table API & SQL
 Flink features two relational APIs
• Table API: LINQ-style API for Java & Scala (since Flink 0.9.0)
• SQL: Standard SQL (since Flink 1.1.0)
8
DataSet API DataStream API
Table API
SQL
Flink Dataflow Runtime
Table API & SQL Example
9
val tEnv = TableEnvironment.getTableEnvironment(env)
// configure your data source
val customerSource = CsvTableSource.builder()
.path("/path/to/customer_data.csv")
.field("name", Types.STRING).field("prefs", Types.STRING)
.build()
// register as a table
tEnv.registerTableSource(”cust", customerSource)
// define your table program
val table = tEnv.scan("cust").select('name.lowerCase(), myParser('prefs))
val table = tEnv.sql("SELECT LOWER(name), myParser(prefs) FROM cust")
// convert
val ds: DataStream[Customer] = table.toDataStream[Customer]
Windowing in Table API
10
val sensorData: DataStream[(String, Long, Double)] = ???
// convert DataStream into Table
val sensorTable: Table = sensorData
.toTable(tableEnv, 'location, 'rowtime, 'tempF)
// define query on Table
val avgTempCTable: Table = sensorTable
.window(Tumble over 1.day on 'rowtime as 'w)
.groupBy('location, ’w)
.select('w.start as 'day,
'location,
(('tempF.avg - 32) * 0.556) as 'avgTempC)
.where('location like "room%")
Windowing in SQL
11
val sensorData: DataStream[(String, Long, Double)] = ???
// register DataStream
tableEnv.registerDataStream(
"sensorData", sensorData, 'location, 'rowtime, 'tempF)
// query registered Table
val avgTempCTable: Table = tableEnv.sql("""
SELECT TUMBLE_START(TUMBLE(time, INTERVAL '1' DAY) AS day,
location,
AVG((tempF - 32) * 0.556) AS avgTempC
FROM sensorData
WHERE location LIKE 'room%’
GROUP BY location, TUMBLE(time, INTERVAL '1' DAY)
""")
Architecture
2 APIs [SQL, Table API]
*
2 backends [DataStream, DataSet]
=
4 different translation paths?
12
Architecture
13
DataSet Rules
DataSet PlanDataSet DataStreamDataStream Plan
DataStream Rules
Calcite Catalog
Calcite Logical Plan
Calcite Optimizer
Calcite
Parser & Validator
Table API SQL API
DataSet
Table
Sources
DataStream
Table API Validator
Architecture
14
DataSet Rules
DataSet PlanDataSet DataStreamDataStream Plan
DataStream Rules
Calcite Catalog
Calcite Logical Plan
Calcite Optimizer
Calcite
Parser & Validator
Table API SQL API
DataSet
Table
Sources
DataStream
Table API Validator
Architecture
15
DataSet Rules
DataSet PlanDataSet DataStreamDataStream Plan
DataStream Rules
Calcite Catalog
Calcite Logical Plan
Calcite Optimizer
Calcite
Parser & Validator
Table API SQL API
DataSet
Table
Sources
DataStream
Table API Validator
Architecture
16
DataSet Rules
DataSet PlanDataSet DataStreamDataStream Plan
DataStream Rules
Calcite Catalog
Calcite Logical Plan
Calcite Optimizer
Calcite
Parser & Validator
Table API SQL API
DataSet
Table
Sources
DataStream
Table API Validator
Translation to Logical Plan
17
sensorTable
.window(Tumble over 1.day on 'rowtime as 'w)
.groupBy('location, ’w)
.select(
'w.start as 'day,
'location,
(('tempF.avg - 32) *
0.556) as 'avgTempC)
.where('location like "room%")
Catalog Node
Window Aggregate
Project
Filter
Logical Table Scan
Logical Window
Aggregate
Logical Project
Logical Filter
Table Nodes Calcite Logical Plan
Table API Validation
Translation
Translation to DataStream Plan
18
Logical Table Scan
Logical Window
Aggregate
Logical Project
Logical Filter
Calcite Logical Plan
Logical Table Scan
Logical Window
Aggregate
Logical Calc
Optimized Plan
DataStream Scan
DataStream Calc
DataStream
Aggregate
DataStream Plan
Optimize
Transform
Translation to Flink Program
19
DataStream Scan
DataStream Calc
DataStream
Aggregate
DataStream Plan
(Forwarding)
FlatMap Function
Aggregate & Window
Function
DataStream Program
Translate &
Code-generate
Current State (in master)
 Batch support
• Selection, Projection, Sort, Inner & Outer Joins, Set operations
• Group-Windows for Slide, Tumble, Session
 Streaming support
• Selection, Projection, Union
• Group-Windows for Slide, Tumble, Session
• Different SQL OVER-Windows (RANGE/ROWS)
 UDFs, UDTFs, custom rules
20
Use Cases for Streaming SQL
 Continuous ETL & Data Import
 Live Dashboards & Reports
21
Outlook: Dynamic Tables
22
Dynamic Tables Model
 Dynamic tables change over time
 Dynamic tables are treated like static batch tables
• Dynamic tables are queried with standard SQL / Table API
• Every query returns another Dynamic Table
 “Stream / Table Duality”
• Stream ←→ Dynamic Table
conversions without information loss
23
Stream to Dynamic Table
 Append Mode:
 Update Mode:
24
Querying Dynamic Tables
 Dynamic tables change over time
• A[t]: Table A at specific point in time t
 Dynamic tables are queried with relational semantics
• Result of a query changes as input table changes
• q(A[t]): Evaluate query q on table A at time t
 Query result is continuously updated as t progresses
• Similar to maintaining a materialized view
• t is current event time
25
Querying a Dynamic Table
26
Querying a Dynamic Table
27
Querying a Dynamic Table
 Can we run any query on Dynamic Tables? No!
 State may not grow infinitely as more data arrives
• Set clean-up timeout or key constraints.
 Input may only trigger partial re-computation
 Queries with possibly unbounded state or computation
are rejected
28
Dynamic Table to Stream
 Convert Dynamic Table modifications into stream
messages
 Similar to database logging techniques
• Undo: previous value of a modified element
• Redo: new value of a modified element
• Undo+Redo: old and the new value of a changed element
 For Dynamic Tables: Redo or Undo+Redo
29
Dynamic Table to Stream
 Undo+Redo Stream (because A is in Append Mode):
30
Dynamic Table to Stream
 Redo Stream (because A is in Update Mode):
31
Result computation & refinement
32
First result
(end – x)
Last result
(end + x)
State is purged.
Late updates
(on new data)
Update rate
(every x)
Complete
result
(end + x)
Complete result can be computed
(end)
Contributions welcome!
 Huge interest and many contributors
• Adding more window operators
• Introducing dynamic tables
 And there is a lot more to do
• New operators and features for streaming and batch
• Performance improvements
• Tooling and integration
 Try it out, give feedback, and start contributing!
33
3
Thank you!
@twalthr
@ApacheFlink
@dataArtisans

Contenu connexe

Tendances

Tendances (20)

Hoodie - DataEngConf 2017
Hoodie - DataEngConf 2017Hoodie - DataEngConf 2017
Hoodie - DataEngConf 2017
 
Ceph issue 해결 사례
Ceph issue 해결 사례Ceph issue 해결 사례
Ceph issue 해결 사례
 
Kafka 101
Kafka 101Kafka 101
Kafka 101
 
Building Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeBuilding Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta Lake
 
DoS and DDoS mitigations with eBPF, XDP and DPDK
DoS and DDoS mitigations with eBPF, XDP and DPDKDoS and DDoS mitigations with eBPF, XDP and DPDK
DoS and DDoS mitigations with eBPF, XDP and DPDK
 
Page cache in Linux kernel
Page cache in Linux kernelPage cache in Linux kernel
Page cache in Linux kernel
 
SQL-on-Hadoop Tutorial
SQL-on-Hadoop TutorialSQL-on-Hadoop Tutorial
SQL-on-Hadoop Tutorial
 
Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old Secrets
 
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of Software
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdf
 
Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptx
 
Deep dive into stateful stream processing in structured streaming by Tathaga...
Deep dive into stateful stream processing in structured streaming  by Tathaga...Deep dive into stateful stream processing in structured streaming  by Tathaga...
Deep dive into stateful stream processing in structured streaming by Tathaga...
 
Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)
 
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in FlinkMaxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
Maxim Fateev - Beyond the Watermark- On-Demand Backfilling in Flink
 
A Deep Dive into Structured Streaming: Apache Spark Meetup at Bloomberg 2016
A Deep Dive into Structured Streaming:  Apache Spark Meetup at Bloomberg 2016 A Deep Dive into Structured Streaming:  Apache Spark Meetup at Bloomberg 2016
A Deep Dive into Structured Streaming: Apache Spark Meetup at Bloomberg 2016
 
Practical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsPractical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobs
 
Apache Flink Stream Processing
Apache Flink Stream ProcessingApache Flink Stream Processing
Apache Flink Stream Processing
 

Similaire à Flink Forward SF 2017: Timo Walther - Table & SQL API – unified APIs for batch and stream processing

Fabian Hueske - Stream Analytics with SQL on Apache Flink
Fabian Hueske - Stream Analytics with SQL on Apache FlinkFabian Hueske - Stream Analytics with SQL on Apache Flink
Fabian Hueske - Stream Analytics with SQL on Apache Flink
Ververica
 

Similaire à Flink Forward SF 2017: Timo Walther - Table & SQL API – unified APIs for batch and stream processing (20)

Timo Walther - Table & SQL API - unified APIs for batch and stream processing
Timo Walther - Table & SQL API - unified APIs for batch and stream processingTimo Walther - Table & SQL API - unified APIs for batch and stream processing
Timo Walther - Table & SQL API - unified APIs for batch and stream processing
 
Apache Flink's Table & SQL API - unified APIs for batch and stream processing
Apache Flink's Table & SQL API - unified APIs for batch and stream processingApache Flink's Table & SQL API - unified APIs for batch and stream processing
Apache Flink's Table & SQL API - unified APIs for batch and stream processing
 
Towards sql for streams
Towards sql for streamsTowards sql for streams
Towards sql for streams
 
Fabian Hueske - Stream Analytics with SQL on Apache Flink
Fabian Hueske - Stream Analytics with SQL on Apache FlinkFabian Hueske - Stream Analytics with SQL on Apache Flink
Fabian Hueske - Stream Analytics with SQL on Apache Flink
 
Stream Analytics with SQL on Apache Flink
 Stream Analytics with SQL on Apache Flink Stream Analytics with SQL on Apache Flink
Stream Analytics with SQL on Apache Flink
 
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
Flink Forward Berlin 2017: Fabian Hueske - Using Stream and Batch Processing ...
 
Apache Flink @ Tel Aviv / Herzliya Meetup
Apache Flink @ Tel Aviv / Herzliya MeetupApache Flink @ Tel Aviv / Herzliya Meetup
Apache Flink @ Tel Aviv / Herzliya Meetup
 
Apache Flink Overview at SF Spark and Friends
Apache Flink Overview at SF Spark and FriendsApache Flink Overview at SF Spark and Friends
Apache Flink Overview at SF Spark and Friends
 
Data all over the place! How SQL and Apache Calcite bring sanity to streaming...
Data all over the place! How SQL and Apache Calcite bring sanity to streaming...Data all over the place! How SQL and Apache Calcite bring sanity to streaming...
Data all over the place! How SQL and Apache Calcite bring sanity to streaming...
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
 
Distributed Real-Time Stream Processing: Why and How 2.0
Distributed Real-Time Stream Processing:  Why and How 2.0Distributed Real-Time Stream Processing:  Why and How 2.0
Distributed Real-Time Stream Processing: Why and How 2.0
 
Data Stream Analytics - Why they are important
Data Stream Analytics - Why they are importantData Stream Analytics - Why they are important
Data Stream Analytics - Why they are important
 
Streaming SQL Foundations: Why I ❤ Streams+Tables
Streaming SQL Foundations: Why I ❤ Streams+TablesStreaming SQL Foundations: Why I ❤ Streams+Tables
Streaming SQL Foundations: Why I ❤ Streams+Tables
 
Flink's SQL Engine: Let's Open the Engine Room!
Flink's SQL Engine: Let's Open the Engine Room!Flink's SQL Engine: Let's Open the Engine Room!
Flink's SQL Engine: Let's Open the Engine Room!
 
Fabian Hueske - Taking a look under the hood of Apache Flink’s relational APIs
Fabian Hueske - Taking a look under the hood of Apache Flink’s relational APIsFabian Hueske - Taking a look under the hood of Apache Flink’s relational APIs
Fabian Hueske - Taking a look under the hood of Apache Flink’s relational APIs
 
Taking a look under the hood of Apache Flink's relational APIs.
Taking a look under the hood of Apache Flink's relational APIs.Taking a look under the hood of Apache Flink's relational APIs.
Taking a look under the hood of Apache Flink's relational APIs.
 
Fast federated SQL with Apache Calcite
Fast federated SQL with Apache CalciteFast federated SQL with Apache Calcite
Fast federated SQL with Apache Calcite
 
Distributed Real-Time Stream Processing: Why and How: Spark Summit East talk ...
Distributed Real-Time Stream Processing: Why and How: Spark Summit East talk ...Distributed Real-Time Stream Processing: Why and How: Spark Summit East talk ...
Distributed Real-Time Stream Processing: Why and How: Spark Summit East talk ...
 
Distributed Stream Processing - Spark Summit East 2017
Distributed Stream Processing - Spark Summit East 2017Distributed Stream Processing - Spark Summit East 2017
Distributed Stream Processing - Spark Summit East 2017
 
What's new with Apache Spark's Structured Streaming?
What's new with Apache Spark's Structured Streaming?What's new with Apache Spark's Structured Streaming?
What's new with Apache Spark's Structured Streaming?
 

Plus de Flink Forward

Plus de Flink Forward (20)

“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
“Alexa, be quiet!”: End-to-end near-real time model building and evaluation i...
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
 
Introducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes OperatorIntroducing the Apache Flink Kubernetes Operator
Introducing the Apache Flink Kubernetes Operator
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli...
 
One sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async SinkOne sink to rule them all: Introducing the new Async Sink
One sink to rule them all: Introducing the new Async Sink
 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at Pinterest
 
Apache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraApache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native Era
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in Flink
 
Using the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production DeploymentUsing the New Apache Flink Kubernetes Operator in a Production Deployment
Using the New Apache Flink Kubernetes Operator in a Production Deployment
 
The Current State of Table API in 2022
The Current State of Table API in 2022The Current State of Table API in 2022
The Current State of Table API in 2022
 
Flink SQL on Pulsar made easy
Flink SQL on Pulsar made easyFlink SQL on Pulsar made easy
Flink SQL on Pulsar made easy
 
Dynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
 
Processing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial ServicesProcessing Semantically-Ordered Streams in Financial Services
Processing Semantically-Ordered Streams in Financial Services
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
 
Batch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & IcebergBatch Processing at Scale with Flink & Iceberg
Batch Processing at Scale with Flink & Iceberg
 
Welcome to the Flink Community!
Welcome to the Flink Community!Welcome to the Flink Community!
Welcome to the Flink Community!
 
Extending Flink SQL for stream processing use cases
Extending Flink SQL for stream processing use casesExtending Flink SQL for stream processing use cases
Extending Flink SQL for stream processing use cases
 
The top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scaleThe top 3 challenges running multi-tenant Flink at scale
The top 3 challenges running multi-tenant Flink at scale
 

Dernier

Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
gajnagarg
 
Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...
gajnagarg
 
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
amitlee9823
 
➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...
➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men  🔝Sambalpur🔝   Esc...➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men  🔝Sambalpur🔝   Esc...
➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...
amitlee9823
 
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night StandCall Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
amitlee9823
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
amitlee9823
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
amitlee9823
 

Dernier (20)

Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
 
Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...
 
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
 
➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...
➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men  🔝Sambalpur🔝   Esc...➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men  🔝Sambalpur🔝   Esc...
➥🔝 7737669865 🔝▻ Sambalpur Call-girls in Women Seeking Men 🔝Sambalpur🔝 Esc...
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night StandCall Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Shivaji Nagar ☎ 7737669865 🥵 Book Your One night Stand
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
 

Flink Forward SF 2017: Timo Walther - Table & SQL API – unified APIs for batch and stream processing

  • 1. 1 Timo Walther Apache Flink PMC @twalthr Flink Forward @ San Francisco - April 11th, 2017 Table & SQL API unified APIs for batch and stream processing
  • 3. DataStream API is great… 3  Very expressive stream processing • Transform data, update state, define windows, aggregate, etc.  Highly customizable windowing logic • Assigners, Triggers, Evictors, Lateness  Asynchronous I/O • Improve communication to external systems  Low-level Operations • ProcessFunction gives access to timestamps and timers
  • 4. … but it is not for Everyone! 4  Writing DataStream programs is not always easy • Stream processing technology spreads rapidly • New streaming concepts (time, state, windows, ...)  Requires knowledge & skill • Continous applications have special requirements • Programming experience (Java / Scala)  Users want to focus on their business logic
  • 5. Why not a Relational API? 5  Relational API is declarative • User says what is needed, system decides how to compute it  Queries can be effectively optimized • Less black-boxes, well-researched field  Queries are efficiently executed • Let Flink handle state, time, and common mistakes  ”Everybody” knows and uses SQL!
  • 6. Goals  Easy, declarative, and concise relational API  Tool for a wide range of use cases  Relational API as a unifying layer • Queries on batch tables terminate and produce a finite result • Queries on streaming tables run continuously and produce result stream  Same syntax & semantics for both queries 6
  • 7. Table API & SQL 7
  • 8. Table API & SQL  Flink features two relational APIs • Table API: LINQ-style API for Java & Scala (since Flink 0.9.0) • SQL: Standard SQL (since Flink 1.1.0) 8 DataSet API DataStream API Table API SQL Flink Dataflow Runtime
  • 9. Table API & SQL Example 9 val tEnv = TableEnvironment.getTableEnvironment(env) // configure your data source val customerSource = CsvTableSource.builder() .path("/path/to/customer_data.csv") .field("name", Types.STRING).field("prefs", Types.STRING) .build() // register as a table tEnv.registerTableSource(”cust", customerSource) // define your table program val table = tEnv.scan("cust").select('name.lowerCase(), myParser('prefs)) val table = tEnv.sql("SELECT LOWER(name), myParser(prefs) FROM cust") // convert val ds: DataStream[Customer] = table.toDataStream[Customer]
  • 10. Windowing in Table API 10 val sensorData: DataStream[(String, Long, Double)] = ??? // convert DataStream into Table val sensorTable: Table = sensorData .toTable(tableEnv, 'location, 'rowtime, 'tempF) // define query on Table val avgTempCTable: Table = sensorTable .window(Tumble over 1.day on 'rowtime as 'w) .groupBy('location, ’w) .select('w.start as 'day, 'location, (('tempF.avg - 32) * 0.556) as 'avgTempC) .where('location like "room%")
  • 11. Windowing in SQL 11 val sensorData: DataStream[(String, Long, Double)] = ??? // register DataStream tableEnv.registerDataStream( "sensorData", sensorData, 'location, 'rowtime, 'tempF) // query registered Table val avgTempCTable: Table = tableEnv.sql(""" SELECT TUMBLE_START(TUMBLE(time, INTERVAL '1' DAY) AS day, location, AVG((tempF - 32) * 0.556) AS avgTempC FROM sensorData WHERE location LIKE 'room%’ GROUP BY location, TUMBLE(time, INTERVAL '1' DAY) """)
  • 12. Architecture 2 APIs [SQL, Table API] * 2 backends [DataStream, DataSet] = 4 different translation paths? 12
  • 13. Architecture 13 DataSet Rules DataSet PlanDataSet DataStreamDataStream Plan DataStream Rules Calcite Catalog Calcite Logical Plan Calcite Optimizer Calcite Parser & Validator Table API SQL API DataSet Table Sources DataStream Table API Validator
  • 14. Architecture 14 DataSet Rules DataSet PlanDataSet DataStreamDataStream Plan DataStream Rules Calcite Catalog Calcite Logical Plan Calcite Optimizer Calcite Parser & Validator Table API SQL API DataSet Table Sources DataStream Table API Validator
  • 15. Architecture 15 DataSet Rules DataSet PlanDataSet DataStreamDataStream Plan DataStream Rules Calcite Catalog Calcite Logical Plan Calcite Optimizer Calcite Parser & Validator Table API SQL API DataSet Table Sources DataStream Table API Validator
  • 16. Architecture 16 DataSet Rules DataSet PlanDataSet DataStreamDataStream Plan DataStream Rules Calcite Catalog Calcite Logical Plan Calcite Optimizer Calcite Parser & Validator Table API SQL API DataSet Table Sources DataStream Table API Validator
  • 17. Translation to Logical Plan 17 sensorTable .window(Tumble over 1.day on 'rowtime as 'w) .groupBy('location, ’w) .select( 'w.start as 'day, 'location, (('tempF.avg - 32) * 0.556) as 'avgTempC) .where('location like "room%") Catalog Node Window Aggregate Project Filter Logical Table Scan Logical Window Aggregate Logical Project Logical Filter Table Nodes Calcite Logical Plan Table API Validation Translation
  • 18. Translation to DataStream Plan 18 Logical Table Scan Logical Window Aggregate Logical Project Logical Filter Calcite Logical Plan Logical Table Scan Logical Window Aggregate Logical Calc Optimized Plan DataStream Scan DataStream Calc DataStream Aggregate DataStream Plan Optimize Transform
  • 19. Translation to Flink Program 19 DataStream Scan DataStream Calc DataStream Aggregate DataStream Plan (Forwarding) FlatMap Function Aggregate & Window Function DataStream Program Translate & Code-generate
  • 20. Current State (in master)  Batch support • Selection, Projection, Sort, Inner & Outer Joins, Set operations • Group-Windows for Slide, Tumble, Session  Streaming support • Selection, Projection, Union • Group-Windows for Slide, Tumble, Session • Different SQL OVER-Windows (RANGE/ROWS)  UDFs, UDTFs, custom rules 20
  • 21. Use Cases for Streaming SQL  Continuous ETL & Data Import  Live Dashboards & Reports 21
  • 23. Dynamic Tables Model  Dynamic tables change over time  Dynamic tables are treated like static batch tables • Dynamic tables are queried with standard SQL / Table API • Every query returns another Dynamic Table  “Stream / Table Duality” • Stream ←→ Dynamic Table conversions without information loss 23
  • 24. Stream to Dynamic Table  Append Mode:  Update Mode: 24
  • 25. Querying Dynamic Tables  Dynamic tables change over time • A[t]: Table A at specific point in time t  Dynamic tables are queried with relational semantics • Result of a query changes as input table changes • q(A[t]): Evaluate query q on table A at time t  Query result is continuously updated as t progresses • Similar to maintaining a materialized view • t is current event time 25
  • 26. Querying a Dynamic Table 26
  • 27. Querying a Dynamic Table 27
  • 28. Querying a Dynamic Table  Can we run any query on Dynamic Tables? No!  State may not grow infinitely as more data arrives • Set clean-up timeout or key constraints.  Input may only trigger partial re-computation  Queries with possibly unbounded state or computation are rejected 28
  • 29. Dynamic Table to Stream  Convert Dynamic Table modifications into stream messages  Similar to database logging techniques • Undo: previous value of a modified element • Redo: new value of a modified element • Undo+Redo: old and the new value of a changed element  For Dynamic Tables: Redo or Undo+Redo 29
  • 30. Dynamic Table to Stream  Undo+Redo Stream (because A is in Append Mode): 30
  • 31. Dynamic Table to Stream  Redo Stream (because A is in Update Mode): 31
  • 32. Result computation & refinement 32 First result (end – x) Last result (end + x) State is purged. Late updates (on new data) Update rate (every x) Complete result (end + x) Complete result can be computed (end)
  • 33. Contributions welcome!  Huge interest and many contributors • Adding more window operators • Introducing dynamic tables  And there is a lot more to do • New operators and features for streaming and batch • Performance improvements • Tooling and integration  Try it out, give feedback, and start contributing! 33

Notes de l'éditeur

  1. DATASTREAM: event-time semantics, stateful exactly-once processing, high throughput & low latency at the same time compute exact and deterministic results in real-time ASYNC: enrich stream events with data stored in a database, communication delay with external system does not dominate the streaming application’s total work
  2. There is a talent gap. SKILL: Memory-bound Handling of timestamps and watermarks in ProcessFunctions API which quickly solves 80% of their use cases where simple tasks can be defined using little code. BUSINESS: No null support, no timestamps, no common tools for string normalization etc.
  3. Users do not specify implementation. UDF: great for expressiveness, bad for optimization - need for manual tuning ProcessFunctions implemented in Flink handle state and time. SQL is the most widely used language for data analytics SQL would make stream processing much more accessible
  4. Flink is a platform for distributed stream and batch data processing users only need to learn a single API a query produces exactly the same result regardless whether its input is static batch data or streaming data
  5. TABLE API: - Language INtegrated Query (LINQ) API - Queries are not embedded as String - Centered around Table objects - Operations are applied on Tables and return a Table SQL: - Standard SQL - Queries are embedded as Strings into programs - Referenced tables must be registered - Queries return a Table object - Integration with Table API Equivalent feature set (at the moment) Table API and SQL can be mixed Both are tightly integrated with Flink’s core API often referred to as the Table API
  6. Works for both batch and stream.
  7. Works for both batch and stream.
  8. Maintain standard SQL Compliant
  9. Apache Calcite is a SQL parsing and query optimizer framework Used by many other projects to parse and optimize SQL queries Apache Drill, Apache Hive, Apache Kylin, Cascading, …
  10. Tables, columns, and data types are stored in a central catalog Tables can be created from DataSets DataStreams TableSources (without going through DataSet/DataStream API)
  11. Table API and SQL queries are translated into common logical plan representation.
  12. Logical plans are translated and optimized depending on execution backend. Plans are transformed into DataSet or DataStream programs.
  13. API calls are translated into logical operators and immediately validated API operators compose a tree Before optimization, the API operator tree is translated into a logical Calcite plan
  14. Calcite provides many optimization rules Custom rules to transform logical nodes into Flink nodes DataSet rules to translate batch queries DataStream rules to translate streaming queries Constant expression reduction
  15. into DataStream or DataSet operators With continous queries in mind! Janino Compiler Framework Arriving data is incrementally aggregated using the given aggregate function. This means that the window function typically has only a single value to process when called.
  16. RANGE UNBOUNDED preceding ROWS BETWEEN 5 preceding AND CURRENT ROW RANGE BETWEEN INTERVAL '1' SECOND preceding AND CURRENT ROW BUT: relational processing of batch is well defined and understood not all relational operators can be naively applied on streams no widely accepted definition of the semantics for relational processing of streaming data all supported operators have in common that they never update result records which have been emitted not an issue for record-at-a-time operators such as projection and filter affects operators that collect and process multiple records: emitted results cannot be updated, late event are discarded
  17. emit data to log-style system where emitted data cannot be updated (results cannot be refined) only append operations, no updates or delete many streaming analytics use cases that need to update results no discarding possible early results needed can be updated and refined analyze and explore streaming data in a real-time fashion
  18. vastly increase the scope of the APIs and the range of supported use cases can be challenging to implement using the DataStream API
  19. It is important to note that this is only the logical model and does not imply how the query is actually executed RETURNS: runs continuously and produces a table that is continuously updated -> Dynamic Table result-updating queries But: we must of course preserve the unified semantics for stream and batch inputs
  20. we have to specify how the records of a stream modify the dynamic table APPEND: each stream record is an insert modification conceptually the dynamic table is ever-growing and infinite in size UPDATE/REPLACE: specify a unique key attribute stream record can represent an insert, update, or delete modification append mode is in fact a special case of update mode
  21. Let’s imagine we take a snapshot of a dynamic table at a specific point in time. Snapshot is like a regular static batch table. PROGRESS: If we repeatedly compute the result of a query on snapshots for progressing points in time, we obtain many static result tables. They are changing over time and effectively constitute a dynamic table. At each point in time t, the result table is equivalent to a batch query on the dynamic table A at time t.
  22. APPENDMODE: query continuously updates result rows that it had previously emitted instead of only adding new rows size of the result table depends on the number of distinct grouping keys of the input table NOTE: As long as it is not emitted these changes are completely internal and not visible to a user. changes materialize when a dynamic table is emitted
  23. In contrast to the result of the first example, the resulting table grows relative to the time. While the non-windowed query (mostly) updates rows of the result table, the windowed aggregation query only appends new rows to the result table.
  24. only those that can be continuously, incrementally, and efficiently computed
  25. Traditional database systems use logs to rebuild tables in case of failures and for replication. UNDO logs contain the previous value of a modified element to revert incomplete transactions REDO logs contain the new value of a modified element to redo lost changes of completed transactions UNDO/REDO logs contain the old and the new value of a changed element
  26. An insert modification is emitted as an insert message with the new row, a delete modification is emitted as a delete message with the old row, and an update modification is emitted as a delete message with the old row and an insert message with the new row Current processing model is a subset of the dynamic table model. downstream operators or data sinks need to be able to correctly handle both types of messages
  27. only tables with unique keys can have update and delete modifications. the downstream operators need to be able to access previous values by key.
  28. val outputOpts = OutputOptions()  .firstResult(-15.minutes)    // first result 15 mins early  .completeResult(+5.minutes)  // complete result 5 mins late  .updateRate(3.minutes)       // result is updated every 3 mins  .lateUpdates(true)           // late result updates enabled  .lastResult(+15.minutes)     // last result 15 mins late -> until state is purged
  29. More TableSource and Sinks Generated intermediate data types, serializers, comparators Standalone SQL client Code generated aggregate functions