SQL is the lingua franca for querying and processing data. To this day, it provides non-programmers with a powerful tool for analyzing and manipulating data. But with the emergence of stream processing as a core technology for data infrastructures, can you still use SQL and bring real-time data analysis to a broader audience?
The answer is yes, you can. SQL fits into the streaming world very well and forms an intuitive and powerful abstraction for streaming analytics. More importantly, you can use SQL as an abstraction to unify batch and streaming data processing. Viewing streams as dynamic tables, you can obtain consistent results from SQL evaluated over static tables and streams alike and use SQL to build materialized views as a data integration tool.
Fabian Hueske and Shuyi Chen explore SQL’s role in the world of streaming data and its implementation in Apache Flink and cover fundamental concepts, such as streaming semantics, event time, and incremental results. They also share their experience using Flink SQL in production at Uber, explaining how Uber leverages Flink SQL to solve its unique business challenges and how the unified stream and batch processing platform enables both technical and nontechnical users to process real-time and batch data reliably using the same SQL at Uber scale.
Streaming SQL to unify batch and stream processing: Theory and practice with Apache Flink at Uber
1. Streaming SQL to Unify Batch and Stream Processing: Theory and Practice with Apache Flink at Uber
Strata Data Conference, San Jose, March 7th, 2018
Fabian Hueske, Shuyi Chen
2. What is Apache Flink?
Stateful computations over data streams:
• Batch processing: process static and historic data
• Data stream processing: real-time results from data streams
• Event-driven applications: data-driven actions and services
3. What is Apache Flink?
(Diagram: historic data, streams, and applications, drawn from databases, streams, and file/object storage, feed stateful computations over streams; results serve queries, applications, devices, etc.)
Stateful computations over streams, real-time and historic: fast, scalable, fault-tolerant, in-memory, event time, large state, exactly-once
4. Hardened at scale
• Streaming platform service: billions of messages per day, a lot of Stream SQL
• Streaming platform as a service: 3,700+ containers running Flink, 1,400+ nodes, 22k+ cores, 100s of jobs, fraud detection
• Streaming analytics platform: 100s of jobs, 1,000s of nodes, TBs of state; metrics, analytics, real-time ML; streaming SQL as a platform
5. Powerful Abstractions
Layered abstractions to navigate simple to complex use cases:
• SQL / Table API (dynamic tables): high-level analytics API
• DataStream API (streams, windows): stream & batch data processing
• Process Function (events, state, time): stateful event-driven applications

val stats = stream
  .keyBy("sensor")
  .timeWindow(Time.seconds(5))
  .reduce((a, b) => a.add(b))

def processElement(event: MyEvent, ctx: Context, out: Collector[Result]) = {
  // work with event and state
  (event, state.value) match { … }
  out.collect(…)   // emit events
  state.update(…)  // modify state
  // schedule a timer callback
  ctx.timerService.registerEventTimeTimer(event.timestamp + 500)
}
6. Apache Flink’s Relational APIs
Unified APIs for batch & streaming data: a query specifies exactly the same result regardless of whether its input is static batch data or streaming data.

LINQ-style Table API:
tableEnvironment
  .scan("clicks")
  .groupBy('user)
  .select('user, 'url.count as 'cnt)

ANSI SQL:
SELECT user, COUNT(url) AS cnt
FROM clicks
GROUP BY user
8. What if “clicks” is a file?
Input data is read at once; the result is produced at once.

SELECT user, COUNT(url) AS cnt
FROM clicks
GROUP BY user

Clicks:
user | cTime    | url
Mary | 12:00:00 | https://…
Bob  | 12:00:00 | https://…
Mary | 12:00:02 | https://…
Liz  | 12:00:03 | https://…

Result:
user | cnt
Mary | 2
Bob  | 1
Liz  | 1
9. What if “clicks” is a stream?
Input data is continuously read; the result is continuously produced.

SELECT user, COUNT(url) AS cnt
FROM clicks
GROUP BY user

Clicks:
user | cTime    | url
Mary | 12:00:00 | https://…
Bob  | 12:00:00 | https://…
Mary | 12:00:02 | https://…
Liz  | 12:00:03 | https://…

Result (updated as rows arrive):
user | cnt
Mary | 1 → 2
Bob  | 1
Liz  | 1

The result is identical!
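This batch/stream equivalence can be sketched in plain Python (an illustration of the concept, not Flink code; the data mirrors the clicks table on the slides):

```python
# The same GROUP BY count, evaluated once over a static table and
# incrementally over a stream, yields the identical final result.
from collections import Counter

clicks = [
    ("Mary", "12:00:00", "https://a"),
    ("Bob",  "12:00:00", "https://b"),
    ("Mary", "12:00:02", "https://c"),
    ("Liz",  "12:00:03", "https://d"),
]

# Batch: read all input at once, produce the result at once.
batch_result = Counter(user for user, _, _ in clicks)

# Streaming: read one row at a time, updating the result per row.
stream_result = Counter()
for user, _, _ in clicks:
    stream_result[user] += 1   # incremental update
    # each iteration could emit the updated (user, cnt) pair downstream

assert batch_result == stream_result  # the result is identical
print(dict(stream_result))            # {'Mary': 2, 'Bob': 1, 'Liz': 1}
```

The streaming variant additionally exposes every intermediate result, which is exactly what the continuously updated table on the slide shows.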
10. Why is stream-batch unification important?
Usability
• ANSI SQL syntax: No custom “StreamSQL” syntax.
• ANSI SQL semantics: No stream-specific results.
Portability
• Run the same query on bounded and unbounded data
• Run the same query on recorded and real-time data
Do we need to soften SQL semantics for streaming?
(Diagram: a timeline from past to future with a marker at “now”. A bounded query evaluates a fixed slice of the stream; an unbounded query runs from a starting point, possibly the start of the stream, onward past “now”.)
11. DBMSs Run Queries on Streams
Materialized views (MVs) are similar to regular views, but are persisted to disk or memory.
• Used to speed up analytical queries
• MVs need to be updated when the base tables change
MV maintenance is very similar to SQL on streams:
• Base table updates are a stream of DML statements
• The MV definition query is evaluated on that stream
• The MV is the query result and is continuously updated
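The analogy between MV maintenance and SQL on streams can be made concrete with a small Python sketch (illustrative only; the view query is a hypothetical SELECT user, COUNT(*) ... GROUP BY user):

```python
# Maintain a materialized view by applying the stream of DML statements
# that modify the base table, instead of re-running the query.
from collections import Counter

mv = Counter()  # materialized view: user -> cnt

def apply_dml(op, user):
    """Update the view for one base-table change from the DML stream."""
    if op == "INSERT":
        mv[user] += 1
    elif op == "DELETE":
        mv[user] -= 1
        if mv[user] == 0:
            del mv[user]   # no rows left for this group

for op, user in [("INSERT", "Mary"), ("INSERT", "Bob"),
                 ("INSERT", "Mary"), ("DELETE", "Bob")]:
    apply_dml(op, user)

print(dict(mv))  # {'Mary': 2}
```

Replacing the base table with a stream and the MV with a continuously updated result table gives exactly the continuous-query model of the next slide.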
12. Continuous Queries in Flink
The core concept is the “dynamic table”:
• Dynamic tables change over time
Queries on dynamic tables:
• produce new dynamic tables (which are updated based on their input)
• do not terminate
Stream ↔ dynamic table conversions
13. Stream ↔ Dynamic Table Conversions
Append conversions:
• Records are only inserted/appended
Upsert conversions:
• Records are inserted/updated/deleted and have a (composite) unique key
Changelog conversions:
• Records are inserted/updated/deleted
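The upsert and changelog conversions can be sketched in plain Python (keys, records, and the +I/-U/+U/-D operation tags are illustrative; this is not Flink's encoding):

```python
# Two ways a stream can encode the same evolving table.

# Upsert stream: records keyed by a unique key; None as value deletes.
def apply_upserts(events):
    table = {}
    for key, value in events:
        if value is None:
            table.pop(key, None)   # delete by key
        else:
            table[key] = value     # insert or update
    return table

# Changelog stream: explicit add/retract operations, no key required.
def apply_changelog(events):
    table = set()
    for op, record in events:
        if op in ("+I", "+U"):
            table.add(record)      # insert / new version
        else:                      # "-U" or "-D": retract the old record
            table.discard(record)
    return table

upserts = [("Mary", 1), ("Bob", 1), ("Mary", 2), ("Bob", None)]
changelog = [("+I", ("Mary", 1)), ("+I", ("Bob", 1)),
             ("-U", ("Mary", 1)), ("+U", ("Mary", 2)), ("-D", ("Bob", 1))]

# Both streams materialize to the same table state.
assert apply_upserts(upserts) == dict(apply_changelog(changelog))
```

The upsert encoding is more compact but needs a unique key; the changelog encoding retracts old versions explicitly and therefore works without one.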
14. SQL Feature Set in Flink 1.5.0
SELECT FROM WHERE
GROUP BY / HAVING
• Non-windowed, TUMBLE, HOP, SESSION windows
JOIN
• Windowed INNER, LEFT / RIGHT / FULL OUTER JOIN
• Non-windowed INNER JOIN
Scalar, aggregate, and table-valued UDFs
SQL CLI Client (beta)
[streaming only] OVER / WINDOW
• UNBOUNDED / BOUNDED PRECEDING
[batch only] UNION / INTERSECT / EXCEPT / IN / ORDER BY
15. What can I build with this?
Data Pipelines
• Transform, aggregate, and move events in real-time
Low-latency ETL
• Convert and write streams to file systems, DBMS, K-V stores, indexes, …
• Convert appearing files into streams
Stream & Batch Analytics
• Run analytical queries over bounded and unbounded data
• Query and compare historic and real-time data
Data Preparation for Live Dashboards
• Compute and update data to visualize in real time
16. The New York Taxi Rides Data Set
The New York City Taxi & Limousine Commission provides a public data
set about taxi rides in New York City
We can derive a streaming table from the data
Table: TaxiRides
rideId: BIGINT // ID of the taxi ride
isStart: BOOLEAN // flag for pick-up (true) or drop-off (false) event
lon: DOUBLE // longitude of pick-up or drop-off location
lat: DOUBLE // latitude of pick-up or drop-off location
rowtime: TIMESTAMP // time of pick-up or drop-off event
17. Identify popular pick-up / drop-off locations
Compute every 5 minutes, for each location, the number of departing and arriving taxis of the last 15 minutes.

SELECT cell,
       isStart,
       HOP_END(rowtime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE) AS hopEnd,
       COUNT(*) AS cnt
FROM (SELECT rowtime, isStart, toCellId(lon, lat) AS cell
      FROM TaxiRides
      WHERE isInNYC(lon, lat))
GROUP BY cell,
         isStart,
         HOP(rowtime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE)
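The window assignment behind HOP can be sketched in plain Python (a simplified illustration with slide-aligned boundaries, not Flink's implementation; timestamps are seconds since midnight):

```python
# With a 5-minute slide and a 15-minute size, every event falls into
# size/slide = 3 overlapping hopping windows.
SLIDE = 5 * 60    # 5 minutes
SIZE = 15 * 60    # 15 minutes

def hop_windows(ts):
    """Return (start, end) of every hopping window containing timestamp ts."""
    last_start = ts - ts % SLIDE                 # latest window start <= ts
    return [(start, start + SIZE)
            for start in range(last_start, last_start - SIZE, -SLIDE)]

# An event at 12:07 belongs to windows [11:55,12:10), [12:00,12:15), [12:05,12:20).
windows = hop_windows(12 * 3600 + 7 * 60)
print(len(windows))  # 3
```

HOP_END in the query above is simply the end timestamp of each such window, so each 15-minute count is re-emitted every 5 minutes.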
18. Average ride duration per pick-up location
Join start and end ride events on rideId and compute average ride duration per pick-up location.

SELECT pickUpCell,
       AVG(TIMESTAMPDIFF(MINUTE, s.rowtime, e.rowtime)) AS avgDuration
FROM (SELECT rideId, rowtime, toCellId(lon, lat) AS pickUpCell
      FROM TaxiRides
      WHERE isStart) s
JOIN
     (SELECT rideId, rowtime
      FROM TaxiRides
      WHERE NOT isStart) e
ON s.rideId = e.rideId AND
   e.rowtime BETWEEN s.rowtime AND s.rowtime + INTERVAL '1' HOUR
GROUP BY pickUpCell
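The semantics of this time-windowed join can be sketched in plain Python (a naive nested-loop illustration, not Flink's state-backed implementation; ride IDs, timestamps, and cell names are made up):

```python
# Join start and end events on rideId, keep only pairs where the end event
# falls within one hour of the start, and average duration per pick-up cell.
from collections import defaultdict

HOUR = 3600
starts = [(1, 0,   "cellA"), (2, 60,  "cellA"), (3, 120, "cellB")]  # (rideId, ts, cell)
ends   = [(1, 600), (2, 1260), (3, 120 + 2 * HOUR)]                 # (rideId, ts)

durations = defaultdict(list)
for ride_id, s_ts, cell in starts:
    for e_id, e_ts in ends:
        # ON s.rideId = e.rideId
        # AND e.rowtime BETWEEN s.rowtime AND s.rowtime + INTERVAL '1' HOUR
        if e_id == ride_id and s_ts <= e_ts <= s_ts + HOUR:
            durations[cell].append((e_ts - s_ts) / 60)  # minutes

avg = {cell: sum(d) / len(d) for cell, d in durations.items()}
print(avg)  # ride 3 exceeds the 1-hour bound and is dropped
```

The time bound is what makes the join feasible on unbounded streams: once an hour has passed, a start event can never join again and its state can be discarded.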
19. Building a Dashboard
(Diagram: Kafka → Flink SQL query → Elasticsearch → dashboard)

SELECT cell,
       isStart,
       HOP_END(rowtime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE) AS hopEnd,
       COUNT(*) AS cnt
FROM (SELECT rowtime, isStart, toCellId(lon, lat) AS cell
      FROM TaxiRides
      WHERE isInNYC(lon, lat))
GROUP BY cell,
         isStart,
         HOP(rowtime, INTERVAL '5' MINUTE, INTERVAL '15' MINUTE)
31. UI & Backend services
To make it self-service:
• SQL composition & validation
• Connector management
32. UI & Backend services
To make it self-service:
• Job compilation and generation
• Resource estimation
(Diagram: analyze the input (Kafka input rate, Hive metastore data) and the query (SELECT * FROM ...), then run a test deployment to estimate YARN containers, CPU, and heap memory.)
33. UI & Backend services
To make it self-service:
• Job deployment
(Deployment pipeline; jobs are promoted from stage to stage:)
• Sandbox: functional correctness, play around with SQL
• Staging: system-generated resource estimate, production-like load
• Production: managed
34. UI & Backend services
To make it self-service:
• Job management
35. UI & Backend services
To support Uber scale:
• Monitoring and alert automation
• Auto-scaling
• Job recovery
• DC failover
(Diagram: Watchdog service)
40. AthenaX wins
• SQL abstraction with Flink: enables non-engineers to use stream processing
• End-to-end self-service: a job goes from idea to production in minutes/hours
• Centralized place to track streaming dataflows
• Minimal human intervention; scales operationally to ~1,000 jobs in production