We present FLOW, a web service that lets users do FLink On Web. Similar in spirit to Hortonworks Streaming Analytics Manager, StreamAnalytix, and Nussknacker, FLOW aims to minimize the effort of handwriting streaming applications by letting users drag and drop graphical icons that represent streaming operators in a GUI.
FLOW builds on the Flink Table API and lets users assemble graphical icons associated not only with basic SQL operations but also with advanced ones such as window aggregation, temporal join, and pattern recognition (the MATCH_RECOGNIZE clause). Its data preview function lets users observe on screen how sample data changes before and after each operation is applied. In addition, FLOW shows the sample data as time-series charts and geographical maps by interacting with Elasticsearch and Kibana. As a result, domain experts with basic knowledge of SQL can easily design their streaming applications in a GUI without understanding the Flink DataStream API or the Flink CEP library.
In this talk, we first present what motivated the development of FLOW, then show how FLOW can be used to solve the "Popular Places" exercise in its own style, and lastly explain how FLOW leverages the Flink Table API.
4. How we used to work
Talk | Domain experts | Target systems | Requirement
FlinkForward 2017, "Predictive Maintenance with Flink" | Refinery engineers | Expensive refinery equipment | Generating alarms ASAP (in real time)
FlinkForward 2018, "Real-time driving score service using Flink" | Mobility service planners | Driving score service | Generating scores ASAP (in real time)
5. Needs for real-time stream processing
Domain experts (refinery engineers, mobility service planners) ask: how to process real-time stream data?
Apache Flink offers three API layers:
SQL/Table API: dynamic tables
DataStream API: streams, windows
Process Function: events, state, time
6. Nonnegligible distance between domain experts and Flink
On one side: domain experts (refinery engineers, mobility service planners).
On the other side: Apache Flink, with its SQL/Table API (dynamic tables), DataStream API (streams, windows), and Process Function (events, state, time).
In between: JVM languages, IDE, project management, data visualization.
Figure: the gap is illustrated with two distances, 54.6 million kilometers and 384,400 kilometers.
8. Data engineers bridge the gap
Data engineers stand between the domain experts (refinery engineers, mobility service planners) and Apache Flink (SQL/Table API, DataStream API, Process Function), and take care of JVM languages, IDE, project management, and data visualization.
Exchanged between domain experts and data engineers in each project: data, domain knowledge, and requirements; training & transfer; maintenance.
Very inefficient!
9. Let domain experts do Flink directly via FLOW
Apache Flink: SQL/Table API (dynamic tables), DataStream API (streams, windows), Process Function (events, state, time).
Also involved: JVM languages, IDE, project management, data visualization.
Various domains of SK: Mobility, IoT, Telco, Energy, Semiconductor, E-commerce, Media.
10. FLOW
Let domain experts do Flink directly via FLOW.
FLOW is an abstraction layer that hides the details of Flink application development (the Flink API layers, JVM languages, IDE, project management, data visualization).
It is a graphical user interface to build stream processing pipelines of SQL operations and connectors.
Domain experts from the various domains of SK (Mobility, IoT, Telco, Energy, Semiconductor, E-commerce, Media) bring their data, domain knowledge, and requirements, and work with SQL.
11. Table of contents
Do Flink on Web with FLOW
1. Motivation
2. Demo – identifying popular places with FLOW
3. Architecture
4. Supported operations & connectors
5. Summary
13. Demo Scenario: Identifying popular places in NYC
Input: a stream of events from Kafka.
Removing events that do not start or end in New York City.
14. Demo Scenario: Identifying popular places in NYC
Input: a stream of events from Kafka.
Mapping the coordinates of each record into a grid cell (100 m x 100 m).
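The grid-mapping step above can be sketched in plain Java. This is only an illustration under stated assumptions: the class name `GridCells`, the bounding box, and the cell size in degrees (on the order of 100 m) are hypothetical values chosen for the example, not the ones used by FLOW or by the `GeoUtils` functions shown later in the talk.

```java
/** Hypothetical helper (not part of FLOW): maps coordinates to grid cells. */
final class GridCells {
    // Assumed bounding box of the NYC area (illustrative values).
    static final double LON_WEST = -74.05;
    static final double LON_EAST = -73.70;
    static final double LAT_NORTH = 41.0;
    static final double LAT_SOUTH = 40.5;

    // Assumed cell size in degrees, on the order of 100 m at this latitude.
    static final double DELTA_LON = 0.0014;
    static final double DELTA_LAT = 0.00125;

    // Number of cells per row: (LON_EAST - LON_WEST) / DELTA_LON.
    static final int CELLS_PER_ROW = 250;

    /** True if the point lies inside the assumed bounding box. */
    static boolean isInNYC(double lon, double lat) {
        return lon >= LON_WEST && lon <= LON_EAST
            && lat >= LAT_SOUTH && lat <= LAT_NORTH;
    }

    /** Cell ids grow west to east, then north to south, starting at 0. */
    static int toCellId(double lon, double lat) {
        int x = (int) Math.floor((lon - LON_WEST) / DELTA_LON);
        int y = (int) Math.floor((LAT_NORTH - lat) / DELTA_LAT);
        return x + y * CELLS_PER_ROW;
    }
}
```

For instance, `GridCells.toCellId(-74.05, 41.0)` addresses the north-west corner cell, and every point inside the same cell yields the same id, which is what makes the later per-cell counting possible.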
15. Demo Scenario: Identifying popular places in NYC
Input: a stream of events from Kafka.
Creating a sliding window of size 10 minutes that slides by 5 minutes.
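To make the sliding-window step concrete, the sketch below computes which windows a single event falls into. It is a plain-Java illustration, not Flink code: the class `SlidingWindows` and its method are hypothetical, and windows are assumed to be aligned to multiples of the slide and to include their start but exclude their end.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of FLOW or Flink): sliding-window assignment. */
final class SlidingWindows {
    /**
     * Returns the start times of all windows [start, start + size) that
     * contain the timestamp, latest window first.
     */
    static List<Long> windowStarts(long timestampMs, long sizeMs, long slideMs) {
        List<Long> starts = new ArrayList<>();
        // Start of the latest window that begins at or before the timestamp.
        long lastStart = timestampMs - Math.floorMod(timestampMs, slideMs);
        // Walk backwards while the window still covers the timestamp.
        for (long start = lastStart; start > timestampMs - sizeMs; start -= slideMs) {
            starts.add(start);
        }
        return starts;
    }
}
```

With a size of 10 minutes and a slide of 5 minutes, every event belongs to exactly two windows; an event at minute 12, for example, falls into the windows starting at minute 10 and minute 5.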
16. Demo Scenario: Identifying popular places in NYC
Input: a stream of events from Kafka.
Counting the number of events in each grid cell (100 m x 100 m) per window (example counts on the slide: 1, 2, 10).
17. Demo Scenario: Identifying popular places in NYC
Input: a stream of events from Kafka.
Identifying the cells whose count is 10 or more as popular places (example counts on the slide: 1, 2, 10).
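The last two steps of the scenario (counting events per grid cell and window, then keeping cells with a count of 10 or more) can be sketched as follows. This is an in-memory illustration only: the class `PopularPlaces`, its method names, and the representation of events as parallel arrays are assumptions made for the example, whereas FLOW itself expresses these steps as SQL operations on the Flink Table API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory version of the count and threshold steps. */
final class PopularPlaces {
    static final long SIZE_MS = 10 * 60_000L;   // window size: 10 minutes
    static final long SLIDE_MS = 5 * 60_000L;   // window slide: 5 minutes
    static final int THRESHOLD = 10;            // popular if count >= 10

    /** Counts events per window start and cell id. */
    static Map<Long, Map<Integer, Integer>> countPerWindow(int[] cellIds, long[] timestampsMs) {
        Map<Long, Map<Integer, Integer>> counts = new TreeMap<>();
        for (int i = 0; i < cellIds.length; i++) {
            long ts = timestampsMs[i];
            long lastStart = ts - Math.floorMod(ts, SLIDE_MS);
            // Each event contributes to every sliding window that contains it.
            for (long start = lastStart; start > ts - SIZE_MS; start -= SLIDE_MS) {
                counts.computeIfAbsent(start, k -> new TreeMap<>())
                      .merge(cellIds[i], 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Keeps, per window, only the cells whose count reaches the threshold. */
    static Map<Long, List<Integer>> popularCells(Map<Long, Map<Integer, Integer>> counts) {
        Map<Long, List<Integer>> popular = new TreeMap<>();
        counts.forEach((start, perCell) -> perCell.forEach((cell, n) -> {
            if (n >= THRESHOLD) {
                popular.computeIfAbsent(start, k -> new ArrayList<>()).add(cell);
            }
        }));
        return popular;
    }
}
```

Calling `popularCells(countPerWindow(cellIds, timestampsMs))` returns, for each window start, the ids of the cells that qualify as popular places in that window.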
58. Table of contents
Do Flink on Web with FLOW
1. Motivation
2. Demo
3. Architecture – how FLOW interacts with Flink
4. Supported operations & connectors
5. Summary
59. Overall architecture of FLOW
Components: frontend interface (by Angular.js) and RESTful web server (by Spring).
The frontend sends the spec of a node (here, a Kafka source) to the server.
The node's schema and preview are not computed yet; they are expected to be returned as the response.
60. Overall architecture of FLOW
Components: frontend interface (by Angular.js), RESTful web server (by Spring), and the preview loaders behind the server.
KafkaPreviewLoader: IN: the spec of a Kafka source. It uses a Kafka consumer. OUT: schema and preview.
FlinkPreviewLoader: IN: the parent's schema and preview, and the child's spec. It runs on a local Flink minicluster. OUT: schema and preview.
For the Kafka source node, the spec is passed to KafkaPreviewLoader, and the resulting schema and preview are returned to the frontend.
61. Overall architecture of FLOW
Components: frontend interface (by Angular.js), RESTful web server (by Spring), and the preview loaders (KafkaPreviewLoader, FlinkPreviewLoader) as on the previous slide.
The Kafka source node already has its spec, schema, and preview.
A Filter node is added after it; its schema and preview are not computed yet.
For the Filter node, the frontend sends not only the spec but also the parent's schema and preview.
FlinkPreviewLoader: IN: the parent's schema and preview, and the child's spec. It runs on a local Flink minicluster. OUT: schema and preview.
62. FlinkPreviewLoader: FLOW piggybacks on Flink for schema & preview computation
(Image credit: www.halloweencostumes.com/adult-piggyback-ride-on-costume.html)
IN: the parent's (Kafka source) schema and preview, and the child's (Filter) spec.
OUT: the child's schema and preview, computed on a local Flink mini cluster.

// LocalStreamEnvironment (env) and StreamTableEnvironment (tEnv)
val env = StreamExecutionEnvironment.createLocalStreamEnvironment()
val tEnv = StreamTableEnvironment.create(env)

// register the parent's schema and preview as a table (arguments elided on the slide)
val parentTable = tEnv.registerTable( )

// register all known UDF instances
tEnv.registerFunction("isInNYC", new GeoUtils.IsInNYC())
tEnv.registerFunction("toCellId", new GeoUtils.ToCellId())
tEnv.registerFunction("toCoords", new GeoUtils.ToCoords())

// apply the child's spec (here, a Filter)
val table = parentTable.filter("isInNYC(startLon, startLat) && isInNYC(endLon, endLat)")

// to get the result in this thread
tEnv.toAppendStream(table, classOf[Row])
    .addSink(new CollectSink())
env.execute()

// return the preview and the schema
return (CollectSink.values, table.getSchema())

Tables known to tEnv: parentTable, table
Functions known to tEnv:
isInNYC: (float, float) → boolean
toCellId: (float, float) → int
toCoords: int → (float, float)
63. Overall architecture of FLOW
Components: frontend interface (by Angular.js), RESTful web server (by Spring), and the preview loaders (KafkaPreviewLoader, FlinkPreviewLoader).
FlinkPreviewLoader returns the Filter's schema and preview, and the server sends them back to the frontend.
Both the Kafka source node and the Filter node now have a spec, a schema, and a preview.
64. Overall architecture of FLOW
Components: frontend interface (by Angular.js), RESTful web server (by Spring), and the preview loaders (KafkaPreviewLoader, FlinkPreviewLoader).
Pipeline in the frontend: Kafka source, Filter, Select, Window, SQL query. Each node has a spec, a schema, and a preview.
The same cycle repeats for every node added to the pipeline: the frontend sends the parent's schema and preview together with the child's spec, and the server returns the child's schema and preview.
KafkaPreviewLoader: IN: the spec of a Kafka source. It uses a Kafka consumer. OUT: schema and preview.
FlinkPreviewLoader: IN: the parent's schema and preview, and the child's spec. It runs on a local Flink minicluster. OUT: schema and preview.
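The parent-to-child preview computation described on this slide can be pictured with a small in-memory sketch. It is an analogy under stated assumptions, not FLOW's implementation: the class `PreviewChain` is hypothetical, each node's spec is reduced to a plain function over preview rows, and no Flink minicluster or schema handling is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical illustration of chaining previews from parent to child. */
final class PreviewChain {
    /**
     * Applies each node's operation to its parent's preview and keeps every
     * intermediate preview, starting with the source preview itself.
     */
    static <R> List<List<R>> previews(List<R> sourcePreview,
                                      List<UnaryOperator<List<R>>> nodes) {
        List<List<R>> result = new ArrayList<>();
        List<R> current = sourcePreview;
        result.add(current);
        for (UnaryOperator<List<R>> node : nodes) {
            // A child's preview depends only on its parent's preview and its own spec.
            current = node.apply(current);
            result.add(current);
        }
        return result;
    }
}
```

The point of the design is that no node needs the whole pipeline to produce its preview: the parent's sample rows plus the node's own spec are enough, which is why the frontend can request previews one node at a time.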
65. Overall architecture of FLOW
Components: frontend interface (by Angular.js), RESTful web server (by Spring), and the Project Generator (by javapoet/freemarker).
Pipeline in the frontend: Kafka source, Filter, Select, Window, SQL query, ES Sink. Each node has a spec, a schema, and a preview.
The Project Generator receives the schema and spec of every node (Kafka source, Filter, Select, Window, SQL query, ES Sink) and generates a Maven project for the Flink application.
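As a rough picture of what generating application code from node specs involves, the sketch below turns a list of specs into Table API statements. It is a deliberately simplified stand-in: the slide states that the Project Generator is built with javapoet/freemarker, whereas this example uses a plain `StringBuilder`; the class `PipelineCodeGen`, the `Step` type, and the variable naming `t0`, `t1`, ... are hypothetical, and only string-expression `filter` and `select` calls in the style of the slide code are emitted.

```java
import java.util.List;

/** Hypothetical, simplified code generator (not FLOW's Project Generator). */
final class PipelineCodeGen {
    /** Assumed node spec: an operation name and its expression. */
    static final class Step {
        final String op;
        final String expr;

        Step(String op, String expr) {
            this.op = op;
            this.expr = expr;
        }
    }

    /**
     * Emits one statement per step; table t0 is assumed to exist already.
     * Expressions are assumed to contain no double quotes.
     */
    static String generate(List<Step> steps) {
        StringBuilder src = new StringBuilder();
        for (int i = 0; i < steps.size(); i++) {
            Step s = steps.get(i);
            if (!s.op.equals("filter") && !s.op.equals("select")) {
                throw new IllegalArgumentException("unsupported operation: " + s.op);
            }
            src.append("Table t").append(i + 1)
               .append(" = t").append(i)
               .append('.').append(s.op)
               .append("(\"").append(s.expr).append("\");\n");
        }
        return src.toString();
    }
}
```

A template engine or a source-code builder library makes the same idea more robust for a full Maven project, since it handles imports, escaping, and file layout that this string-based sketch ignores.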
66. Table of contents
Do Flink on Web with FLOW
1. Motivation
2. Demo
3. Architecture
4. Supported operations & connectors
5. Summary
We also support the temporal join operation.
First you configure the temporal table,
then the append-only table,
and finally you list the expressions as you did in the select operation.