SQL is the lingua franca of data processing and everybody working with data knows SQL. Apache Flink provides SQL support for querying and processing batch and streaming data. Flink’s SQL support powers large-scale production systems at Alibaba, Huawei, and Uber. Based on Flink SQL, these companies have built systems for their internal users as well as publicly offered services for paying customers. In our talk, we will discuss why you should and how you can (not being Alibaba or Uber) leverage the simplicity and power of SQL on Flink. We will start exploring the use cases that Flink SQL was designed for and present real-world problems that it can solve. In particular, you will learn why unified batch and stream processing is important and what it means to run SQL queries on streams of data. After we explored why you should use Flink SQL, we will show how you can leverage its full potential. Since recently, the Flink community is working on a service that integrates a query interface, (external) table catalogs, and result serving functionality for static, appending, and updating result sets. We will discuss the design and feature set of this query service and how it can be used for exploratory batch and streaming queries, ETL pipelines, and live updating query results that serve applications, such as real-time dashboards. The talk concludes with a brief demo of a client running queries against the service.
29. THANK YOU!
@fhueske & @twalthr
@dataArtisans
@ApacheFlink
WE ARE HIRING
data-artisans.com/careers
Notes de l'éditeur
• data Artisans was founded by the original creators of Apache Flink
• We provide dA Platform, a complete stream processing infrastructure with open-source Apache Flink
• Also included is the Application Manager, which turns dA Platform into a self-service platform for stateful stream processing applications.
• dA Platform is generally available, and you can download a free trial today!
• These companies are among many users of Apache Flink, and during this conference you’ll meet folks from some of these companies as well as others using Flink.
• If your company would like to be represented on the “Powered by Apache Flink” page, email me.
Flink offers APIs for different levels of abstraction that all can be mixed and matched.
On the lowest level, ProcessFunctions give precise control about state and time, i.e., when to process data.
The intermediate level, the so-called DataStream API provides higher-level primitives such as for window processing
Finally, on the top level the relational API, SQL and the Table API, are centered around the concept of dynamic tables. This is what I’m going to talk about next.
(Keep this slide up during the Q&A part of your talk. Having this up in the final 5-10 minutes of the session gives the audience something useful to look at.)