KSQL---Streaming SQL for Apache Kafka

This talk is about KSQL, an open source streaming SQL engine for Apache Kafka. KSQL aims to make stream processing available to everybody without the need to write Java or Scala code. Streaming SQL makes it easy to get started with a wide range of stream processing applications such as real-time ETL, sessionization, monitoring and alerting, or fraud detection. We will give a general introduction to KSQL covering its SQL dialect, core concepts, and architecture, including some technical deep dives into how it works under the hood.

Published in: Software

  1. KSQL – Streaming SQL for Apache Kafka
     Matthias J. Sax | Software Engineer
     matthias@confluent.io
     @MatthiasJSax
  2. A Brief History of Kafka and Confluent
     [Timeline figure, axis 2012 to 2018]
     • 0.7: Cluster mirroring
     • 0.8: Intra-cluster replication
     • 0.9: Data integration (Connect API)
     • 0.10: Data processing (Streams API)
     • 0.11: Exactly-once semantics
     • 1.0: Enterprise Ready
     • CP 4.1: KSQL GA
  3. Why KSQL?
     • Enable Stream Processing for Non-Engineers
     • Everybody knows SQL
     • Look Ma, no code!
     • Declarative stream processing language
     • Fast prototyping
     • Streaming SQL engine for Apache Kafka
     • Streaming ETL
     • Ad-hoc topic inspection
  4. Trade-Offs
     • Consumer, Producer: subscribe(), poll(), send(), flush()
     • Kafka Streams: filter(), join(), aggregate()
     • KSQL: Select…from…, Join…where…, Group by..
     Axis labels: Flexibility, Simplicity
  5. Core Concepts KSQL
  6. Creating a Stream
         CREATE STREAM clickstream (
           time BIGINT,
           url VARCHAR,
           status INTEGER,
           bytes INTEGER,
           user_id VARCHAR,
           agent VARCHAR)
         WITH (
           value_format = 'JSON',
           kafka_topic = 'my_clickstream_topic'
         );
  7. Querying Streams
         CREATE STREAM user_clicks AS
           SELECT *
           FROM clickstream
           WHERE user_id = 'mjsax';
  8. Windowed Aggregations
         CREATE TABLE clicks AS
           SELECT user_id, COUNT(url)
           FROM clickstream
           WINDOW TUMBLING (size 30 seconds)
           WHERE bytes > 1024
           GROUP BY user_id
           HAVING COUNT(url) > 20;
  9. Windowed Aggregations
  10. Core Concepts KSQL
  11. Do you think that’s a table you are querying?
  12. Creating a Table
          CREATE TABLE users (
            user_id INTEGER,
            registered_at LONG,
            username VARCHAR,
            name VARCHAR,
            city VARCHAR,
            level VARCHAR)
          WITH (
            key = 'user_id',
            kafka_topic = 'clickstream_users',
            value_format = 'AVRO');
  13. Creating a Table
      Confluent Schema Registry FTW!
          CREATE TABLE users
          WITH (
            key = 'user_id',
            kafka_topic = 'clickstream_users',
            value_format = 'AVRO');
  14. Using Tables
  15. Joins for Enrichment
          CREATE STREAM vip_actions AS
            SELECT c.user_id, fullname, url, status
            FROM clickstream c
            LEFT JOIN users u ON c.user_id = u.user_id
            WHERE u.level = 'Platinum';
  16. Joins for Enrichment
  17. Stream-Table-Duality
  18. How to deploy and use KSQL
  19. How to run KSQL: #1 Client-server
      [Diagram: KSQL CLI; three KSQL Servers, each in its own JVM; Kafka Cluster]
  20. How to run KSQL: #1 Client-server
      • Start any number of server nodes
            bin/ksql-server-start
      • Start one or more CLIs and point them to a server
            bin/ksql https://myksqlserver:8090
      • All servers share the processing load
        Technically, instances of the same Kafka Streams Applications
        Scale up/down without restart
  21. How to run KSQL: #2 As a standalone application
      [Diagram: three KSQL Servers, each in its own JVM; Kafka Cluster]
  22. How to run KSQL: #2 As a standalone application
      • Start any number of server nodes
        Pass a file of KSQL statements to execute
            bin/ksql-node query-file=foo/bar.sql
      • Ideal for streaming ETL application deployment
        Version-control your queries and transformations as code
      • All running engines share the processing load
        Technically, instances of the same Kafka Streams Applications
        Scale up/down without restart
  23. How to run KSQL: #3 Embedded in an application
      [Diagram: three JVM App Instances, each containing a KSQL Engine and Application Code; Kafka Cluster]
  24. How to run KSQL: #3 Embedded in an application
      • Embed directly in your Java application
      • Generate and execute KSQL queries through the Java API
        Version-control your queries and transformations as code
      • All running application instances share the processing load
        Technically, instances of the same Kafka Streams Applications
        Scale up/down without restart
  25. Internals
  26. Internals
      Read input from Kafka
      Operator DAG:
      • filter/map/aggregation/joins
      • Operators can be stateful
      Write result back to Kafka
  27. Internals
  28. Runtime: Kafka Streams
  29. Distributed State
  30. Scaling
  31. Take home
      • Streaming SQL engine for Apache Kafka
      • Leverages Kafka Streams
      • Open Source: Apache 2.0
      • https://github.com/confluentinc/ksql/
      • Included in Confluent Platform 4.1
      • https://www.confluent.io/product/ksql/
      • https://docs.confluent.io/current/ksql/docs/index.html
  32. Thank You
      We are hiring!
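
The windowed aggregation on slide 8 can be easier to follow as plain code. The sketch below is not KSQL and not how the engine is implemented; it is a small batch simulation in Python of what the query computes. It assumes `time` is an epoch timestamp in milliseconds, and the function name `windowed_click_counts` and its parameters are hypothetical. A real KSQL query updates its result continuously as records arrive, whereas this version processes a finished list.

```python
from collections import defaultdict

WINDOW_SIZE_MS = 30_000  # 30-second tumbling windows, as on slide 8


def windowed_click_counts(events, min_bytes=1024, min_count=20):
    """Count clicks per (user_id, window_start) over tumbling windows.

    `events` is an iterable of dicts with the keys `time` (assumed to be
    epoch milliseconds), `user_id`, `url`, and `bytes`.
    Returns {(user_id, window_start_ms): count} for groups whose count
    is greater than `min_count`.
    """
    counts = defaultdict(int)
    for event in events:
        # WHERE bytes > 1024: rows are filtered before grouping.
        if event["bytes"] <= min_bytes:
            continue
        # COUNT(url) only counts rows whose url is not NULL.
        if event["url"] is None:
            continue
        # Tumbling windows have a fixed size and do not overlap, so each
        # timestamp belongs to exactly one window.
        window_start = event["time"] - event["time"] % WINDOW_SIZE_MS
        counts[(event["user_id"], window_start)] += 1
    # HAVING COUNT(url) > 20: groups are filtered after aggregation.
    return {key: n for key, n in counts.items() if n > min_count}


# One click per second for a minute: 30 clicks land in each of two windows.
events = [
    {"time": 1_000 * i, "user_id": "mjsax", "url": "/index", "bytes": 2048}
    for i in range(60)
]
result = windowed_click_counts(events)
```

The result is keyed by user and window start, which mirrors why the query creates a TABLE rather than a STREAM: each (user, window) pair has one current count.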

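The enrichment join on slide 15 can be sketched the same way. This is an illustrative Python simulation, not KSQL internals: the table is modeled as a dict holding the latest row per `user_id`, and each click is looked up against that dict when it arrives. The function name `vip_actions`, the `("user", row)` / `("click", row)` record tagging, and the presence of a `fullname` field on the user row (the slide 15 query selects it) are assumptions made for the example.

```python
def vip_actions(records):
    """Join click events against the latest known user row.

    `records` is an iterable of (kind, row) tuples in arrival order, where
    kind is "user" (a table update) or "click" (a stream event).
    Returns the enriched clicks of users whose level is 'Platinum'.
    """
    users = {}   # materialized table: latest row per user_id
    output = []  # the resulting stream
    for kind, row in records:
        if kind == "user":
            # A table update overwrites the previous row for that key.
            users[row["user_id"]] = row
        elif kind == "click":
            # LEFT JOIN lookup: None when the user is not (yet) in the table.
            user = users.get(row["user_id"])
            # WHERE u.level = 'Platinum' drops unmatched clicks as well,
            # because their level would be NULL.
            if user is not None and user["level"] == "Platinum":
                output.append({
                    "user_id": row["user_id"],
                    "fullname": user["fullname"],
                    "url": row["url"],
                    "status": row["status"],
                })
    return output


records = [
    ("user", {"user_id": "u1", "fullname": "Ann", "level": "Platinum"}),
    ("click", {"user_id": "u1", "url": "/a", "status": 200}),
    ("click", {"user_id": "u2", "url": "/b", "status": 200}),
    ("user", {"user_id": "u1", "fullname": "Ann", "level": "Gold"}),
    ("click", {"user_id": "u1", "url": "/c", "status": 404}),
]
enriched = vip_actions(records)
```

Only the first click is emitted: the second has no matching user, and the third arrives after the user row was updated to a non-Platinum level, which shows that each click is joined against the table's state at the time the click is processed.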