Kafka Connect and Streams (Concepts, Architecture, Features)
17 Sep 2018•0 j'aime•1,803 vues
Signaler
Technologie
High level introduction to Kafka Connect and Kafka Streams, two components of the Apache Kafka open source framework. See the concepts, architecture and features.
Kafka Connect and Streams (Concepts, Architecture, Features)
1. Kafka Connect and Kafka Streams
The Rise of Apache Kafka as Streaming Platform
Kai Waehner
Technology Evangelist
kontakt@kai-waehner.de
LinkedIn
@KaiWaehner
www.confluent.io
www.kai-waehner.de
21. 21
Single Message Transforms
•Mask sensitive information
•Add identifiers
•Tag events
•Lineage/provenance
•Remove unnecessary
columns
•Route high priority events to
faster data stores
•Direct events to different
Elasticsearch indexes
•Cast data types to match
destination
•Remove unnecessary
columns
Modify events before storing in Kafka: Modify events going out of Kafka:
22. 22
• InsertField – Add a field using either static data or record metadata
• ReplaceField – Filter or rename fields
• MaskField – Replace field with valid null value for the type (0, empty string, etc)
• ValueToKey – Set the key to one of the value’s fields
• HoistField – Wrap the entire event as a single field inside a Struct or a Map
• ExtractField – Extract a specific field from Struct and Map and include only this field in results
• SetSchemaMetadata – modify the schema name or version
• TimestampRouter – Modify the topic of a record based on original topic and timestamp. Useful
when using a sink that needs to write to different tables or indexes based on timestamps
• RegexpRouter – modify the topic of a record based on original topic, replacement string and a
regular expression
• „Build your own“ – A Transformation is just a Java Class
Built-in Transformations
37. 37
What if you are NOT a Java Coder?
Population
CodingSophistication
Realm of Stream Processing
New, Expanded Realm
BI
Analysts
Core
Developers
Data
Engineers
Core Developers
who don’t like
Java
Java
KSQL