An introduction to Apache Kafka and the Kafka Connect APIs (part of Apache Kafka), in particular how Kafka can be used together with Elasticsearch.
Thanks to Seacom for inviting us to the event in Rome.
2. 2
Vision of a streaming enterprise
Elasticsearch
NoSQL
RDBMS
Monitoring
Mainframes
Real-time Analytics
Data Warehouse
Apps
Microservices
Hadoop
Streaming Platform
(Powered by Apache Kafka)
3. 3
Confluent and Elastic share similar values
• Distributed and fault tolerant
• Horizontally scalable
• Low latency
• Open source
• Enterprise grade solutions
4. 4
Common Kafka use cases
Data transport and integration
• Log data
• Sensors and device data
• Monitoring streams
• Call data records
• Stock ticker data
• Customer 360
Real-time stream processing
• Monitoring
• Asynchronous applications
• Fraud and security
5. 5
From big data to fast data
Big data was: the more the better
(chart: Value of Data vs. Volume of Data)
Stream data is: the faster the better
(chart: Value of Data vs. Age of Data)
Stream data can be: big or fast (Lambda)
(diagram: DB, Streams, Hadoop, Speed Table, Batch Table)
Stream data will be: big AND fast (Kappa)
(diagram: DB, Streams, Job 1, Job 2, Table 1, Table 2)
Apache Kafka is the enabling technology of this transition
6. 6
Confluent Platform
Open Source
External
Enterprise
Monitoring
Analytics
Custom Apps
Transformations
Real-time Applications
…
CRM
Data Warehouse
Database
Hadoop
Data Integration
…
Control Center
Auto Data Balancing
Multi-Datacenter Replication
24/7 Support
Supported Connectors
Clients
Schema Registry
REST Proxy
Apache Kafka
Kafka Connect
Kafka Streams
Kafka Core
Database Changes, Log Events, IoT Data, Web Events, …
7. 7
Apache Kafka API – ETL Analogy
Extract: Connect API (source)
Transform: Streams API
Load: Connect API (sink)
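The analogy can be made concrete with a toy pipeline. The sketch below is plain Python with in-memory lists standing in for Kafka topics; it does not call the Connect or Streams APIs, and every name in it (`extract`, `transform`, `load`, the sample rows) is hypothetical and chosen only for illustration.

```python
# In-memory stand-ins for Kafka topics (assumption: a real pipeline uses broker topics).
raw_topic = []
clean_topic = []


def extract(source_rows, topic):
    """Role of a source connector (Connect API): copy rows from an external system into a topic."""
    for row in source_rows:
        topic.append(row)


def transform(in_topic, out_topic):
    """Role of the Streams API: read one topic, filter and reshape, write to another topic."""
    for record in in_topic:
        if record["amount"] > 0:  # drop empty orders
            out_topic.append({"user": record["user"].lower(), "amount": record["amount"]})


def load(topic, sink):
    """Role of a sink connector (Connect API): copy records from a topic into an external system."""
    for record in topic:
        sink[record["user"]] = record


# Hypothetical sample data: rows from a database, and a dict standing in for a search index.
database_rows = [{"user": "Alice", "amount": 10}, {"user": "Bob", "amount": 0}]
search_index = {}

extract(database_rows, raw_topic)
transform(raw_topic, clean_topic)
load(clean_topic, search_index)
print(search_index)
```

Each stage only touches a topic on one side, which is the point of the analogy: sources, transformations, and sinks can be added or replaced independently.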
8. 8
Apache Kafka Connect APIs – Streaming Data Capture
(diagram: sources and sinks, including JDBC, Oracle, MySQL, Elastic, Couchbase, and HDFS, each attached through a connector and the Kafka Connect API to the Kafka pipeline)
Fault tolerant
Manage hundreds of data sources and sinks
Preserves data schema
Part of Apache Kafka project
Integrated within Confluent Platform’s Control Center
9. 9
Confluent Elasticsearch Connector
• Easily move data from Kafka to Elasticsearch
• Open Source, ASL licensed
• Key Features:
• Exactly Once Delivery
• Mapping Inference
• Schema Evolution
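Exactly once delivery into Elasticsearch is usually explained in terms of idempotent writes: each record gets a deterministic document ID, so a redelivered record overwrites itself instead of creating a duplicate. The sketch below illustrates only that idea, with a Python dict standing in for an Elasticsearch index. The helpers `document_id` and `index_record` are hypothetical, and the `topic+partition+offset` ID format is an assumption based on how the connector documentation describes the case where record keys are ignored.

```python
def document_id(topic, partition, offset):
    """Build a deterministic ID from the record's position in Kafka (assumed format)."""
    return f"{topic}+{partition}+{offset}"


def index_record(index, topic, partition, offset, value):
    """Idempotent write: indexing the same record twice leaves a single document."""
    index[document_id(topic, partition, offset)] = value


index = {}  # stand-in for an Elasticsearch index
deliveries = [
    ("orders", 0, 41, {"total": 5}),
    ("orders", 0, 42, {"total": 7}),
    ("orders", 0, 42, {"total": 7}),  # same record redelivered after a simulated retry
]
for topic, partition, offset, value in deliveries:
    index_record(index, topic, partition, offset, value)

print(len(index))
```

Three deliveries produce two documents, because the retried record maps to the ID it already had.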
(diagram: JDBC, Oracle, MySQL, Kafka Connect API, Elasticsearch Connector, Elastic)
Documentation:
http://docs.confluent.io/current/connect/connect-elasticsearch/docs/elasticsearch_connector.html
Source code:
https://github.com/confluentinc/kafka-connect-elasticsearch
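As a rough illustration of how such a connector is typically registered, the sketch below builds a request for the Kafka Connect REST API (`POST /connectors`). The configuration keys follow the connector documentation linked above, but required keys vary between connector versions, so check them against the version in use. The connector name, the topic `logs`, and both `localhost` URLs are assumptions made for the example. The request is only constructed, not sent.

```python
import json
import urllib.request

# Hypothetical connector definition; name, topic, and URLs are placeholders.
connector = {
    "name": "elasticsearch-sink",
    "config": {
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "tasks.max": "1",
        "topics": "logs",                           # Kafka topic(s) to copy
        "connection.url": "http://localhost:9200",  # assumed Elasticsearch address
        "type.name": "kafka-connect",               # used by older connector versions
        "key.ignore": "true",                       # derive document IDs from topic, partition, offset
    },
}


def build_request(connect_url, payload):
    """Prepare a POST to the Connect worker's /connectors endpoint without sending it."""
    return urllib.request.Request(
        url=f"{connect_url}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request("http://localhost:8083", connector)  # assumed Connect worker address
print(request.get_method(), request.full_url)

# To submit against a running Connect worker:
# urllib.request.urlopen(request)
```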
10. 10
Benefits of Kafka Connect APIs: Simple, but Powerful
(diagram: JDBC, Oracle, Mainframes; Kafka Connect API; Elasticsearch Connector, HP Vertica Connector, HDFS Connector; Elastic, HP Vertica, HDFS)
11. 11
Benefits of Kafka Connect APIs: Cross Data Center Replication
Kafka
Kafka Connect API
Confluent Replicator
Kafka Cluster
Data Center A
Data Center B
Low latency, real-time data replication
14. 14
“A simple, scalable and flexible solution that delivers data in real-time.
Enabling real-time data integration with actionable insights.”
True Partnership Focused on Customer Success
• Enterprise grade distribution of Kafka
• Stream processing at scale
• Simple, reliable, secure and auditable
• Fast and scalable
• Easy to operate
• Enterprise grade security