A talk discussing the rise of Apache Kafka and data in motion, plus the impact of cloud native data systems. This talk will cover how Kafka needs to evolve to keep up with the future of cloud, what this means for distributed systems engineers, and what work is being done to make Kafka truly cloud native.
10. Databases are fundamentally incomplete.
Databases are designed for siloed, UI-centric applications.
Data at rest
Slow, daily batch processing
Simple, static real-time queries
Databases
11. A software-defined company requires total connectivity and instant reaction in real time.
12. With only data at rest, a company is fragmented and siloed
PUBLIC CLOUD
LINE OF BUSINESS 01
LINE OF BUSINESS 02
14. The Paradigm for Data in Motion: Event Streams
Data Management = Storage + Flow
Rich front-end customer experiences
Real-time Data
Real-time Stream Processing
Real-time backend operations
QUERY
A Sale, A Shipment, A Trade, A Customer Experience
18. Tables / Streams
User    Payments
Jay     42
Sue     18
Fred    65
...     ...

User    Credit Score
Jay     695
Sue     430
Fred    710

Table versions: V1, V2, V3
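The relationship between tables and streams on this slide can be sketched in plain Python. This is only an illustration, not Kafka or ksqlDB code: the function names `materialize` and `changelog` are hypothetical, and reading V1, V2, V3 as successive versions of the credit score table is an assumption. The scores are the ones shown on the slide.

```python
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, int]  # (user, credit_score) change event


def materialize(stream: Iterable[Event]) -> Dict[str, int]:
    """Replay a stream of change events into a table (latest value per key)."""
    table: Dict[str, int] = {}
    for user, score in stream:
        table[user] = score
    return table


def changelog(before: Dict[str, int], after: Dict[str, int]) -> List[Event]:
    """Derive the change events that turn one table version into the next."""
    return [(user, score) for user, score in after.items()
            if before.get(user) != score]


# Assumed order of arrival for the credit score events.
events: List[Event] = [("Jay", 695), ("Sue", 430), ("Fred", 710)]

v1 = materialize(events[:1])  # table after the first event
v2 = materialize(events[:2])  # table after the second event
v3 = materialize(events)      # table after all three events
```

Replaying the stream yields the table, and comparing two table versions yields the stream of changes between them, which is the duality the slide illustrates.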
19. Traditional Database: Active Query, Passive Data (DB Table)
SELECT * FROM DB_TABLE
Stream Processing: Active Data, Passive Query (Event Stream)
CREATE TABLE T AS SELECT * FROM EVENT_STREAM
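The contrast between the two models can be sketched in plain Python. The classes `DbTable` and `EventStream` below are hypothetical stand-ins, not a database driver or a Kafka client. In the first model the data waits and the query runs once. In the second the query is registered once and every new event flows through it.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class DbTable:
    """Passive data: rows sit still until a query is run against them."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def select_all(self) -> List[Row]:
        # Active query: runs once and returns a point-in-time snapshot.
        return list(self.rows)


class EventStream:
    """Active data: each new event is pushed through registered queries."""

    def __init__(self) -> None:
        self._queries: List[Callable[[Row], None]] = []

    def register(self, query: Callable[[Row], None]) -> None:
        # Passive query: declared once, then it waits for data.
        self._queries.append(query)

    def publish(self, event: Row) -> None:
        for query in self._queries:
            query(event)


# Traditional database: the snapshot does not see later writes.
db_table = DbTable()
db_table.rows.append({"user": "Jay", "score": 695})
snapshot = db_table.select_all()
db_table.rows.append({"user": "Sue", "score": 430})

# Stream processing: table T stays current as events arrive.
t: Dict[str, Row] = {}


def update_t(event: Row) -> None:
    t[str(event["user"])] = event


event_stream = EventStream()
event_stream.register(update_t)
event_stream.publish({"user": "Jay", "score": 695})
event_stream.publish({"user": "Sue", "score": 430})
```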
20. Build a complete streaming app with a few SQL statements
(1) Capture events
(2) Perform continuous transformations
(3) Create materialized views
(4) Serve lookups against materialized views
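The slide describes these four steps in terms of SQL statements. The sketch below walks through the same steps in plain Python so the data flow is visible. It is an illustration only: a generator stands in for a Kafka topic, and the event fields, the `TotalsView` class, and the extra payment for Jay are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Optional

Event = Dict[str, object]


def capture() -> Iterator[Event]:
    """Step 1: capture events (a fixed sequence stands in for a topic)."""
    yield {"user": "Jay", "amount": 42}
    yield {"user": "Sue", "amount": 18}
    yield {"user": "Fred", "amount": 65}
    yield {"user": "Jay", "amount": 8}  # hypothetical extra payment


def transform(events: Iterable[Event]) -> Iterator[Event]:
    """Step 2: continuous transformation, applied to each event as it arrives."""
    for event in events:
        amount = int(event["amount"])
        if amount > 0:  # drop non-positive amounts
            yield {"user": str(event["user"]).lower(), "amount": amount}


class TotalsView:
    """Step 3: a materialized view holding the running total per user."""

    def __init__(self) -> None:
        self._totals: Dict[str, int] = defaultdict(int)

    def apply(self, event: Event) -> None:
        self._totals[str(event["user"])] += int(event["amount"])

    def lookup(self, user: str) -> Optional[int]:
        """Step 4: serve a key lookup against the view."""
        return self._totals.get(user)


view = TotalsView()
for transformed in transform(capture()):
    view.apply(transformed)
```

The view is updated incrementally per event, so a lookup reads an already computed value instead of scanning the event history.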
23. The Rise of Cloud & Cloud Native Data Services
24. Over the next four years, $140B+ in IT spend will move to the cloud.
25. Cloud Native Data Systems
Aurora, DynamoDB, Kinesis, S3, Spanner, Snowflake
26. What is a Cloud-Native Data System?
Elastic
Usage-based Cost Model
Infinite
API-driven Operations
Secure and Reliable
Serverless
Global
Multitenant
39. For Kafka to Thrive There Must Be Cloud Native Kafka Services
40. The Capabilities of a Cloud-Native Data System
Elastic
Usage-based Cost Model
Infinite
API-driven Operations
Secure and Reliable
Serverless
Global
Multitenant
51. Data Governance meets Data Discovery
Self-service Platform
Security
Data Catalog
Data Lineage
Data Policies
Data Quality
52. Confluent Cloud Data Governance
Data Quality: Increase data trust
● Enterprise-ready Schema Registry
● Schema management UI
● Broker-side schema ID validation
Data Catalog: Classify, organize, discover
● Search and discover schema metadata
● Manage data classifications
● Classify schemas with tags
Data Lineage: Turn data visibility on
● Visualize complex data in motion pipelines
● Audit data movement across systems
NOW IN EARLY-ACCESS
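To make "broker-side schema ID validation" concrete, here is a minimal Python sketch of the idea. It relies on the Schema Registry wire format, in which a serialized message starts with a zero magic byte followed by a 4-byte big-endian schema ID. Everything else is a simplifying assumption: the function names are hypothetical, and the registered IDs are modeled as a plain set rather than a lookup against a real registry and subject.

```python
import struct
from typing import Optional, Set

MAGIC_BYTE = 0  # first byte of a Schema Registry framed message


def frame(schema_id: int, payload: bytes) -> bytes:
    """Prefix a payload with the magic byte and a 4-byte big-endian schema ID."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload


def extract_schema_id(message: bytes) -> Optional[int]:
    """Return the schema ID from the header, or None if the framing is wrong."""
    if len(message) < 5 or message[0] != MAGIC_BYTE:
        return None
    (schema_id,) = struct.unpack(">I", message[1:5])
    return schema_id


def broker_accepts(message: bytes, registered_ids: Set[int]) -> bool:
    """Simplified check: accept only messages whose schema ID is registered."""
    schema_id = extract_schema_id(message)
    return schema_id is not None and schema_id in registered_ids


registered = {7}  # hypothetical set of schema IDs valid for a topic
good = frame(7, b"payload")
unknown = frame(9, b"payload")
unframed = b"plain text with no schema header"
```

Performing the check on the broker means a misconfigured producer cannot write unvalidated data to the topic, regardless of which client library it uses.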
56. Global: Cluster linking
Keep Data in Sync Globally to serve regional needs
● Create a global fabric for event streams that spans the globe
● Span cloud environments and on premises
● Enable dynamic, API-driven replication topologies
● Exact offset mirroring between clusters
● No additional moving parts to manage or monitor
Currently in Preview
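Why exact offset mirroring matters can be shown with a small Python model. This is not how Cluster Linking is implemented. `PartitionLog`, `mirror_exact`, and `copy_by_reproducing` are hypothetical names, and the starting offset of 100 is an assumed example of a source log whose older records have already been removed.

```python
from typing import List


class PartitionLog:
    """A toy partition: records addressed by a monotonically increasing offset."""

    def __init__(self, start_offset: int = 0) -> None:
        self.start_offset = start_offset
        self.records: List[str] = []

    def append(self, value: str) -> int:
        """Append a record and return the offset it was assigned."""
        self.records.append(value)
        return self.start_offset + len(self.records) - 1

    def read(self, offset: int) -> str:
        index = offset - self.start_offset
        if index < 0 or index >= len(self.records):
            raise IndexError(f"offset {offset} not in log")
        return self.records[index]


def mirror_exact(source: PartitionLog) -> PartitionLog:
    """Copy records so each one keeps the offset it had on the source."""
    dest = PartitionLog(start_offset=source.start_offset)
    for value in source.records:
        dest.append(value)
    return dest


def copy_by_reproducing(source: PartitionLog) -> PartitionLog:
    """Copy by re-producing: the destination assigns fresh offsets from 0."""
    dest = PartitionLog()
    for value in source.records:
        dest.append(value)
    return dest


source = PartitionLog(start_offset=100)
for value in ("a", "b", "c"):
    source.append(value)

committed = 101  # a consumer's position on the source cluster
mirrored = mirror_exact(source)
reproduced = copy_by_reproducing(source)
```

With exact mirroring, the consumer's committed offset of 101 points at the same record on both clusters. After a copy by re-producing, that offset does not exist on the destination and would need to be translated.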
58. Confluent for Kubernetes
Introducing a Declarative, API-driven Control Plane to deploy and manage Confluent in Private Infrastructures.
AVAILABLE NOW!
Runs on Kubernetes: the infrastructure runtime for cloud-native architectures.
Declarative API for operating Confluent in production
Integrates with Cloud-Native ecosystem for Security, Reliability, DevOps Automation
Manage topics and RBAC policies through Infrastructure as Code
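A declarative control plane generally works by comparing the state declared in configuration with the state that actually exists, then applying the difference. The Python sketch below illustrates that reconciliation pattern for topics. It is not the Confluent for Kubernetes API: the `reconcile` function, the topic names, and the spec fields are all hypothetical.

```python
from typing import Dict, List, Tuple

TopicSpec = Dict[str, int]
Action = Tuple[str, str]  # (verb, topic name)


def reconcile(desired: Dict[str, TopicSpec],
              actual: Dict[str, TopicSpec]) -> List[Action]:
    """Compute the actions that move the actual state to the desired state."""
    actions: List[Action] = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Desired state, as it might be declared in version-controlled config.
desired_topics = {
    "orders": {"partitions": 6},
    "payments": {"partitions": 3},
}

# Actual state, as observed on a cluster.
actual_topics = {
    "orders": {"partitions": 3},
    "legacy": {"partitions": 1},
}

plan = reconcile(desired_topics, actual_topics)
```

Because the plan is derived from the declared state, running the reconciliation again after it has been applied produces no further actions.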
59. Not Just About One Cloud Service, Kafka Itself Improves