Architecting Microservices
Applications with Instant Analytics
Tim Berglund, Sr. Director, Developer Experience, Confluent
Rachel Pedreschi, Sr. Director, Global Field Engineering, Imply Data
What the heck is Apache
Kafka and Why Should I Care?
[Diagram sequence: a producer writes key/value (K/V) records to a Partitioned Topic (partition 0, partition 1, partition 2); consumer A and consumer B read from the topic, and further instances of consumer A are added.]
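The diagrams above appear to illustrate two properties of a partitioned topic: records with the same key land in the same partition, and within a consumer group each partition is read by exactly one instance, so instances beyond the partition count sit idle. The following is a minimal plain-Java sketch of that idea, not Kafka's actual partitioner or assignor. The class name `PartitionSketch`, the instance names `A-1`..`A-4`, the key `movie-42`, the use of `String.hashCode`, and the round-robin assignment are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionSketch {

    // Pick a partition for a record key. Illustrative only: this uses
    // String.hashCode, which is NOT the hash Kafka's default partitioner uses.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Hand out partitions to the members of one consumer group, round-robin.
    // Every partition gets exactly one owner; a member may own zero partitions.
    static Map<String, List<Integer>> assign(List<String> members, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String member : members) {
            assignment.put(member, new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            String owner = members.get(p % members.size());
            assignment.get(owner).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // The same key always maps to the same partition.
        System.out.println(partitionFor("movie-42", 3) == partitionFor("movie-42", 3)); // true

        // One instance of consumer A owns all three partitions.
        System.out.println(assign(Arrays.asList("A-1"), 3)); // {A-1=[0, 1, 2]}

        // Two instances split the partitions between them.
        System.out.println(assign(Arrays.asList("A-1", "A-2"), 3)); // {A-1=[0, 2], A-2=[1]}

        // A fourth instance has nothing to read from a three-partition topic.
        System.out.println(assign(Arrays.asList("A-1", "A-2", "A-3", "A-4"), 3));
        // {A-1=[0], A-2=[1], A-3=[2], A-4=[]}
    }
}
```

A different consumer group (consumer B in the diagrams) would get its own independent assignment over the same partitions, which is why adding instances of A does not take data away from B.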
[Diagram sequence: Stream Processing apps alongside rdbms, nosql, and dwh/hadoop; consumer A scaled out to many instances reading the Partitioned Topic (partition 0, partition 1, partition 2) next to consumer B; three Streams Application instances.]
public static void main(String[] args) {
    Properties streamsConfiguration = getProperties(SCHEMA_REGISTRY_URL);

    // Every Avro serde needs to know where the Schema Registry lives.
    final Map<String, String> serdeConfig =
        Collections.singletonMap(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG,
            SCHEMA_REGISTRY_URL);

    final SpecificAvroSerde<Movie> movieSerde = getMovieAvroSerde(serdeConfig);
    final SpecificAvroSerde<Rating> ratingSerde = getRatingAvroSerde(serdeConfig);
    final SpecificAvroSerde<RatedMovie> ratedMovieSerde = new SpecificAvroSerde<>();
    // ratingSerde is already configured by getRatingAvroSerde; configure the new serde instead.
    ratedMovieSerde.configure(serdeConfig, false);

    // Build the topology: average ratings first, then join them with movies.
    StreamsBuilder builder = new StreamsBuilder();
    KTable<Long, Double> ratingAverage = getRatingAverageTable(builder);
    getRatedMoviesTable(builder, ratingAverage, movieSerde);

    Topology topology = builder.build();
    KafkaStreams streams = new KafkaStreams(topology, streamsConfiguration);

    // Close the Streams instance cleanly when the JVM shuts down.
    Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    streams.start();
}
private static SpecificAvroSerde<Rating> getRatingAvroSerde(Map<String, String> serdeConfig) {
    final SpecificAvroSerde<Rating> ratingSerde = new SpecificAvroSerde<>();
    ratingSerde.configure(serdeConfig, false);
    return ratingSerde;
}
public static SpecificAvroSerde<Movie> getMovieAvroSerde(Map<String, String> serdeConfig) {
    final SpecificAvroSerde<Movie> movieSerde = new SpecificAvroSerde<>();
    movieSerde.configure(serdeConfig, false);
    return movieSerde;
}
public static KTable<Long, String> getRatedMoviesTable(StreamsBuilder builder,
                                                       KTable<Long, Double> ratingAverage,
                                                       SpecificAvroSerde<Movie> movieSerde) {
    // Parse raw movie strings, re-key them by movie ID, and write them to "movies".
    builder.stream("raw-movies", Consumed.with(Serdes.Long(), Serdes.String()))
        .mapValues(Parser::parseMovie)
        .map((key, movie) -> new KeyValue<>(movie.getMovieId(), movie))
        .to("movies", Produced.with(Serdes.Long(), movieSerde));

    // Read "movies" back as a table backed by the "movies-store" state store.
    KTable<Long, Movie> movies = builder.table("movies",
        Materialized
            .<Long, Movie, KeyValueStore<Bytes, byte[]>>as(
                "movies-store")
            .withValueSerde(movieSerde)
            .withKeySerde(Serdes.Long())
    );

    // Join average ratings with movies by movie ID.
    KTable<Long, String> ratedMovies = ratingAverage
        .join(movies, (avg, movie) -> movie.getTitle() + "=" + avg);

    ratedMovies.toStream().to("rated-movies", Produced.with(Serdes.Long(), Serdes.String()));
    return ratedMovies;
}
public static KTable<Long, Double> getRatingAverageTable(StreamsBuilder builder) {
    KStream<Long, String> rawRatings = builder.stream("raw-ratings",
        Consumed.with(Serdes.Long(),
            Serdes.String()));

    // Parse each raw rating and re-key it by movie ID.
    KStream<Long, Rating> ratings = rawRatings.mapValues(Parser::parseRating)
        .map((key, rating) -> new KeyValue<>(rating.getMovieId(), rating));

    KStream<Long, Double> numericRatings = ratings.mapValues(Rating::getRating);
    KGroupedStream<Long, Double> ratingsById = numericRatings.groupByKey();

    // Per-movie count and sum; the average is their quotient.
    KTable<Long, Long> ratingCounts = ratingsById.count();
    KTable<Long, Double> ratingSums = ratingsById.reduce((v1, v2) -> v1 + v2);

    KTable<Long, Double> ratingAverage = ratingSums.join(ratingCounts,
        (sum, count) -> sum / count.doubleValue(),
        Materialized.as("average-ratings"));

    ratingAverage.toStream()
        /*.peek((key, value) -> { // debug only
            System.out.println("key = " + key + ", value = " + value);
        })*/
        .to("average-ratings");
    return ratingAverage;
}
CREATE TABLE movie_ratings AS
SELECT title,
SUM(rating)/COUNT(rating) AS avg_rating,
COUNT(rating) AS num_ratings
FROM ratings
LEFT OUTER JOIN movies
ON ratings.movie_id = movies.movie_id
GROUP BY title;
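The query above keeps a continuously updated table: each new rating changes one title's sum and count, and the average follows from those two values. A minimal plain-Java sketch of that bookkeeping is shown here, with no Kafka or KSQL involved. The class name `RunningAverageSketch`, its methods, and the titles "Movie A" and "Movie B" are hypothetical, and the sketch assumes each rating has already been joined to its title.

```java
import java.util.HashMap;
import java.util.Map;

public class RunningAverageSketch {

    // Per-title state, mirroring SUM(rating) and COUNT(rating).
    private final Map<String, Double> sums = new HashMap<>();
    private final Map<String, Long> counts = new HashMap<>();

    // Fold one rating into the state and return the title's updated average.
    double add(String title, double rating) {
        double sum = sums.merge(title, rating, Double::sum);
        long count = counts.merge(title, 1L, Long::sum);
        return sum / count;
    }

    // Number of ratings seen so far for a title (num_ratings in the query).
    long numRatings(String title) {
        return counts.getOrDefault(title, 0L);
    }

    public static void main(String[] args) {
        RunningAverageSketch table = new RunningAverageSketch();
        System.out.println(table.add("Movie A", 8.0)); // 8.0
        System.out.println(table.add("Movie A", 9.0)); // 8.5
        System.out.println(table.add("Movie B", 6.0)); // 6.0
        System.out.println(table.numRatings("Movie A")); // 2
    }
}
```

Keeping the sum and the count as separate state is what makes the average cheap to update per event; the Streams code earlier does the same with `ratingSums` and `ratingCounts`.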
[Diagram: a producer and a consumer connected through a KSQL Cluster made up of KSQL Servers.]
[Diagram sequence: orders, shipping, users, and warehouse services with an order web UI and a users web UI; the second build adds druid and product analytics.]
What the heck is Apache
Druid and Why Should I Care?
[Diagram: Data Sources; ETL (Some Code); Data Warehouse (Usually an RDBMS); Analytics, Reporting, Data mining, Querying.]
[Diagram: Data; Map/reduce; ELT; Data Lake (HDFS); Data Warehouse (RDBMS / NoSQL); ML/AI Engine; Search system; Reporting and Analytics.]
[Diagram: Data Sources; Message bus; Data Lake; Streaming OLAP.]
Typical Big Data++ Challenges
● Scale: when data is large, we need a lot of servers
● Speed: aiming for sub-second response time
● Complexity: too much fine grain to precompute
● High dimensionality: 10s or 100s of dimensions
● Concurrency: many users and tenants
● Freshness: load from streams
Search platform
● Real-time ingestion
● Flexible schema
● Full text search
OLAP
● Batch ingestion
● Efficient storage
● Fast analytic queries
Timeseries database
● Optimized storage for time-based datasets
● Time-based functions
Druid combines all three: a high performance analytics database for event-driven data
Druid Use Cases in the Wild
1. Digital Advertising: Publishers, Advertisers, Exchanges
2. User Event Analytics: Clickstream, QoS, Usage
3. Network Telemetry
4. Lots and Lots of Data: IoT, Product Analytics, Fraud
Gratuitous Customer Quote
“The performance is great ... some of the tables that we have internally in Druid have billions and billions of events in them, and we’re scanning them in under a second.”
Source: https://www.infoworld.com/article/2949168/hadoop/yahoo-struts-its-hadoop-stuff.html
You can Druid too!
Druid community site: https://druid.apache.org/
Imply distribution: https://imply.io/get-started
Contribute: https://github.com/apache/druid

The Future of Application Development - API Days - Melbourne 2023
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streams
 

Dernier

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 

Dernier (20)

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 

Architecting Microservices Applications with Instant Analytics

  • 1. Architecting Microservices Applications with Instant Analytics. Tim Berglund, Sr. Director, Developer Experience, Confluent; Rachel Pedreschi, Sr. Director, Global Field Engineering, Imply Data
  • 2. What the heck is Apache Kafka and Why Should I Care?
  • 3-8. [Diagram: key/value (K, V) records]
  • 9. [Diagram: Partitioned Topic (partition 0, partition 1, partition 2) with producer]
  • 10. [Diagram: Partitioned Topic (partition 0, partition 1, partition 2) with consumer A]
  • 11-12. [Diagram: Partitioned Topic (partition 0, partition 1, partition 2) with consumer A and consumer B]
  • 13. [Diagram: Partitioned Topic (partition 0, partition 1, partition 2) with two instances of consumer A and consumer B]
  • 14, 17. [Diagram: Partitioned Topic (partition 0, partition 1, partition 2) with three instances of consumer A and consumer B]
  • 20-22.
      public static void main(String args[]) {
          Properties streamsConfiguration = getProperties(SCHEMA_REGISTRY_URL);

          final Map<String, String> serdeConfig =
              Collections.singletonMap(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG,
                                       SCHEMA_REGISTRY_URL);

          final SpecificAvroSerde<Movie> movieSerde = getMovieAvroSerde(serdeConfig);
          final SpecificAvroSerde<Rating> ratingSerde = getRatingAvroSerde(serdeConfig);
          final SpecificAvroSerde<RatedMovie> ratedMovieSerde = new SpecificAvroSerde<>();
          ratingSerde.configure(serdeConfig, false);

          StreamsBuilder builder = new StreamsBuilder();
          KTable<Long, Double> ratingAverage = getRatingAverageTable(builder);
          getRatedMoviesTable(builder, ratingAverage, movieSerde);

          Topology topology = builder.build();
          KafkaStreams streams = new KafkaStreams(topology, streamsConfiguration);
          Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
          streams.start();
      }

      private static SpecificAvroSerde<Rating> getRatingAvroSerde(Map<String, String> serdeConfig) {
          final SpecificAvroSerde<Rating> ratingSerde = new SpecificAvroSerde<>();
          ratingSerde.configure(serdeConfig, false);
          return ratingSerde;
      }

      public static SpecificAvroSerde<Movie> getMovieAvroSerde(Map<String, String> serdeConfig) {
          final SpecificAvroSerde<Movie> movieSerde = new SpecificAvroSerde<>();
          movieSerde.configure(serdeConfig, false);
          return movieSerde;
      }

      public static KTable<Long, String> getRatedMoviesTable(StreamsBuilder builder,
              KTable<Long, Double> ratingAverage,
              SpecificAvroSerde<Movie> movieSerde) {
          builder.stream("raw-movies", Consumed.with(Serdes.Long(), Serdes.String()))
              .mapValues(Parser::parseMovie)
              .map((key, movie) -> new KeyValue<>(movie.getMovieId(), movie))
              .to("movies", Produced.with(Serdes.Long(), movieSerde));

          KTable<Long, Movie> movies = builder.table("movies",
              Materialized.<Long, Movie, KeyValueStore<Bytes, byte[]>>as("movies-store")
                  .withValueSerde(movieSerde)
                  .withKeySerde(Serdes.Long()));

          KTable<Long, String> ratedMovies = ratingAverage
              .join(movies, (avg, movie) -> movie.getTitle() + "=" + avg);

          ratedMovies.toStream().to("rated-movies", Produced.with(Serdes.Long(), Serdes.String()));
          return ratedMovies;
      }

      public static KTable<Long, Double> getRatingAverageTable(StreamsBuilder builder) {
          KStream<Long, String> rawRatings = builder.stream("raw-ratings",
              Consumed.with(Serdes.Long(), Serdes.String()));
          KStream<Long, Rating> ratings = rawRatings.mapValues(Parser::parseRating)
              .map((key, rating) -> new KeyValue<>(rating.getMovieId(), rating));

          KStream<Long, Double> numericRatings = ratings.mapValues(Rating::getRating);
          KGroupedStream<Long, Double> ratingsById = numericRatings.groupByKey();

          KTable<Long, Long> ratingCounts = ratingsById.count();
          KTable<Long, Double> ratingSums = ratingsById.reduce((v1, v2) -> v1 + v2);

          KTable<Long, Double> ratingAverage = ratingSums.join(ratingCounts,
              (sum, count) -> sum / count.doubleValue(),
              Materialized.as("average-ratings"));

          ratingAverage.toStream()
              /*.peek((key, value) -> { // debug only
                  System.out.println("key = " + key + ", value = " + value);
              })*/
              .to("average-ratings");
          return ratingAverage;
      }
  • 23.
      CREATE TABLE movie_ratings AS
        SELECT title,
               SUM(rating)/COUNT(rating) AS avg_rating,
               COUNT(rating) AS num_ratings
        FROM ratings
        LEFT OUTER JOIN movies
          ON ratings.movie_id = movies.movie_id
        GROUP BY title;
  • 26. [Diagram labels: orders, shipping, users, warehouse, order web UI, users web UI, druid, product analytics]
  • 27. What the heck is Apache Druid and Why Should I Care?
  • 32. [Diagram labels: Data Sources, ETL, Data Warehouse, Some Code, Usually an RDBMS, Analytics, Reporting, Data mining, Querying]
  • 35. [Diagram labels: Data, Map/reduce, Reporting and Analytics, ELT, Data Warehouse, ML/AI Engine, Search system, Data Lake, HDFS, RDBMS / NoSQL]
  • 40. Typical Big Data++ Challenges
      ● Scale: when data is large, we need a lot of servers
      ● Speed: aiming for sub-second response time
      ● Complexity: too much fine grain to precompute
      ● High dimensionality: 10s or 100s of dimensions
      ● Concurrency: many users and tenants
      ● Freshness: load from streams
  • 41.
      Search platform: real-time ingestion, flexible schema, full text search
      OLAP: batch ingestion, efficient storage, fast analytic queries
      Timeseries database: optimized storage for time-based datasets, time-based functions
  • 42. The same search platform, OLAP, and timeseries database feature lists, with the caption: high performance analytics database for event-driven data
  • 43. Druid Use Cases in the Wild
      1. Digital Advertising: Publishers, Advertisers, Exchanges
      2. User Event Analytics: Clickstream, QoS, Usage
      3. Network Telemetry
      4. Lots and Lots of Data: IoT, Product Analytics, Fraud
  • 44. Gratuitous Customer Quote: “The performance is great ... some of the tables that we have internally in Druid have billions and billions of events in them, and we’re scanning them in under a second.” Source: https://www.infoworld.com/article/2949168/hadoop/yahoo-struts-its-hadoop-stuff.html
  • 45. [Diagram labels: orders, shipping, users, warehouse, order web UI, users web UI, druid, product analytics]
  • 47. You can Druid too!
      Druid community site: https://druid.apache.org/
      Imply distribution: https://imply.io/get-started
      Contribute: https://github.com/apache/druid
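
The rating-average step in the Kafka Streams code (count per movie, sum per movie, then join the two to get `sum / count`) can be tried without a Kafka cluster. The following is a minimal plain-Java sketch, not the presenters' code: the class `RunningAverage` and its methods `addRating` and `average` are hypothetical names, and in-memory maps stand in for the `ratingCounts` and `ratingSums` tables.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalDouble;

/** Hypothetical in-memory stand-in for the count/sum/join step on the slides. */
public class RunningAverage {
    // Plays the role of ratingCounts (KTable<Long, Long> on the slides).
    private final Map<Long, Long> counts = new HashMap<>();
    // Plays the role of ratingSums (KTable<Long, Double> on the slides).
    private final Map<Long, Double> sums = new HashMap<>();

    /** Folds one rating into the per-movie count and sum, like count() and reduce(). */
    public void addRating(long movieId, double rating) {
        counts.merge(movieId, 1L, Long::sum);
        sums.merge(movieId, rating, Double::sum);
    }

    /** Joins sum and count for one movie, mirroring sum / count.doubleValue(). */
    public OptionalDouble average(long movieId) {
        Long count = counts.get(movieId);
        Double sum = sums.get(movieId);
        if (count == null || sum == null) {
            // No ratings seen yet for this movie, so the join produces no row.
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(sum / count.doubleValue());
    }

    public static void main(String[] args) {
        RunningAverage avg = new RunningAverage();
        avg.addRating(1L, 4.0);
        avg.addRating(1L, 5.0);
        avg.addRating(2L, 3.0);
        System.out.println("movie 1 -> " + avg.average(1L).getAsDouble());
        System.out.println("movie 2 -> " + avg.average(2L).getAsDouble());
    }
}
```

Keeping the sum and the count as separate running aggregates means each new rating updates both with constant work, and the average is recomputed from them on demand rather than by rescanning every rating.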