© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Chris Horder – AWS Solutions Architect
3rd July 2019
Stream processing in 2019
Typical credit card transaction
Credit Card
Institution
User
Merchant Mobile
client
Mobile Apps Web Clickstream Application Logs
Metering Records IoT Sensors Smart Cities
[Wed Oct 11 14:32:52 2018] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test
Data is produced continuously
The diminishing value of data over time
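[Figure: value of data to decision-making plotted against the age of the data (real time, seconds, minutes, hours, days, months). Labels along the curve: preventive/predictive, actionable, reactive, historical. Regions: time-critical decisions; traditional "batch" business intelligence. Panel label: information in decision-making.]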
Customer examples
• 1 billion events per week from connected devices
• Near-real-time home valuation (Zestimates)
• Live clickstream dashboards refreshed under 10 sec.
• 50 billion daily ad impressions, sub-50-ms responses
• Facilitate communications between 100+ microservices
• Analyse billions of network flows in real time
• IoT predictive analytics
• Analyse game events in near real time
What do we need from our streaming architecture
• Event driven
• Source for batch
• Consumption models
• Schema flexibility
• Microservices
• Loosely coupled
• Elasticity
• Horizontal scalability
• Operations
Components of a streaming architecture
[Diagram: several producers write to a message buffer that holds Topic A and Topic B; several consumers read from the buffer; a schema repository sits alongside.]
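The producer, message buffer and consumer roles listed above can be illustrated with a toy in-memory sketch. It is not how Kafka or Kinesis are implemented; the class `MessageBuffer`, its methods, and the topic and consumer names are all hypothetical. It only shows the idea that producers append to a topic while each consumer keeps its own read position, so consumers stay loosely coupled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory message buffer (hypothetical): append-only topics read by independent consumers. */
final class MessageBuffer {
    // topic name -> append-only log of messages
    private final Map<String, List<String>> topics = new HashMap<>();
    // "consumer/topic" -> index of the next message that consumer will read
    private final Map<String, Integer> offsets = new HashMap<>();

    /** Producer side: append a message to a topic and return its offset in that topic. */
    public int publish(String topic, String message) {
        List<String> log = topics.computeIfAbsent(topic, t -> new ArrayList<>());
        log.add(message);
        return log.size() - 1;
    }

    /** Consumer side: return every message on the topic that this consumer has not read yet. */
    public List<String> poll(String consumer, String topic) {
        List<String> log = topics.getOrDefault(topic, Collections.<String>emptyList());
        String key = consumer + "/" + topic;
        int from = offsets.getOrDefault(key, 0);
        List<String> batch = new ArrayList<>(log.subList(from, log.size()));
        offsets.put(key, log.size()); // advance only this consumer's position
        return batch;
    }

    public static void main(String[] args) {
        MessageBuffer buffer = new MessageBuffer();
        buffer.publish("topic-a", "payment-1");
        buffer.publish("topic-a", "payment-2");
        buffer.publish("topic-b", "login-1");

        // Each consumer reads the full topic independently of the others.
        System.out.println(buffer.poll("fraud-check", "topic-a"));
        System.out.println(buffer.poll("fraud-check", "topic-a")); // nothing new for this consumer
        System.out.println(buffer.poll("dashboard", "topic-a"));
    }
}
```

Because the read position belongs to the consumer rather than to the message, adding a new consumer does not affect existing ones, which is the main difference from a traditional queue where a message is removed once it is read.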
Checkpoint: Amazon SQS, Amazon Kinesis and Amazon Managed Streaming for Kafka
• Traditional messaging semantics
• Transparent scaling
• Individual message delay
• Dead letter queues
• Multiple consumers
• Native AWS integrations
• Fully managed
• Control over ordering
• Highly configurable retention
• Managed Kafka and Zookeeper
• Existing applications
• Full configurability
• Log compaction
Amazon Simple Queue Service
Amazon Kinesis
Amazon Managed Streaming for Kafka
Kafka on Elastic Compute Cloud (EC2)
Producers
Producers: AWS native
• AWS Database Migration Service
• Kinesis Producer Library
• Kinesis Agent
• AWS IoT
• Amazon Connect
• Amazon Pinpoint
• Amazon DynamoDB Streams
• AWS Tools and SDKs
• Amazon API Gateway
• Amazon EMR
Producers: Partners and SDKs
Brokers
Amazon Kinesis Data Streaming
Collect, process and analyse data streams in real time
• Real-time
• Fully managed
• Scalable
• Secure
• Cost effective
• Amazon Kinesis Data Streams: ingest, store data streams
• Amazon Kinesis Data Analytics: aggregate, filter, enrich data
• Amazon Kinesis Data Firehose: egress data streams
Consumers and destinations shown: Amazon EMR/Spark, custom code on Amazon EC2, Amazon S3, Amazon Redshift, Splunk, AWS Lambda, Amazon Elasticsearch Service
Our Broker: Apache Kafka
Apache Kafka anatomy 101
[Diagram: producers write to a cluster of three brokers, with Zookeeper alongside the cluster; consumers read from the cluster.]
Consumers
Consumers
• Kinesis Client Library
• Kinesis Agent
• AWS Lambda
• Amazon EMR
• Amazon Kinesis Data Analytics
• Amazon Kinesis Data Firehose
• SDKs
Apache Flink
Framework and distributed engine for stateful processing of data streams.
• Simple programming: easy to use and flexible APIs make building apps fast
• High performance: in-memory computing provides low latency & high throughput
• Stateful processing: durable application state saves
• Strong data integrity: exactly-once processing and consistent state
Getting started
Amazon Managed Streaming for Kafka
Amazon Elasticsearch Service
{
  "accountId": "11185",
  "transactionId": 1155776,
  "isDetailAvailable": "true",
  "type": "PAYMENT",
  "status": "POSTED",
  "description": "Credit card payment to Sample Shop",
  "postingDateTime": 1553207924271,
  "valueDateTime": "22-03-2019 09:38:44",
  "executionDateTime": 1553207924271,
  "amount": 767.8318,
  "currency": "AUD",
  "reference": "AKG62DHBB5",
  "merchantName": "Sample shop : Purchase",
  "merchantCategoryCode": "4112"
}
Getting started: Infrastructure
[Diagram: a VPC in the AWS Cloud spanning Availability Zones 1, 2 and 3. Each zone shows a broker instance (Amazon Managed Streaming for Kafka) and a node (Amazon ECS for Kubernetes); JMeter is shown alongside the nodes.]
Sample data
{
  "accountId": "11158",
  "transactionId": "126649",
  "isDetailAvailable": true,
  "type": "PAYMENT",
  "status": "POSTED",
  "description": "Credit card payment to AccountY",
  "postingDateTime": "09-04-2019 17:17:44",
  "valueDateTime": "09-04-2019 17:17:44",
  "executionDateTime": "09-04-2019 17:17:44",
  "amount": 534.24,
  "currency": "AUD",
  "reference": "BF6GCJ2EDE",
  "merchantName": "MerchantA",
  "merchantCategoryCode": "3150"
}
Demo
Flink console
SQL over the stream
// Assumes tableEnv is a Flink table environment with a table registered as "Transactions".
Table result1 = tableEnv.sqlQuery(
    "select transactionId, " +
    " amount, " +
    " accountId, " +
    " currency, " +
    " type " +
    "from Transactions " +
    " where currency = 'BAD' and amount > 995 and type = 'TRANSFER_OUTGOING'"
);
• Familiar syntax
• Joins across streams
• Aggregations
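To make the WHERE clause in the query above concrete, here is a minimal plain-Java sketch of the same three conditions applied to an in-memory batch. It does not use Flink; the classes `TransactionFilter` and `Txn` and the sample rows are hypothetical, and SQL NULL semantics are not modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java equivalent (hypothetical) of the filter expressed by the SQL query. */
final class TransactionFilter {
    /** Same three conditions as the WHERE clause: currency, amount threshold and type. */
    public static boolean matches(Txn t) {
        return "BAD".equals(t.currency)
                && t.amount > 995
                && "TRANSFER_OUTGOING".equals(t.type);
    }

    /** Returns the ids of the transactions in the batch that satisfy the filter. */
    public static List<Long> matchingIds(List<Txn> batch) {
        List<Long> ids = new ArrayList<>();
        for (Txn t : batch) {
            if (matches(t)) {
                ids.add(t.transactionId);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Txn> batch = Arrays.asList(
                new Txn(1L, 1200.00, "11185", "BAD", "TRANSFER_OUTGOING"),
                new Txn(2L, 1200.00, "11185", "AUD", "TRANSFER_OUTGOING"),
                new Txn(3L, 500.00, "11158", "BAD", "TRANSFER_OUTGOING"),
                new Txn(4L, 1200.00, "11158", "BAD", "PAYMENT"));
        // Only transaction 1 satisfies all three conditions.
        System.out.println(matchingIds(batch));
    }
}

/** Hypothetical in-memory stand-in for one row of the Transactions table. */
final class Txn {
    final long transactionId;
    final double amount;
    final String accountId;
    final String currency;
    final String type;

    Txn(long transactionId, double amount, String accountId, String currency, String type) {
        this.transactionId = transactionId;
        this.amount = amount;
        this.accountId = accountId;
        this.currency = currency;
        this.type = type;
    }
}
```

The SQL form applies the same predicate continuously to an unbounded stream, whereas this sketch runs once over a fixed list.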
Aggregations and windowing
Window types: Sliding, Tumbling, Session
Tumbling window example:
select TUMBLE_START(created_date, INTERVAL '20' SECOND) as wStart,
TUMBLE_END(created_date, INTERVAL '20' SECOND) as wEnd,
SUM(amount),
AVG(amount),
COUNT(*)
from Transactions
GROUP BY TUMBLE(created_date, INTERVAL '20' SECOND)
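The tumbling-window query above assigns each transaction to exactly one fixed 20-second bucket and aggregates per bucket. A rough plain-Java sketch of that bucketing follows. It is not Flink: the classes `TumblingWindows` and `WindowStats` are hypothetical, timestamps are plain epoch milliseconds, and event time, watermarks and late data are not modelled.

```java
import java.util.Map;
import java.util.TreeMap;

/** Sketch (hypothetical) of tumbling-window aggregation over timestamped amounts. */
final class TumblingWindows {
    /** Window length matching INTERVAL '20' SECOND in the query. */
    static final long WINDOW_MILLIS = 20_000L;

    /** Start of the window containing the timestamp: windows are [start, start + 20s). */
    public static long windowStart(long timestampMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, WINDOW_MILLIS);
    }

    /** Groups amounts by window start and accumulates sum and count for each window. */
    public static Map<Long, WindowStats> aggregate(long[] timestampsMillis, double[] amounts) {
        if (timestampsMillis.length != amounts.length) {
            throw new IllegalArgumentException("timestamps and amounts must have the same length");
        }
        Map<Long, WindowStats> windows = new TreeMap<>();
        for (int i = 0; i < timestampsMillis.length; i++) {
            long start = windowStart(timestampsMillis[i]);
            windows.computeIfAbsent(start, k -> new WindowStats()).add(amounts[i]);
        }
        return windows;
    }

    public static void main(String[] args) {
        long[] timestamps = {0L, 19_999L, 20_000L, 45_000L};
        double[] amounts = {10.0, 20.0, 30.0, 40.0};
        for (Map.Entry<Long, WindowStats> e : aggregate(timestamps, amounts).entrySet()) {
            WindowStats s = e.getValue();
            System.out.println("wStart=" + e.getKey()
                    + " wEnd=" + (e.getKey() + WINDOW_MILLIS)
                    + " sum=" + s.sum + " avg=" + s.average() + " count=" + s.count);
        }
    }
}

/** Running SUM, COUNT and AVG for one window. */
final class WindowStats {
    double sum;
    long count;

    void add(double amount) {
        sum += amount;
        count++;
    }

    double average() {
        return count == 0 ? 0.0 : sum / count;
    }
}
```

A sliding window would let one event fall into several overlapping buckets, and a session window would close a bucket after a gap in activity; neither is shown here.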
Categorise transactions from a stream
[Diagram. Components shown: mobile client, users, Amazon API Gateway, Amazon DynamoDB, Amazon Managed Streaming for Kafka (shown twice), Amazon SageMaker, Amazon Simple Notification Service.]
Grow with your requirements
[Diagram. Components shown: mobile client, desktop, analyst, Amazon API Gateway, Amazon Managed Streaming for Kafka, AWS Lambda, Amazon EC2, Amazon Simple Storage Service (S3), Amazon Athena, Amazon Redshift, Amazon Elasticsearch Service, Amazon RDS, Amazon Neptune.]
Resources
Data61 - CSIRO
• Standards developed as part of the introduction in Australia of the Consumer Data Right legislation to give Australians greater control over their data
• The Consumer Data Right is intended to apply sector by sector across the whole economy, beginning in the banking, energy and telecommunications sectors.
• https://consumerdatastandardsaustralia.github.io
Resources
• Free online training - https://www.aws.training
• Flink Homepage - https://flink.apache.org/
• AWS Big Data Blog - https://aws.amazon.com/blogs/big-data/
• Real-time bushfire alerting with Complex Event Processing in Apache Flink on Amazon EMR and IoT sensor network
• Confluent - https://www.confluent.io/blog/
• Designing Data-Intensive Applications by Martin Kleppmann

OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Stream Processing in 2019

  • 1. Stream processing in 2019. Chris Horder, AWS Solutions Architect. 3rd July 2019. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 2. Typical credit card transaction: Credit Card Institution, User, Merchant, Mobile client
  • 3. Data is produced continuously. Sources: Mobile Apps, Web Clickstream, Application Logs, Metering Records, IoT Sensors, Smart Cities. Sample log line: [Wed Oct 11 14:32:52 2018] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test
  • 4. The diminishing value of data over time. [Figure: value of data to decision-making plotted against data age (Real Time, Seconds, Minutes, Hours, Days … Months). Stages: Preventive/Predictive, Actionable, Reactive, Historical. Annotations: Information in Decision-Making; Time-critical Decisions; Traditional "Batch" Business Intelligence.]
  • 5. Customer examples: 1 billion events per week from connected devices; near-real-time home valuation (Zestimates); live clickstream dashboards refreshed under 10 sec.; 50 billion daily ad impressions, sub-50-ms responses; facilitate communications between 100+ microservices; analyse billions of network flows in real time; IoT predictive analytics; analyse game events in near real time
  • 6. What do we need from our streaming architecture? Event driven, Source for batch, Consumption models, Schema flexibility, Microservices, Loosely coupled, Elasticity, Horizontal scalability, Operations
  • 7. Components of a streaming architecture. [Diagram: multiple Producers; Message Buffer with Topic A and Topic B; Schema Repository; multiple Consumers.]
  • 8. Checkpoint: Amazon SQS, Amazon Kinesis and Amazon Managed Streaming for Kafka. Points listed: Traditional messaging semantics; Transparent scaling; Individual message delay; Dead letter queues; Multiple consumers; Native AWS integrations; Fully managed; Control over ordering; Highly configurable retention; Managed Kafka and Zookeeper; Existing applications; Full configurability; Log compaction. Services: Amazon Simple Queue Service, Amazon Kinesis, Amazon Managed Streaming for Kafka, Kafka on Elastic Compute Cloud (EC2)
  • 10. Producers: AWS native. AWS Database Migration Service, Kinesis Producer Library, Kinesis Agent, AWS IoT, Amazon Connect, Amazon Pinpoint, Amazon DynamoDB Streams, AWS Tools and SDKs, Amazon API Gateway, Amazon EMR
  • 13. Amazon Kinesis Data Streaming: collect, process and analyse data streams in real-time. Properties: Real-time, Fully managed, Scalable, Secure, Cost effective. Amazon Kinesis Data Streams: ingest, store data streams. Amazon Kinesis Data Analytics: aggregate, filter, enrich data. Amazon Kinesis Data Firehose: egress data streams. Also shown: Amazon EMR/Spark, custom code on Amazon EC2, AWS Lambda, Amazon S3, Amazon Redshift, Amazon Elasticsearch Service, Splunk
  • 15. Apache Kafka anatomy 101. [Diagram: two Producers; a Cluster of three Brokers; Zookeeper; a Consumer.]
  • 17. Consumers: Kinesis Client Library, Kinesis Agent, AWS Lambda, Amazon EMR, Amazon Kinesis Data Analytics, Amazon Kinesis Data Firehose, SDKs
  • 18. Apache Flink: framework and distributed engine for stateful processing of data streams. Simple programming: easy to use and flexible APIs make building apps fast. High performance: in-memory computing provides low latency & high throughput. Stateful processing: durable application state saves. Strong data integrity: exactly-once processing and consistent state
  • 19. Getting started: Amazon Managed Streaming for Kafka, Amazon Elasticsearch Service. Sample record: { "accountId": "11185", "transactionId": 1155776, "isDetailAvailable": "true", "type": "PAYMENT", "status": "POSTED", "description": "Credit card payment to Sample Shop", "postingDateTime": 1553207924271, "valueDateTime": "22-03-2019 09:38:44", "executionDateTime": 1553207924271, "amount": 767.8318, "currency": "AUD", "reference": "AKG62DHBB5", "merchantName": "Sample shop : Purchase", "merchantCategoryCode": "4112" }
  • 20. Getting started: Infrastructure. [Diagram: AWS Cloud VPC across Availability Zones 1, 2 and 3; each zone holds a Broker instance and a Node; JMeter; Amazon ECS for Kubernetes; Amazon Managed Streaming for Kafka.]
  • 21. Sample data: { "accountId": "11158", "transactionId": "126649", "isDetailAvailable": true, "type": "PAYMENT", "status": "POSTED", "description": "Credit card payment to AccountY", "postingDateTime": "09-04-2019 17:17:44", "valueDateTime": "09-04-2019 17:17:44", "executionDateTime": "09-04-2019 17:17:44", "amount": 534.24, "currency": "AUD", "reference": "BF6GCJ2EDE", "merchantName": "MerchantA", "merchantCategoryCode": "3150" }
  • 22. Demo
  • 24. SQL over the stream: Table result1 = tableEnv.sqlQuery("select transactionId, amount, accountId, currency, type " + "from Transactions " + "where currency = 'BAD' and amount > 995 and type = 'TRANSFER_OUTGOING'"); Features: familiar syntax; joins across streams; aggregations
  • 25. Aggregations and windowing. Window types: Sliding, Tumbling, Session. Tumbling example: select TUMBLE_START(created_date, INTERVAL '20' SECOND) as wStart, TUMBLE_END(created_date, INTERVAL '20' SECOND) as wEnd, SUM(amount), AVG(amount), COUNT(*) from Transactions GROUP BY TUMBLE(created_date, INTERVAL '20' SECOND)
  • 26. Categorise transactions from a stream. [Diagram: Users, Mobile client, Amazon API Gateway, Amazon Managed Streaming for Kafka, Amazon SageMaker, Amazon DynamoDB, Amazon Simple Notification Service.]
  • 27. Grow with your requirements. [Diagram: Mobile client, Desktop, Analyst, Amazon API Gateway, Amazon Managed Streaming for Kafka, AWS Lambda, Amazon EC2, Amazon Simple Storage Service (S3), Amazon Athena, Amazon Redshift, Amazon Elasticsearch Service, Amazon RDS, Amazon Neptune.]
  • 28. Grow with your requirements. [Same diagram as slide 27.]
  • 30. Data61 - CSIRO. Standards developed as part of the introduction in Australia of the Consumer Data Right legislation to give Australians greater control over their data. The Consumer Data Right is intended to apply sector by sector across the whole economy, beginning in the banking, energy and telecommunications sectors. https://consumerdatastandardsaustralia.github.io
  • 31. Resources. Free online training: https://www.aws.training; Flink Homepage: https://flink.apache.org/; AWS Big Data Blog: https://aws.amazon.com/blogs/big-data/ (including "Real-time bushfire alerting with Complex Event Processing in Apache Flink on Amazon EMR and IoT sensor network"); Confluent: https://www.confluent.io/blog/; Designing Data-Intensive Applications by Martin Kleppmann
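
Slides 24 and 25 express a filter and a 20-second tumbling-window aggregate in Flink SQL. As a rough illustration of what those two queries compute, here is a minimal plain-Python sketch. It is not Flink code and not the presenter's implementation. The helper names (`is_suspicious`, `tumbling_aggregate`), the use of integer millisecond timestamps for `created_date`, and the three sample records are assumptions made for the example; the field names, the filter thresholds and the 20-second interval come from the slides.

```python
from collections import defaultdict

WINDOW_MS = 20_000  # 20-second tumbling window, as in the slide-25 query


def is_suspicious(txn):
    """Mirror the WHERE clause from slide 24 (hypothetical helper name)."""
    return (
        txn["currency"] == "BAD"
        and txn["amount"] > 995
        and txn["type"] == "TRANSFER_OUTGOING"
    )


def tumbling_aggregate(txns, window_ms=WINDOW_MS):
    """Group records into fixed, non-overlapping windows keyed by created_date.

    Assumes created_date is an integer timestamp in milliseconds.
    Returns one dict per non-empty window, ordered by window start.
    """
    buckets = defaultdict(list)
    for txn in txns:
        # Round the timestamp down to the start of its window.
        start = txn["created_date"] - txn["created_date"] % window_ms
        buckets[start].append(txn["amount"])

    results = []
    for start in sorted(buckets):
        amounts = buckets[start]
        results.append({
            "wStart": start,
            "wEnd": start + window_ms,  # exclusive upper bound
            "sum": sum(amounts),
            "avg": sum(amounts) / len(amounts),
            "count": len(amounts),
        })
    return results


# Made-up records for illustration only.
transactions = [
    {"transactionId": 1, "accountId": "11158", "currency": "AUD",
     "type": "PAYMENT", "amount": 100.0, "created_date": 1_000},
    {"transactionId": 2, "accountId": "11185", "currency": "BAD",
     "type": "TRANSFER_OUTGOING", "amount": 1000.0, "created_date": 19_999},
    {"transactionId": 3, "accountId": "11158", "currency": "AUD",
     "type": "PAYMENT", "amount": 50.0, "created_date": 20_000},
]

flagged = [t["transactionId"] for t in transactions if is_suspicious(t)]
windows = tumbling_aggregate(transactions)
```

The sketch works over a finished list, whereas a stream processor emits each window as time advances and has to decide how to treat late or out-of-order records; none of that is modelled here.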