© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Ganesh Raja
Solutions Architect – Data & Analytics, Amazon Web Services
Real Time Data Ingestion And Analysis
Streams Are Everywhere
• Most data is continuously produced as a stream
• Processing data as it arrives is becoming very popular
• Many diverse applications and use cases
• Streaming Ingest-Transform-Load: deliver data to analytics tools faster and cheaper
• Continuous Metric Generation: compute analytics as the data is generated
• Actionable Insights: react to analytics based on insights
It’s All About The Pace
Batch Processing
• Hourly server logs
• Weekly or monthly bills
• Daily web-site clickstream
• Daily fraud reports
Stream Processing
• Real time metrics
• Real time spending alerts/caps
• Real time clickstream analysis
• Real time detection
The Diminishing Value Of Data
Recent data is highly valuable
• If you act on it in time
• Perishable Insights (M. Gualtieri, Forrester)
Old + recent data is even more valuable
• If you have the means to combine them
Simple Pattern For Streaming Data
Data Producer (e.g. mobile clients)
• Continuously creates data
• Continuously writes data to a stream
• Can be almost anything
Streaming Service (e.g. Amazon Kinesis)
• Durably stores data
• Provides a temporary buffer that prepares data
• Supports very high throughput
Data Consumer (e.g. an Amazon Kinesis app)
• Continuously processes data
• Cleans, prepares, & aggregates data
• Transforms data into information
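The producer, streaming service and consumer roles can be sketched in a few lines of Python. This is only an in-memory illustration: the `deque` stands in for the streaming service's buffer, and the function names and sample page paths are assumptions made for the example, not part of the talk.

```python
from collections import Counter, deque


def produce(stream, events):
    """Data producer: appends raw events to the stream as they are created."""
    for event in events:
        stream.append(event)


def consume(stream):
    """Data consumer: drains the buffer, cleans each record, and aggregates."""
    counts = Counter()
    while stream:
        raw = stream.popleft()
        page = raw.strip().lower()  # clean and normalise the record
        if page:                    # drop empty records
            counts[page] += 1       # aggregate raw data into information
    return counts


stream = deque()  # stands in for the streaming service's temporary buffer
produce(stream, ["/home", " /Home ", "", "/cart"])
page_views = consume(stream)
print(page_views)
```

A real streaming service adds durability and scale-out throughput, which a local queue does not provide.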
Amazon Kinesis
• Amazon Kinesis Data Streams: build custom applications that process and analyse streaming data
• Amazon Kinesis Data Firehose: easily load streaming data into AWS
Amazon Kinesis Data Streams
• Easy administration and low cost
• Build real time applications with a framework of choice
• Secure and durable storage
Amazon Kinesis Data Firehose
• Zero administration and seamless elasticity
• Direct-to-data store integration
• Serverless and continuous data transformations
Anomaly Detection on AWS CloudTrail Logs
Anomaly Detection
Analysing CloudTrail Event Logs
[Architecture diagram]
• Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch events trigger, Amazon Kinesis Data Streams, Amazon S3 bucket for raw data
• Compute operational metrics
• Deliver to a real time dashboard and archive
Ingest And Deliver AWS CloudTrail Events
• AWS CloudTrail provides continuous account activity logging
• Events are sent in near real time to Amazon Kinesis Data Firehose and Data Streams
• Each event includes a timestamp, the AWS IAM user or AWS service name, the API call, the response, and more
[Diagram: Amazon CloudWatch events trigger, Amazon Kinesis Data Streams, Amazon S3 bucket for raw data]
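To make the event contents concrete, here is a minimal sketch that pulls the fields mentioned above out of a single CloudTrail record. The field names (`eventTime`, `eventSource`, `eventName`, `sourceIPAddress`, `userIdentity`, `errorCode`) follow the CloudTrail record format; the helper name `summarise_event` and all sample values are made up for illustration.

```python
import json


def summarise_event(record_json):
    """Extract the timestamp, caller, service, API call and error from one record."""
    record = json.loads(record_json)
    identity = record.get("userIdentity", {})
    return {
        "timestamp": record.get("eventTime"),
        # IAM user name if present, otherwise the calling AWS service or identity type
        "caller": identity.get("userName") or identity.get("invokedBy") or identity.get("type"),
        "service": record.get("eventSource"),
        "api_call": record.get("eventName"),
        "source_ip": record.get("sourceIPAddress"),
        "error": record.get("errorCode"),  # None when the call succeeded
    }


# Illustrative record; the values are invented for this sketch.
sample = json.dumps({
    "eventTime": "2018-04-10T02:15:00Z",
    "eventSource": "s3.amazonaws.com",
    "eventName": "GetObject",
    "sourceIPAddress": "203.0.113.10",
    "userIdentity": {"type": "IAMUser", "userName": "alice"},
    "errorCode": "AccessDenied",
})
summary = summarise_event(sample)
print(summary)
```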
Stream Data To Amazon Kinesis
Automatic ingestion, easy setup, or write your own. Sources shown:
• Amazon VPC Flow Logs
• Elastic Load Balancing
• Amazon RDS
• Amazon CloudWatch Logs
• AWS CloudTrail Event Logs
• Amazon Pinpoint
• Amazon API Gateway
• AWS IoT events
• AWS SDKs
• Amazon DynamoDB
• Amazon Kinesis Agent
• Amazon Kinesis Producer Library
As a proxy:
For change data capture:
Just a sample… there are many more ways to stream data to Amazon Kinesis
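For the "write your own" route with the AWS SDKs, a producer might look roughly like the sketch below. It assumes a client object exposing `put_record(StreamName=..., Data=..., PartitionKey=...)`, which is the shape of the boto3 Kinesis client call; the stream name, the `send_event` helper and the choice of source IP as partition key are assumptions for illustration. A fake client is used so the sketch runs without AWS credentials.

```python
import json


def send_event(client, stream_name, event, key_field="sourceIPAddress"):
    """Serialise one event as JSON and write it to a stream."""
    partition_key = str(event.get(key_field, "unknown"))
    return client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=partition_key,  # determines which shard receives the record
    )


class FakeKinesisClient:
    """Stand-in for a real SDK client; records calls instead of contacting AWS."""

    def __init__(self):
        self.records = []

    def put_record(self, StreamName, Data, PartitionKey):
        self.records.append((StreamName, Data, PartitionKey))
        return {"SequenceNumber": str(len(self.records))}


client = FakeKinesisClient()  # in practice: boto3.client("kinesis")
response = send_event(
    client,
    "cloudtrail-events",  # hypothetical stream name
    {"eventName": "GetObject", "sourceIPAddress": "203.0.113.10"},
)
print(response)
```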
Analysing CloudTrail Event Logs
[Architecture diagram]
• Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch events trigger, Amazon Kinesis Data Streams, Amazon S3 bucket for raw data
• Compute operational metrics: Amazon EMR Data Analytics
• Deliver to a real time dashboard and archive
Compute Operational Metrics In Real Time
Compute metrics using SQL in real time, such as:
• Total calls by IP, service, API call, AWS IAM user
• Amazon S3 API failures (or any other service)
• Anomalous behavior of the Amazon S3 API (or any other service)
• Top 10 API calls across all services
[Diagram: Raw data, Amazon EMR Data Analytics, Real time analytics]
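The talk computes these metrics with SQL; the sketch below shows the same aggregations in plain Python over an in-memory list, purely to illustrate what each metric means. The sample events and helper names are invented, and only the field names follow the CloudTrail record format.

```python
from collections import Counter

# Illustrative events; the values are made up for this sketch.
events = [
    {"sourceIPAddress": "203.0.113.10", "eventSource": "s3.amazonaws.com",
     "eventName": "GetObject", "errorCode": None},
    {"sourceIPAddress": "203.0.113.10", "eventSource": "s3.amazonaws.com",
     "eventName": "GetObject", "errorCode": "AccessDenied"},
    {"sourceIPAddress": "203.0.113.11", "eventSource": "ec2.amazonaws.com",
     "eventName": "DescribeInstances", "errorCode": None},
    {"sourceIPAddress": "203.0.113.10", "eventSource": "s3.amazonaws.com",
     "eventName": "PutObject", "errorCode": None},
]


def total_calls_by(events, field):
    """Total calls grouped by one field (IP, service, API call, user, ...)."""
    return Counter(e.get(field) for e in events)


def failures_for(events, service):
    """Number of failed API calls for one service."""
    return sum(1 for e in events
               if e.get("eventSource") == service and e.get("errorCode"))


def top_api_calls(events, n=10):
    """The n most frequent (service, API call) pairs across all services."""
    return Counter((e["eventSource"], e["eventName"]) for e in events).most_common(n)


calls_by_ip = total_calls_by(events, "sourceIPAddress")
s3_failures = failures_for(events, "s3.amazonaws.com")
top = top_api_calls(events, n=1)
print(calls_by_ip, s3_failures, top)
```

On a stream these aggregates would be computed per time window rather than over the whole data set.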
How Do We Aggregate Streaming Data?
• A common requirement in streaming analytics is to perform set-based operations (count, average, max, min, ...) over events that arrive within a specified period of time
• You cannot simply aggregate over an entire table, as you would with a typical static database
Windowing Concepts
• Windows can be tumbling or sliding
• Windows are of fixed length
[Figure: input events arriving along a time axis (t1 to t6) are grouped into Window1, Window2 and Window3; a Sum aggregate function emits one output event per window (for example 18 and 14).]
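A minimal sketch of both window types, assuming integer timestamps and made-up values (not the numbers from the figure). Here "sliding" is taken to mean fixed-length windows that advance by a step smaller than their length, so consecutive windows overlap; other systems define sliding windows differently.

```python
def tumbling_sum(events, size):
    """Sum values in fixed-length, non-overlapping windows of `size` time units."""
    sums = {}
    for timestamp, value in events:
        window_start = (timestamp // size) * size  # each event falls in exactly one window
        sums[window_start] = sums.get(window_start, 0) + value
    return dict(sorted(sums.items()))


def sliding_sum(events, size, slide):
    """Sum values in fixed-length windows that start every `slide` time units."""
    if not events:
        return {}
    last = max(t for t, _ in events)
    sums = {}
    for start in range(0, last + 1, slide):
        # an event can contribute to several overlapping windows
        sums[start] = sum(v for t, v in events if start <= t < start + size)
    return sums


# (timestamp, value) pairs, invented for this sketch
events = [(0, 1), (1, 5), (2, 4), (3, 2), (4, 6), (5, 8)]
tumbling = tumbling_sum(events, size=2)
sliding = sliding_sum(events, size=3, slide=1)
print(tumbling)
print(sliding)
```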
Analysing CloudTrail Event Logs
[Architecture diagram]
• Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch events trigger, Amazon Kinesis Data Streams, Amazon S3 bucket for raw data
• Compute operational metrics: Amazon EMR Data Analytics
Persist Data For Real Time Dashboards
• Use Amazon Kinesis Data Firehose to archive processed data to Amazon S3
• Use AWS Lambda to deliver data to Amazon DynamoDB (or another database)
• Use open source or other tools to visualise the data
[Diagram: Real time analytics, Amazon Kinesis Data Stream, AWS Lambda function, Amazon DynamoDB Table(s), Redash Dashboard, Amazon S3 bucket for processed data]
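The Lambda step could be sketched as follows. Kinesis-triggered Lambda events carry each record's payload base64-encoded under `Records[].kinesis.data`, and a DynamoDB table resource exposes `put_item(Item=...)`. The helper name `process_records`, the metric fields and the fake table are assumptions for illustration, not the code used in the demo.

```python
import base64
import json


def process_records(event, table):
    """Decode each Kinesis record in a Lambda event and write it to a table."""
    written = 0
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"])
        item = json.loads(payload)
        table.put_item(Item=item)
        written += 1
    return written


class FakeTable:
    """Stand-in for a DynamoDB table so the sketch runs without AWS access."""

    def __init__(self):
        self.items = []

    def put_item(self, Item):
        self.items.append(Item)


# Hypothetical computed metric, encoded the way Kinesis delivers it to Lambda.
metric = {"metric": "s3_failures", "window_start": "2018-04-10T02:15:00Z", "value": 3}
encoded = base64.b64encode(json.dumps(metric).encode("utf-8")).decode("ascii")
event = {"Records": [{"kinesis": {"data": encoded}}]}

table = FakeTable()  # in practice: boto3.resource("dynamodb").Table("<table name>")
written = process_records(event, table)
print(written, table.items)
```

In a deployed function, the handler would call `process_records(event, table)` with a real table object created outside the handler so it is reused across invocations.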
Analysing CloudTrail Event Logs
[Architecture diagram]
• Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch events trigger, Amazon Kinesis Data Streams, Amazon S3 bucket for raw data
• Compute operational metrics: Amazon EMR Data Analytics
• Deliver to a real time dashboard and archive: Amazon Kinesis Data Streams, AWS Lambda function, Amazon DynamoDB Table(s), Chart.JS Dashboard, Amazon S3 bucket for processed data
DEMO
Analyse AWS CloudTrail Logs using Amazon EMR
Invent And Simplify
Amazon Kinesis
• Amazon Kinesis Data Streams: build custom applications that process and analyse streaming data
• Amazon Kinesis Data Firehose: easily load streaming data into AWS
• Amazon Kinesis Data Analytics: easily process and analyse streaming data with standard SQL
Amazon Kinesis Data Analytics
• Powerful real time applications
• Easy to use, fully managed
• Automatic elasticity
Amazon Kinesis Data Analytics Applications
Easily write SQL code to process streaming data
Connect to a streaming source
Continuously deliver SQL results
Analysing AWS CloudTrail Event Logs
[Architecture diagram]
• Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch events trigger, Amazon Kinesis Data Streams, Amazon S3 bucket for raw data
• Compute operational metrics: Amazon EMR Data Analytics
• Deliver to real time dashboards and archival: Amazon Kinesis Data Streams, AWS Lambda function, Amazon DynamoDB Table(s), Chart.JS Dashboard, Amazon S3 bucket for processed data
Analysing AWS CloudTrail Event Logs
[Architecture diagram]
• Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch events trigger, Amazon Kinesis Data Streams, Amazon S3 bucket for raw data
• Compute operational metrics: Amazon Kinesis Data Analytics
• Deliver to real time dashboards and archival: Amazon Kinesis Data Streams, AWS Lambda function, Amazon DynamoDB Table(s), Redash Dashboard, Amazon S3 bucket for processed data
DEMO
Amazon Kinesis Data Analytics
Analysing AWS CloudTrail Event Logs
[Architecture diagram]
• Ingest and deliver raw log data: AWS CloudTrail, Amazon CloudWatch events trigger, Amazon Kinesis Data Streams, Amazon S3 bucket for raw data
• Compute operational metrics: Amazon Kinesis Data Analytics
• Deliver to real time dashboards and archival: Amazon Kinesis Data Streams, AWS Lambda function, Amazon DynamoDB Table(s), Chart.JS Dashboard, Amazon S3 bucket for processed data
Try It Out Yourself
Go to aws.amazon.com/kinesis/
Some good examples:
• A click-through template for AWS CloudTrail Event Log Analytics - https://tinyurl.com/RTInsights
• A click-through template for Real-Time Web Analytics with Kinesis Data Analytics - https://tinyurl.com/RTWebAnalytics
• Blog Posts on Kinesis - https://tinyurl.com/KinesisBlogs
Thank You

 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Real Time Data Ingestion & Analysis - AWS Summit Sydney 2018

  • 1. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Ganesh Raja Solutions Architect – Data & Analytics, Amazon Web Services Real Time Data Ingestion And Analysis
  • 2. Streams Are Everywhere • Most data is continuously produced as a stream • Processing data as it arrives is becoming very popular • Many diverse applications and use cases. Streaming Ingest-Transform-Load: deliver data to analytics tools faster and cheaper. Continuous Metric Generation: compute analytics as the data is generated. Actionable Insights: react to analytics based off of insights.
  • 3. It’s All About The Pace. Batch Processing: hourly server logs; weekly or monthly bills; daily web-site clickstream; daily fraud reports. Stream Processing: real time metrics; real time spending alerts/caps; real time clickstream analysis; real time detection.
  • 4. The Diminishing Value Of Data. Recent data is highly valuable • If you act on it in time • Perishable Insights (M. Gualtieri, Forrester). Old + recent data is even more valuable • If you have the means to combine them
  • 5. Simple Pattern For Streaming Data. Data Producer (example: mobile clients): continuously creates data; continuously writes data to a stream; can be almost anything. Streaming Service (example: Amazon Kinesis): durably stores data; provides a temporary buffer that prepares data; supports very high throughput. Data Consumer (example: Amazon Kinesis app): continuously processes data; cleans, prepares, & aggregates data; transforms data into information.
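
The producer, streaming service, and consumer roles on slide 5 can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: `send_event`, `consume`, and `FakeKinesisClient` are hypothetical names, and the fake client only stands in for a real `boto3.client("kinesis")` so the sketch runs without AWS credentials. The `put_record` keyword arguments (`StreamName`, `Data`, `PartitionKey`) match the boto3 Kinesis client.

```python
import json


def send_event(kinesis_client, stream_name, event, partition_key):
    """Producer: write one JSON-encoded event to a stream."""
    return kinesis_client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=partition_key,
    )


class FakeKinesisClient:
    """Hypothetical in-memory stand-in for boto3.client("kinesis")."""

    def __init__(self):
        self.records = []  # plays the role of the durable buffer

    def put_record(self, StreamName, Data, PartitionKey):
        self.records.append((StreamName, Data, PartitionKey))
        return {"SequenceNumber": str(len(self.records))}


def consume(client):
    """Consumer: decode every buffered record back into a dict."""
    return [json.loads(data) for _, data, _ in client.records]
```

Against a real stream, the same `send_event` function would be called with `boto3.client("kinesis")` in place of the fake, and the consumer would read from the stream rather than from a Python list.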
  • 6. Amazon Kinesis. Amazon Kinesis Data Streams: build custom applications that process and analyse streaming data. Amazon Kinesis Data Firehose: easily load streaming data into AWS.
  • 7. Amazon Kinesis Data Streams • Easy administration and low cost • Build real time applications with a framework of choice • Secure and durable storage
  • 8. Amazon Kinesis Data Firehose • Zero administration and seamless elasticity • Direct-to-data store integration • Serverless and continuous data transformations
  • 9. Anomaly Detection on AWS CloudTrail Logs
  • 10. Anomaly Detection
  • 11. Analysing CloudTrail Event Logs. Diagram labels: AWS CloudTrail; Amazon CloudWatch events trigger; Amazon S3 bucket for raw data; Ingest and deliver raw log data; Amazon Kinesis Data Streams; Deliver to a real time dashboard and archive; Compute operational metrics
  • 12. Ingest And Deliver AWS CloudTrail Events • AWS CloudTrail provides continuous account activity logging • Events are sent in near real time to Amazon Kinesis Data Firehose and Streams • Each event includes a timestamp, the AWS IAM user or AWS service name, API call, response and more. Diagram labels: Amazon CloudWatch events trigger; Amazon S3 bucket for raw data; Amazon Kinesis Data Streams
  • 13. Stream Data To Amazon Kinesis. Column headings: Automatic ingestion; Easy setup; Write your own. Sources shown: Amazon VPC Flow Logs; Elastic Load Balancing; Amazon RDS; Amazon CloudWatch Logs; AWS CloudTrail Event Logs; Amazon Pinpoint; Amazon API Gateway; AWS IoT events; AWS SDKs; Amazon DynamoDB; Amazon Kinesis Agent; Amazon Kinesis Producer Library. Captions: As a proxy; For change data capture. Just a sample: there are many more ways to stream data to Amazon Kinesis.
  • 14. Analysing CloudTrail Event Logs. Diagram labels: Amazon CloudWatch events trigger; Amazon S3 bucket for raw data; Ingest and deliver raw log data; Amazon Kinesis Data Streams; AWS CloudTrail; Deliver to a real time dashboard and archive; Amazon EMR Data Analytics; Compute operational metrics
  • 15. Compute Operational Metrics In Real Time. Compute metrics using SQL in real time like: • Total calls by IP, service, API call, AWS IAM user • Amazon S3 API failures (or any other service) • Anomalous behavior of Amazon S3 API (or any other service) • Top 10 API calls across all services. Diagram labels: Amazon EMR Data Analytics; Raw data; Real time analytics
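
Slide 15 computes its metrics with SQL. To make the aggregations concrete, the sketch below expresses three of them in plain Python over a batch of already-parsed CloudTrail events. It is an illustration of the same calculations, not the SQL from the talk. The field names (`eventSource`, `eventName`, `sourceIPAddress`, `errorCode`) are standard CloudTrail record fields. The function names and sample events are hypothetical.

```python
from collections import Counter


def total_calls_by(events, field):
    """Count events grouped by one CloudTrail field, e.g. sourceIPAddress."""
    return Counter(e.get(field, "unknown") for e in events)


def failures_for_service(events, event_source):
    """Return events for one service that carry an errorCode (failed calls)."""
    return [
        e for e in events
        if e.get("eventSource") == event_source and "errorCode" in e
    ]


def top_api_calls(events, n=10):
    """Return the n most frequent (service, API call) pairs."""
    calls = Counter((e.get("eventSource"), e.get("eventName")) for e in events)
    return calls.most_common(n)


# Hypothetical sample events, trimmed to the fields used above.
SAMPLE_EVENTS = [
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "sourceIPAddress": "203.0.113.10"},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "sourceIPAddress": "203.0.113.10", "errorCode": "AccessDenied"},
    {"eventSource": "ec2.amazonaws.com", "eventName": "DescribeInstances",
     "sourceIPAddress": "198.51.100.7"},
]
```

In a streaming setting these counts would be computed per time window rather than over a fixed list.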
  • 16. How Do We Aggregate Streaming Data? • A common requirement in streaming analytics is to perform set-based operations (count, average, max, min, ...) over events that arrive within a specified period of time • You cannot simply aggregate over an entire table as in a typical static database
  • 17. Windowing Concepts • Windows can be tumbling or sliding • Windows are of fixed length. [Diagram: events on a time axis from t1 to t6 are grouped into Window1, Window2 and Window3; an aggregate function (Sum) is applied per window and emits output events such as 18 and 14]
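
The tumbling-window aggregation from slide 17 can be sketched as follows. This is a minimal batch illustration, assuming events arrive as `(timestamp_seconds, value)` pairs with integer timestamps. The function name and the sample values are hypothetical and are not the numbers from the slide's diagram. A streaming engine would emit each sum as its window closes instead of returning them all at the end.

```python
from collections import defaultdict


def tumbling_window_sums(events, window_seconds):
    """Sum values per fixed-length, non-overlapping window.

    Each event falls into exactly one window, identified by the
    window's start time (timestamp rounded down to a multiple of
    window_seconds).
    """
    sums = defaultdict(int)
    for timestamp, value in events:
        window_start = (timestamp // window_seconds) * window_seconds
        sums[window_start] += value
    return sorted(sums.items())
```

For example, with 10-second windows the events `(0, 1), (3, 5), (9, 4)` share the window starting at 0, while `(10, 2)` opens the next one. A sliding window differs in that consecutive windows overlap, so one event can contribute to several outputs.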
  • 18. Analysing CloudTrail Event Logs. Diagram labels: AWS CloudTrail; Compute operational metrics; Amazon CloudWatch events trigger; Amazon S3 bucket for raw data; Ingest and deliver raw log data; Amazon Kinesis Data Streams; Amazon EMR Data Analytics
  • 19. Persist Data For Real Time Dashboards • Use Amazon Kinesis Data Firehose to archive processed data to Amazon S3 • Use AWS Lambda to deliver data to Amazon DynamoDB (or another database) • Use open source or other tools to visualise the data. Diagram labels: Real time analytics; AWS Lambda function; Amazon S3 bucket for processed data; Amazon DynamoDB Table(s); Redash Dashboard; Amazon Kinesis Data Stream
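
The Lambda step on slide 19 (Kinesis Data Stream to DynamoDB) might look roughly like the sketch below. It assumes the function is triggered by a Kinesis stream, in which case each record's payload arrives base64-encoded under `Records[i]["kinesis"]["data"]`, and that each payload is a JSON object. `make_handler` and `decode_kinesis_records` are hypothetical names. The `table` argument is assumed to be a boto3 DynamoDB `Table` resource (or anything with a compatible `put_item(Item=...)` method), injected so the logic can be exercised without AWS.

```python
import base64
import json
from decimal import Decimal


def decode_kinesis_records(event):
    """Decode the base64 JSON payload of every record in a Kinesis event."""
    return [
        # DynamoDB via boto3 rejects Python floats, so parse them as Decimal.
        json.loads(base64.b64decode(r["kinesis"]["data"]), parse_float=Decimal)
        for r in event["Records"]
    ]


def make_handler(table):
    """Build a Lambda-style handler that writes each record to `table`."""

    def handler(event, context):
        items = decode_kinesis_records(event)
        for item in items:
            table.put_item(Item=item)
        return {"written": len(items)}

    return handler
```

In a deployed function the handler would typically be created once at module load, for example with a table obtained from `boto3.resource("dynamodb").Table(...)`. The table name and key schema are left out here because the slides do not specify them.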
  • 20. Analysing CloudTrail Event Logs. Diagram labels: AWS CloudTrail; Amazon CloudWatch events trigger; Amazon S3 bucket for raw data; Ingest and deliver raw log data; Amazon Kinesis Data Streams; Amazon S3 bucket for processed data; AWS Lambda function; Amazon DynamoDB Table(s); Chart.JS Dashboard; Deliver to a real time dashboard and archive; Amazon Kinesis Data Streams; Compute operational metrics; Amazon EMR Data Analytics
  • 21. DEMO: Analyse AWS CloudTrail Logs using Amazon EMR
  • 22. Analysing CloudTrail Event Logs. Diagram labels: AWS CloudTrail; Amazon CloudWatch events trigger; Amazon S3 bucket for raw data; Ingest and deliver raw log data; Amazon Kinesis Data Streams; Amazon S3 bucket for processed data; AWS Lambda function; Amazon DynamoDB Table(s); Chart.JS Dashboard; Deliver to a real time dashboard and archive; Amazon Kinesis Data Streams; Compute operational metrics; Amazon EMR Data Analytics
  • 23. Invent And Simplify
  • 24. Amazon Kinesis. Amazon Kinesis Data Streams: build custom applications that process and analyse streaming data. Amazon Kinesis Data Firehose: easily load streaming data into AWS. Amazon Kinesis Data Analytics: easily process and analyse streaming data with standard SQL.
  • 25. Amazon Kinesis Data Analytics • Powerful real time applications • Easy to use, fully managed • Automatic elasticity
  • 26. Amazon Kinesis Data Analytics Applications. Connect to a streaming source; easily write SQL code to process streaming data; continuously deliver SQL results.
  • 27. Analysing AWS CloudTrail Event Logs. Diagram labels: AWS CloudTrail; Amazon CloudWatch events trigger; Amazon S3 bucket for raw data; Ingest and deliver raw log data; Amazon Kinesis Data Streams; Amazon S3 bucket for processed data; AWS Lambda function; Amazon DynamoDB Table(s); Chart.JS Dashboard; Deliver to real time dashboards and archival; Amazon Kinesis Data Streams; Compute operational metrics; Amazon EMR Data Analytics
  • 28. Analysing AWS CloudTrail Event Logs. Diagram labels: AWS CloudTrail; Compute operational metrics; Amazon CloudWatch events trigger; Amazon S3 bucket for raw data; Ingest and deliver raw log data; Amazon Kinesis Data Streams; Amazon S3 bucket for processed data; AWS Lambda function; Amazon DynamoDB Table(s); Redash Dashboard; Deliver to real time dashboards and archival; Amazon Kinesis Data Streams; Amazon Kinesis Data Analytics
  • 29. DEMO: Amazon Kinesis Data Analytics
  • 30. Analysing AWS CloudTrail Event Logs. Diagram labels: AWS CloudTrail; Amazon CloudWatch events trigger; Amazon S3 bucket for raw data; Ingest and deliver raw log data; Amazon Kinesis Data Streams; Amazon S3 bucket for processed data; AWS Lambda function; Amazon DynamoDB Table(s); Chart.JS Dashboard; Deliver to real time dashboards and archival; Amazon Kinesis Data Streams; Compute operational metrics; Amazon Kinesis Data Analytics
  • 31. Try It Out Yourself. Go to aws.amazon.com/kinesis/. Some good examples: • A click-through template for AWS CloudTrail Event Log Analytics: https://tinyurl.com/RTInsights • A click-through template for Real-Time Web Analytics with Kinesis Data Analytics: https://tinyurl.com/RTWebAnalytics • Blog posts on Kinesis: https://tinyurl.com/KinesisBlogs
  • 32. Thank You