Analytics at Amazon
Darin Briskman
Product Manager
AWS Database, Analytics, Machine Learning, & Blockchain
Briskman@amazon.com
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Traditionally, analytics looked like this
Relational data
GBs-TBs scale [not designed for PB/EBs]
Expensive: Large initial capex + $10K-$50K/TB/year
90% of data was thrown away because of cost
OLTP ERP CRM LOB
Data Warehouse
Business Intelligence
Our beliefs
1. The purpose of analytics is to help people make better decisions.
2. All data has value. No data should be thrown away.
3. Everyone should have access to all data (subject to access rules).
Data lakes on AWS
[Diagram: Snowball, Snowmobile, Kinesis Data Firehose, Kinesis Data Streams, Kinesis Video Streams, S3, Redshift, EMR, Athena, Kinesis, Elasticsearch Service, AI Services, QuickSight]
Exabyte scale
Store and analyze relational and non-relational data
Purpose-built analytics tools
Cost effective
• Store at 2.3 cents per GB-month in Amazon S3
• Query with Amazon Athena at ½ cent per GB scanned
• DW with Amazon Redshift for $1,000/TB/year
Give access to everyone
• Amazon QuickSight: $0.30 for 30 minutes of use
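The prices above can be turned into a rough annual comparison with the traditional figure of $10K-$50K/TB/year quoted earlier. The following is a minimal sketch that uses only the numbers on these slides; the function names and the choice of 1,024 GB per TB are assumptions made for illustration, and current AWS pricing may differ.

```python
from __future__ import annotations

# Prices as quoted on the slides (2019), in US dollars.
S3_PER_GB_MONTH = 0.023                          # "2.3 cents per GB-month"
REDSHIFT_PER_TB_YEAR = 1_000.0                   # "$1,000/TB/year"
TRADITIONAL_PER_TB_YEAR = (10_000.0, 50_000.0)   # "$10K-$50K/TB/year"

GB_PER_TB = 1024  # assumption: binary terabytes


def s3_cost_per_year(terabytes: float) -> float:
    """Annual S3 storage cost for a constant volume of data."""
    return terabytes * GB_PER_TB * S3_PER_GB_MONTH * 12


def redshift_cost_per_year(terabytes: float) -> float:
    """Annual Redshift cost at the per-terabyte rate on the slide."""
    return terabytes * REDSHIFT_PER_TB_YEAR


def traditional_cost_per_year(terabytes: float) -> tuple[float, float]:
    """Low and high annual estimates for a traditional warehouse."""
    low, high = TRADITIONAL_PER_TB_YEAR
    return terabytes * low, terabytes * high


if __name__ == "__main__":
    volume_tb = 100.0
    low, high = traditional_cost_per_year(volume_tb)
    print(f"S3 storage:  ${s3_cost_per_year(volume_tb):,.2f}/year")
    print(f"Redshift:    ${redshift_cost_per_year(volume_tb):,.2f}/year")
    print(f"Traditional: ${low:,.0f} to ${high:,.0f}/year")
```

Under these assumptions, one terabyte kept in S3 for a year costs roughly $283, which is why keeping all data becomes practical.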
The Flywheel
CHALLENGE
Need to create a constant feedback loop for designers. Gain an up-to-the-minute understanding of gamer satisfaction to guarantee gamers are engaged, resulting in the most popular game played in the world.
Fortnite | 125+ million players
Epic Games uses data lakes and analytics
Entire analytics platform running on AWS
Amazon S3 leveraged as a data lake
All telemetry data is collected with Amazon Kinesis
Real-time analytics done through Spark on Amazon EMR,
DynamoDB to create scoreboards and real-time queries
Use Amazon EMR for large batch data processing
Game designers use data to inform their decisions
[Architecture diagram. Sources: game clients, game servers, launcher, game services, APIs, databases, S3, and other sources, collected through Kinesis. Near-real-time pipelines: Spark on EMR and DynamoDB, feeding Grafana, the scoreboards API, limited raw data (real-time ad hoc SQL), and user ETL (metric definition). Batch pipelines: S3 (data lake), ETL using EMR, Tableau/BI, and ad hoc SQL.]
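To make the scoreboard step concrete, here is a minimal sketch of the aggregation involved, written in plain Python rather than Spark. The `MatchEvent` record, its field names, and `top_players` are hypothetical and are not taken from Epic Games' pipeline; in the architecture described here, the equivalent logic would run in Spark on EMR with results served from DynamoDB.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable, NamedTuple


class MatchEvent(NamedTuple):
    """Hypothetical telemetry record; the field names are illustrative."""
    player_id: str
    eliminations: int


def top_players(events: Iterable[MatchEvent], n: int = 3) -> list[tuple[str, int]]:
    """Sum eliminations per player and return the n highest totals."""
    totals: Counter = Counter()
    for event in events:
        totals[event.player_id] += event.eliminations
    # Sort by score descending, then by player id, so ties are deterministic.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))[:n]


if __name__ == "__main__":
    sample = [
        MatchEvent("ana", 3),
        MatchEvent("bo", 5),
        MatchEvent("ana", 4),
        MatchEvent("cy", 1),
    ]
    print(top_players(sample, n=2))
```

A streaming job would apply the same per-player sum over a time window instead of a finished list.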
CHALLENGE
Needed to analyze data to find insights, identify opportunities, and evaluate business performance. The Oracle DW did not scale, was difficult to maintain, and was costly.
SOLUTION
Deployed a data lake with Amazon S3, and ran analytics with Amazon Redshift, Amazon Redshift Spectrum, and Amazon EMR.
Result: They doubled the data stored (100 PB), lowered costs, and were able to gain insights faster.
50 PB of data
600,000 analytics jobs/day
Amazon Data Analytics
What is the Goal?
To provide an analytic ecosystem that scales with the Amazon business
To leverage AWS technologies and to help improve these technologies for all Amazon customers
To provide choice and options in new analytic technologies
Provide a SQL-based solution
Increasingly focus on enabling new analytic approaches, including machine learning and programmatic data analysis
Enable both “Bring Your Own Cluster” and “Bring Your Own Query” approaches
“Tools #2” by Juan Pablo Olmo. No alterations other than cropping. https://www.flickr.com/photos/juanpol/1562101472/
Image used with permissions under Creative Commons license 2.0, Attribution Generic License (https://creativecommons.org/licenses/by/2.0/)
[Diagram: Amazon S3, Amazon Kinesis, Amazon Kinesis Firehose, Amazon Kinesis Analytics, Amazon SQS, Amazon EMR (running Hive, Pig, Spark, Presto, etc.), Amazon Redshift, Amazon Athena, Amazon DynamoDB, Amazon RDS, Amazon Elasticsearch Service, Amazon Machine Learning, Amazon QuickSight, open-source tools (e.g. for ML, data science), and commercial tools]
Moving Forward - AWS
• S3 / EDX - Separate storage from compute by leveraging a parallel file system as a global data exchange
• Redshift - Preferred platform for SQL-based analysis and traditional data warehouse data
• Focus is “Business Users”
• EMR - Scalable “Do Everything” platform; enables teams who have chosen EMR by providing curated data
• Focus is “Programmatic Access”
The Amazon “Data Lake” – Project Name “Andes”
The Goal: “THE” Place for Data at Amazon
• Source teams (data producers) put their public data there to give access to analytic teams (data consumers) and to share private data within their team
• EMR can directly access the data in parallel from Andes
• Redshift can load the data in parallel from Andes, or it can directly access the data in parallel with Spectrum
Putting The Pieces Together
The Analytic Architecture of the Future
[Diagram: source systems; the data lake “Andes”; big data systems; data warehouses; “Bring Your Own Cluster” and “Bring Your Own Query”; services and users. Components shown: PostgreSQL instance, Amazon Redshift (three clusters), Amazon Kinesis, AWS Glue, Amazon QuickSight, Amazon Athena, Amazon Machine Learning]
Table Subscriptions - The Vision
Data Value Chain
Image credits: Icons from thenounproject.com: “Collect” icon by Ramesh; “Cloud Security” icon by Creative Stall; “Search” icon by Dinosoft Labs; “Shopping Cart” icon by Gregor Cresnar; “Cloud Upload Download” icon by naim; “Data science” icon by Becris
COLLECT STORE DELIVER ANALYZE SUBSCRIBE DISCOVER
AWS Lake Formation
Build a secure data lake in days
Move, store, catalog, and clean your data faster with machine learning
Enforce security policies across multiple services
Empower analysts and data scientists to gain and manage new insights
How it works
Data lakes and analytics on AWS
[Diagram: sources (OLTP, ERP, CRM, LOB, devices, web, sensors, social) and Kinesis; S3 with IAM and KMS; a data catalog; Athena, Redshift, AI Services, EMR, and QuickSight]
Build data lakes quickly
• Identify, crawl, and catalog sources
• Ingest and clean data
• Transform into optimal formats
Simplify security management
• Enforce encryption
• Define access policies
• Implement audit logging
Enable self-service and combined analytics
• Analysts discover all data available for analysis from a single data catalog
• Use multiple analytics tools over the same data
AWS Glue—Serverless Data catalog & ETL service
Data Catalog: Discover data and extract schema
ETL job authoring: Auto-generates customizable ETL code in Python and Spark
Automatically discovers data and stores schema
Data searchable, and available for ETL
Generates customizable code
Schedules and runs your ETL jobs
Serverless
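As a rough illustration of what "discovers data and stores schema" means, the sketch below guesses a column type for each field of a small CSV sample. This is not AWS Glue's crawler logic or API; `infer_type`, `infer_schema`, and the three Hive-style type names are assumptions chosen for the example.

```python
from __future__ import annotations

import csv
import io


def infer_type(values: list[str]) -> str:
    """Return the narrowest of bigint, double, string that fits every value."""
    for type_name, parse in (("bigint", int), ("double", float)):
        try:
            for value in values:
                parse(value)
            return type_name
        except ValueError:
            continue  # at least one value did not parse; try the next type
    return "string"


def infer_schema(csv_text: str) -> dict[str, str]:
    """Read a CSV sample with a header row and guess a type per column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    return {name: infer_type([row[name] for row in rows]) for name in columns}


if __name__ == "__main__":
    sample = "player_id,score,accuracy\nana,7,0.5\nbo,5,0.75\n"
    print(infer_schema(sample))
```

A catalog entry built this way is what lets several query engines read the same files without each one redefining the table.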
Amazon EMR
Analytics and ML at scale
19 open-source projects: Apache Hadoop, Spark, HBase, Presto, and more
Enterprise-grade security
Latest versions: Updated with the latest open source frameworks within 30 days of release
Use S3 storage: Process data directly in the S3 data lake securely with high performance using the EMRFS connector
Easy: Launch fully managed Hadoop & Spark in minutes; no cluster setup, node provisioning, cluster tuning
Low cost: Flexible billing with per-second billing, EC2 spot, reserved instances and auto-scaling to reduce costs 50–80%
Amazon Elasticsearch Service
Easy to deploy, secure, operate, and scale Elasticsearch
Customers use Elasticsearch for log analytics, full-text search & application monitoring
Easy to use: Fully managed; deploy production-ready clusters in minutes
Secure: Secure access with VPC to keep all traffic within AWS network
Available: Zone awareness replicates data between two AZs; automatically monitors & replaces failed nodes
Open: Direct access to Elasticsearch open-source APIs; supports Logstash and Kibana
Amazon Athena
Interactive query service to analyze data in Amazon S3 using standard SQL
No infrastructure to set up or manage and no data to load
Ability to run SQL queries on data archived in Amazon Glacier
Query instantly: Zero setup cost; just point to S3 and start querying
Open: ANSI SQL interface, JDBC/ODBC drivers, multiple formats, compression types, and complex joins and data types
Easy: Serverless: zero infrastructure, zero administration; integrated with QuickSight
Pay per query: Pay only for queries run; save 30–90% on per-query costs through compression
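Because Athena charges by data scanned, shrinking the files shrinks the bill in proportion. The sketch below works through that arithmetic using the half-cent-per-GB price quoted earlier in this deck; the function names and the `size_reduction` parameter are hypothetical, and actual savings depend on the data and file format.

```python
PRICE_PER_GB_SCANNED = 0.005  # "1/2 cent per GB scanned", from the pricing slide


def query_cost(gb_scanned: float) -> float:
    """Cost in US dollars of a query that scans the given volume."""
    return gb_scanned * PRICE_PER_GB_SCANNED


def compressed_query_cost(raw_gb: float, size_reduction: float) -> float:
    """Cost after compression removes a fraction of the bytes (0.7 means 70% smaller)."""
    if not 0.0 <= size_reduction < 1.0:
        raise ValueError("size_reduction must be in [0, 1)")
    return query_cost(raw_gb * (1.0 - size_reduction))


if __name__ == "__main__":
    raw_gb = 200.0
    print(f"uncompressed: ${query_cost(raw_gb):.2f}")
    for reduction in (0.3, 0.9):
        cost = compressed_query_cost(raw_gb, reduction)
        print(f"{reduction:.0%} smaller: ${cost:.2f}")
```

For a 200 GB scan this gives $1.00 uncompressed, falling to $0.70 and $0.10 at the two ends of the 30–90% range.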
Amazon QuickSight
First BI service with pay-per-session pricing for everyone in your organization
Serverless, cloud-powered BI service (no servers to manage)
Scale from 10s of users to 100s of thousands of users
Pay only for what you use
• Readers: $0.30/30 min session with a $5/user/month max
• Authors: $18/month/Author
Integrates with S3, Athena, Redshift, RDS, Aurora, & EMR
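The reader pricing above is a per-session charge with a monthly ceiling, which is easy to misread. Here is a minimal sketch of the calculation using only the prices on this slide; the function names are assumptions for illustration and current QuickSight pricing may differ.

```python
from __future__ import annotations

READER_SESSION_PRICE = 0.30   # per 30-minute session
READER_MONTHLY_CAP = 5.00     # maximum charge per reader per month
AUTHOR_MONTHLY_PRICE = 18.00  # flat fee per author per month


def reader_monthly_cost(sessions: int) -> float:
    """Charge for one reader: pay per session, never more than the cap."""
    if sessions < 0:
        raise ValueError("sessions must be non-negative")
    return min(sessions * READER_SESSION_PRICE, READER_MONTHLY_CAP)


def team_monthly_cost(reader_sessions: list[int], authors: int) -> float:
    """Total for a team, given each reader's session count and the number of authors."""
    readers = sum(reader_monthly_cost(count) for count in reader_sessions)
    return readers + authors * AUTHOR_MONTHLY_PRICE


if __name__ == "__main__":
    # Three readers with light, moderate, and heavy use, plus one author.
    print(f"${team_monthly_cost([2, 10, 40], authors=1):.2f}")
```

A reader reaches the $5 cap at the 17th session in a month, so occasional users cost cents while heavy users cost a fixed amount.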
Amazon QuickSight has been innovating quickly
100+ new features released since launch, including:
AWS Directory Service Microsoft AD, Custom Date Format, Dashboard Save As, Aggregate Calculations, Readers, Groups, Private VPC, 25 GB SPICE tables, Spark and Presto Connector, Scheduled refresh, Just In Time Provisioning, One-click upgrade, Search, Totals, Excel, Custom Range, Federated SSO, Athena connector, Export to CSV, S3 Analytics, Week Aggregation, Aurora PostgreSQL, Calculations in SPICE, Cross Account S3 Access, Aggregate Filters, Hourly refresh, Row level security, 10K Filter Values, On-screen controls, Redshift Spectrum Support, KPI Chart, Spark Connector, AWS Directory Service AD Connector, Tabular Reports, Data labels, URL Actions, Combo Charts, Audit logging with CloudTrail, Geospatial maps, Count Distinct, Parameters, Relative Date Filters, Filter Groups, Table calculations, Snowflake Connector, SaaS Connectors, Teradata Connector, HIPAA, PCI compliance
Amazon QuickSight—embedded dashboards
Supercharge your applications with embedded dashboards
Fully interactive with drill down, filtering, & external links
No servers to manage, no long-term commitments
Pay for usage with pay-per-session reader pricing
Easy embedding with JavaScript SDK
Embedded NFL Next Gen Stats Dashboards
“With the Amazon QuickSight Readers and pay-per-session pricing, we are able to extend these secure, customized and easy to use dashboards for each club without having to provision servers or manage infrastructure – all while only paying for actual usage.”
Matt Swensson, Vice President, Emerging Products and Technology
Real-time stats for NFL games
Embedded in NFL Next Gen Stats Portal
Shared with 100s of users across NFL, 32 clubs and broadcast partners
Amazon QuickSight is used by customers at the largest scale
One of the world’s largest metals and mining companies deployed Amazon QuickSight with its critical risk management (CRM) solution to ensure employee safety. Thousands of employees use its CRM globally.
Uses Amazon QuickSight embedded in its Converge Platform, a governance, risk, and compliance healthcare solution. Tens of thousands of users across 900 healthcare organizations use this platform.
Amazon.com is using Amazon QuickSight company-wide
Amazon QuickSight—ML Insights
Automated business insights powered by ML and natural language
ML-powered anomaly detection
ML-powered forecasting
Auto-narratives
Amazon QuickSight ML Insights
Example: anomaly detection
Discover all the hidden trends and anomalies on millions of metrics
“Sales for office supplies in APAC was 15% above expected.”
“SMB Segment was the top contributor.”
“It’s significant because SMB typically only accounts for 30% of sales.”
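The narrative in this example pairs a detection step with a plain-language sentence. The sketch below imitates that pairing with a simple z-score test, which is a stand-in chosen for illustration; the slides do not describe the model QuickSight uses, and `describe_anomaly` and its `threshold` parameter are hypothetical.

```python
from __future__ import annotations

from statistics import mean, stdev
from typing import Optional


def describe_anomaly(
    metric: str,
    history: list[float],
    latest: float,
    threshold: float = 3.0,
) -> Optional[str]:
    """Return a one-sentence narrative if latest is far from the historical mean."""
    expected = mean(history)
    spread = stdev(history)  # needs at least two historical points
    if expected == 0 or spread == 0:
        return None  # this sketch does not handle degenerate baselines
    z_score = (latest - expected) / spread
    if abs(z_score) < threshold:
        return None  # within the normal range, nothing to report
    percent = (latest - expected) / expected * 100
    direction = "above" if percent > 0 else "below"
    return f"{metric} was {abs(percent):.0f}% {direction} expected."


if __name__ == "__main__":
    past_sales = [100, 102, 98, 101, 99]
    print(describe_anomaly("Sales for office supplies in APAC", past_sales, 115))
```

With the sample history, a latest value of 115 sits far outside the usual spread and produces the same sentence shown on the slide.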
QuickSight ML Insights vs. traditional BI forecasting
QuickSight ML-powered forecasting: Captures seasonality and upward trends; automatically excludes bad data; high confidence band
Traditional BI forecasting: Captures only seasonality; missing upward trend; confidence band influenced by bad data
Amazon QuickSight ML Insights
Auto-narratives
Insights in plain language narrative
Embedded within your dashboard
No more staring at dashboards for hours!
Fully customizable to meet every need
No coding needed. Easy-to-use UI templates.

Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 

Dernier (20)

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Value of Data Beyond Analytics by Darin Briskman

  • 1. Analytics at Amazon. Darin Briskman, Product Manager, AWS Database, Analytics, Machine Learning, & Blockchain. Briskman@amazon.com
  • 2. Traditionally, analytics looked like this: relational data; GBs-TBs scale (not designed for PB/EBs); expensive (large initial capex + $10K-$50K/TB/year); 90% of data was thrown away because of cost. Diagram labels: OLTP, ERP, CRM, LOB; Data Warehouse; Business Intelligence.
  • 3. Our beliefs: 1. The purpose of analytics is to help people make better decisions. 2. All data has value. No data should be thrown away. 3. Everyone should have access to all data (subject to access rules).
  • 4. Data lakes on AWS. Services shown: Snowball, Snowmobile, Kinesis Data Firehose, Kinesis Data Streams, Kinesis Video Streams, Kinesis, S3, Redshift, EMR, Athena, Elasticsearch Service, AI Services, QuickSight. Exabyte scale. Store and analyze relational and non-relational data. Purpose-built analytics tools. Cost effective: • Store at 2.3 cents per GB-month in Amazon S3 • Query with Amazon Athena at ½ cent per GB scanned • DW with Amazon Redshift for $1,000/TB/year. Give access to everyone: • Amazon QuickSight: $0.30 for 30 minutes of use
  • 5. The Flywheel
  • 6. CHALLENGE: Need to create a constant feedback loop for designers. Gain up-to-the-minute understanding of gamer satisfaction to guarantee gamers are engaged, resulting in the most popular game played in the world. Fortnite | 125+ million players
  • 7. Epic Games uses data lakes and analytics. Entire analytics platform running on AWS. Amazon S3 leveraged as a data lake. All telemetry data is collected with Amazon Kinesis. Real-time analytics done through Spark on Amazon EMR, DynamoDB to create scoreboards and real-time queries. Use Amazon EMR for large batch data processing. Game designers use data to inform their decisions. Diagram labels: game clients, game servers, launcher, game services; Kinesis; APIs, databases, S3, other sources; NEAR REAL-TIME PIPELINES: Spark on EMR, DynamoDB, Grafana, scoreboards API, limited raw data (real-time ad-hoc SQL), user ETL (metric definition); BATCH PIPELINES: S3 (data lake), ETL using EMR, Tableau/BI, ad-hoc SQL.
  • 8. CHALLENGE: Needed to analyze data to find insights, identify opportunities, and evaluate business performance. The Oracle DW did not scale, was difficult to maintain, and was costly. SOLUTION: Deployed a data lake with Amazon S3, and ran analytics with Amazon Redshift, Amazon Redshift Spectrum, and Amazon EMR. Result: They doubled the data stored (100 PB), lowered costs, and were able to gain insights faster. 50 PB of data; 600,000 analytics jobs/day
  • 9. Amazon Data Analytics
  • 10. What is the goal? To provide an analytic ecosystem that scales with the Amazon business. To leverage AWS technologies and to help improve these technologies for all Amazon customers. To provide choice and options in new analytic technologies. Provide a SQL-based solution. Increasingly focus on enabling new analytic approaches, including machine learning and programmatic data analysis. Enable both “Bring Your Own Cluster” and “Bring Your Own Query” approaches.
  • 11. “Tools #2” by Juan Pablo Olmo. No alterations other than cropping. https://www.flickr.com/photos/juanpol/1562101472/ Image used with permission under Creative Commons license 2.0, Attribution Generic License (https://creativecommons.org/licenses/by/2.0/)
  • 12. Amazon EMR (running Hive, Pig, Spark, Presto, etc.); Amazon DynamoDB; Amazon Machine Learning; Amazon QuickSight; Amazon RDS; Amazon Elasticsearch Service; Amazon Redshift; Amazon Athena; Amazon SQS; Amazon Kinesis Analytics; Amazon Kinesis Firehose; Amazon S3; Amazon Kinesis; open-source tools (e.g. for ML, data science); commercial tools
  • 13. Moving Forward • AWS S3 / EDX: Separate storage from compute by leveraging a parallel file system as a global data exchange • Redshift: Preferred platform for SQL-based analysis and traditional data warehouse data; focus is “Business Users” • EMR: Scalable “Do Everything” platform; enable teams who have chosen EMR by providing curated data; focus is “Programmatic Access”
  • 14. The Amazon “Data Lake”, project name “Andes”. The goal: “THE” place for data at Amazon • Source teams (data producers) put their public data there to give access to analytic teams (data consumers) and to share private data within their team • EMR can directly access the data in parallel from Andes • Redshift can load the data in parallel from Andes, or it can directly access the data in parallel with Spectrum
  • 15. Putting the Pieces Together: The Analytic Architecture of the Future. Diagram labels: Source Systems; The Data Lake “Andes”; Big Data Systems; Data Warehouses; “Bring Your Own Cluster” and “Bring Your Own Query”; Services and Users. Services shown: PostgreSQL instance, Amazon Redshift, Amazon Kinesis, AWS Glue, Amazon QuickSight, Amazon Athena, Amazon Machine Learning.
  • 16. Table Subscriptions - The Vision
  • 17. Data Value Chain. Stages shown: COLLECT, STORE, DELIVER, ANALYZE, SUBSCRIBE, DISCOVER. Image credits: Icons from thenounproject.com: “Collect” icon by Ramesh; “Cloud Security” icon by Creative Stall; “Search” icon by Dinosoft Labs; “Shopping Cart” icon by Gregor Cresnar; “Cloud Upload Download” icon by naim; “Data science” icon by Becris
  • 18. AWS Lake Formation: Build a secure data lake in days. Move, store, catalog, and clean your data faster with machine learning. Enforce security policies across multiple services. Empower analysts and data scientists to gain and manage new insights.
  • 19. How it works: Data lakes and analytics on AWS. Diagram labels: OLTP, ERP, CRM, LOB, Devices, Web, Sensors, Social, Kinesis; S3, IAM, KMS; Athena, Redshift, AI Services, EMR, QuickSight, Data catalog. Build data lakes quickly • Identify, crawl, and catalog sources • Ingest and clean data • Transform into optimal formats. Simplify security management • Enforce encryption • Define access policies • Implement audit logging. Enable self-service and combined analytics • Analysts discover all data available for analysis from a single data catalog • Use multiple analytics tools over the same data
  • 20. How it works
  • 21. AWS Glue: serverless data catalog & ETL service. Data Catalog: discover data and extract schema; automatically discovers data and stores schema; data is searchable and available for ETL. ETL job authoring: auto-generates customizable ETL code in Python and Spark. Serverless: schedules and runs your ETL jobs.
  • 22. Amazon EMR: Analytics and ML at scale. 19 open-source projects: Apache Hadoop, Spark, HBase, Presto, and more. Enterprise-grade security. Latest versions: updated with the latest open source frameworks within 30 days of release. Use S3 storage: process data directly in the S3 data lake securely with high performance using the EMRFS connector. Easy: launch fully managed Hadoop & Spark in minutes; no cluster setup, node provisioning, cluster tuning. Low cost: flexible billing with per-second billing, EC2 spot, reserved instances and auto-scaling to reduce costs 50–80%.
  • 23. Amazon Elasticsearch Service: Easy to deploy, secure, operate, and scale Elasticsearch. Customers use Elasticsearch for log analytics, full-text search & application monitoring. Easy to use: fully managed; deploy production-ready clusters in minutes. Secure: secure access with VPC to keep all traffic within AWS network. Available: zone awareness replicates data between two AZs; automatically monitors & replaces failed nodes. Open: direct access to Elasticsearch open-source APIs; supports Logstash and Kibana.
  • 24. Amazon Athena: Interactive query service to analyze data in Amazon S3 using standard SQL. No infrastructure to set up or manage and no data to load. Ability to run SQL queries on data archived in Amazon Glacier. Query instantly: zero setup cost; just point to S3 and start querying. Open: ANSI SQL interface, JDBC/ODBC drivers, multiple formats, compression types, and complex joins and data types. Easy: serverless (zero infrastructure, zero administration); integrated with QuickSight. Pay per query: pay only for queries run; save 30–90% on per-query costs through compression.
  • 25. Amazon QuickSight: First BI service with pay-per-session pricing for everyone in your organization. Serverless, cloud-powered BI service (no servers to manage). Scale from 10s of users to 100s of thousands of users. Pay only for what you use • Readers: $0.30/30 min session with a $5/user/month max • Authors: $18/month/Author. Integrates with S3, Athena, Redshift, RDS, Aurora, & EMR
  • 26. Amazon QuickSight has been innovating quickly: 100+ new features released since launch. Features shown: AWS Directory Service Microsoft AD Custom Date Format Dashboard Save As Aggregate Calculations Readers Groups Private VPC 25 GB SPICE tables Spark and Presto Connector Scheduled refresh Just In Time Provisioning One-click upgrade Search Totals Excel Custom Range Federated SSO Athena connector Export to CSV S3 Analytics Week Aggregation Aurora PostgreSQL Calculations in SPICE Cross Account S3 Access Aggregate Filters Hourly refresh Row level security 10K Filter Values On-screen controls Redshift Spectrum Support KPI Chart Spark Connector AWS Directory Service AD Connector Tabular Reports Data labels URL Actions Combo Charts Audit logging with CloudTrail Geospatial maps Count Distinct Parameters Relative Date Filters Filter Groups Table calculations Snowflake Connector SaaS Connectors Teradata Connector HIPAA PCI compliance
  • 27. Amazon QuickSight: embedded dashboards. Supercharge your applications with embedded dashboards. Fully interactive with drill down, filtering, & external links. No servers to manage, no long-term commitments. Pay for usage with pay-per-session reader pricing. Easy embedding with JavaScript SDK.
  • 28. Embedded NFL Next Gen Stats Dashboards. “With the Amazon QuickSight Readers and pay-per-session pricing, we are able to extend these secure, customized and easy to use dashboards for each club without having to provision servers or manage infrastructure – all while only paying for actual usage.” Matt Swensson, Vice President, Emerging Products and Technology. Real-time stats for NFL games. Embedded in NFL Next Gen Stats Portal. Shared with 100s of users across NFL, 32 clubs and broadcast partners.
  • 29.
  • 30. Amazon QuickSight is used by customers at the largest scale. One of the world’s largest metals and mining companies deployed Amazon QuickSight with its critical risk management (CRM) solution to ensure employee safety. Thousands of employees use its CRM globally. Uses Amazon QuickSight embedded in its Converge Platform, a governance, risk, and compliance healthcare solution. Tens of thousands of users across 900 healthcare organizations use this platform. Amazon.com is using Amazon QuickSight company-wide.
  • 31. Amazon QuickSight ML Insights: Automated business insights powered by ML and natural language. ML-powered anomaly detection. ML-powered forecasting. Auto-narratives.
  • 32. Amazon QuickSight ML Insights, example: anomaly detection. Discover all the hidden trends and anomalies on millions of metrics.
  • 33. Amazon QuickSight ML Insights, example: anomaly detection. “Sales for office supplies in APAC was 15% above expected.”
  • 34. Amazon QuickSight ML Insights, example: anomaly detection. “SMB Segment was the top contributor.”
  • 35. Amazon QuickSight ML Insights, example: anomaly detection. “It’s significant because SMB typically only accounts for 30% of sales.”
  • 36. QuickSight ML Insights vs. traditional BI forecasting. QuickSight ML-powered forecasting: captures seasonality and upward trends; automatically excludes bad data; high confidence band. Traditional BI forecasting: captures only seasonality; missing upward trend; confidence band influenced by bad data.
  • 37. Amazon QuickSight ML Insights: Auto-narratives. Insights in plain language narrative. Embedded within your dashboard. No more staring at dashboards for hours! Fully customizable to meet every need. No coding needed. Easy-to-use UI templates.
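
The list prices quoted on slides 2, 4, and 25 lend themselves to a quick back-of-the-envelope comparison. The sketch below is only an illustration: the function names are hypothetical, the prices are the 2019 figures from the deck (current AWS pricing may differ), a terabyte is assumed to be 1,000 GB, and the compression ratio passed to the Athena helper is an assumed input rather than a figure from the slides.

```python
# Back-of-the-envelope cost sketch using the prices quoted in the slides (2019).
# Function names are illustrative only; current AWS pricing may differ.

S3_USD_PER_GB_MONTH = 0.023          # slide 4: "2.3 cents per GB-month"
ATHENA_USD_PER_GB_SCANNED = 0.005    # slide 4: "1/2 cent per GB scanned"
REDSHIFT_USD_PER_TB_YEAR = 1_000.0   # slide 4: "$1,000/TB/year"
TRADITIONAL_DW_USD_PER_TB_YEAR = (10_000.0, 50_000.0)  # slide 2: "$10K-$50K/TB/year"
QS_READER_USD_PER_SESSION = 0.30     # slide 25: per 30-minute reader session
QS_READER_MONTHLY_CAP_USD = 5.00     # slide 25: "$5/user/month max"

GB_PER_TB = 1_000  # assumption: decimal terabytes


def s3_usd_per_tb_year() -> float:
    """Annual S3 storage cost for one TB at the quoted GB-month rate."""
    return S3_USD_PER_GB_MONTH * GB_PER_TB * 12


def athena_query_usd(raw_gb: float, compression_ratio: float = 1.0) -> float:
    """Cost of one full scan; compression reduces the bytes actually scanned."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return ATHENA_USD_PER_GB_SCANNED * raw_gb / compression_ratio


def quicksight_reader_usd_per_month(sessions: int) -> float:
    """Pay-per-session reader charge, limited by the monthly cap."""
    if sessions < 0:
        raise ValueError("sessions must be non-negative")
    return min(sessions * QS_READER_USD_PER_SESSION, QS_READER_MONTHLY_CAP_USD)


def traditional_vs_redshift_factor() -> tuple[float, float]:
    """How many times more the quoted traditional DW range costs per TB-year."""
    low, high = TRADITIONAL_DW_USD_PER_TB_YEAR
    return low / REDSHIFT_USD_PER_TB_YEAR, high / REDSHIFT_USD_PER_TB_YEAR


if __name__ == "__main__":
    print("S3 per TB-year (USD):", round(s3_usd_per_tb_year(), 2))
    print("Athena, 1 TB raw, uncompressed (USD):", round(athena_query_usd(1000), 2))
    print("Athena, 1 TB raw, 4x compressed (USD):", round(athena_query_usd(1000, 4), 2))
    print("QuickSight reader, 10 sessions (USD):", round(quicksight_reader_usd_per_month(10), 2))
    print("Traditional DW vs Redshift factor:", traditional_vs_redshift_factor())
```

At these rates a reader reaches the monthly cap on the 17th session, and a 4x compression ratio cuts the scan cost by 75%, which falls inside the 30–90% savings range the Athena slide cites.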