SlideShare une entreprise Scribd logo
1  sur  34
Télécharger pour lire hors ligne
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Principal Technical Evangelist, AWS
@julsimon
Julien Simon
A Data Journey with AWS
What does
“Digital”
really mean?
Turning
Data
into
Business Value
…through
Software
Jeff Immelt, GE Chairman & CEO
“If you went to bed last night as an industrial company, you’re going
to wake up this morning as a software and analytics company.”
Navigating the seven seas of Big Data
You
Your boss
Your infrastructure
Over 1 Million Active
Customers
“Active customer” is defined as a non-Amazon customer with AWS account usage activity in the past month, including the free tier
2008 2009 2010 2011 2012 2013 20152014
Big Data: let’s keep it simple!
Collect Store Analyze Consume
Time to Answer (Latency)
Throughput
Cost
Collect Store Analyze Consume
A
iOS
 Android
Web Apps
Logstash
Amazon
RDS
Amazon
DynamoDB
Amazon
ES
Amazon

S3
Apache
Kafka
Amazon

Glacier
Amazon

Kinesis
Amazon

DynamoDB
Amazon
Redshift
Impala
Pig
Amazon ML
Streaming
Amazon

Kinesis
AWS
Lambda
AmazonElasticMapReduce
Amazon
ElastiCache
SearchSQLNoSQLCache
StreamProcessing
Batch
Interactive
Logging
StreamStorage
IoT
Applications
FileStorage
Analysis&Visualization
Hot
Cold
Warm
Hot
Slow
Hot
ML
Fast
Fast
Amazon
QuickSight
Transactional Data
File Data
Stream Data
Notebooks
Predictions
Apps & APIs
Mobile
Apps
IDE
Search Data
ETL
DATA WAREHOUSING
understanding the past
Nasdaq
In 2014, Nasdaq replaced the existing data warehouses
for its US equities and options exchanges with Amazon
Redshift.
On a nightly basis, Nasdaq loads approximately 5 billion
rows of data into Redshift within a 4-6 hour window.
Amazon Redshift now powers a number of data analytics
applications at Nasdaq, including its billing system for US
customers. In 2015, Nasdaq is expanding its use of
Redshift to its global exchange properties.
https://aws.amazon.com/solutions/case-studies/nasdaq-finqloud/
MAP REDUCE
processing huge amounts of data
for fun and (hopefully) profit
FINRA
FINRA, the primary regulatory agency for broker-dealers in
the US, uses AWS extensively in their IT operations and has
migrated key portions of its technology stack to AWS including
Market Surveillance and Member Regulation.
For market surveillance, each night FINRA loads
approximately 35 billion rows of data into Amazon S3 and
Amazon EMR (up to 10,000 nodes) to monitor trading activity
on exchanges and market centers in the US.
https://aws.amazon.com/solutions/case-studies/finra/
REAL-TIME ANALYTICS
understanding the present
Hearst
Hearst Corporation is one of the largest diversified
communications company in the world.
The company began migrating 10 of its 29 global data
centers to AWS to reduce its IT infrastructure footprint.
Hearst Corporation monitors trending content on 250+
sites worldwide. To facilitate this, Hearst built a
clickstream analytics platform on AWS that transmits and
processes over 30 TB of data a day using AWS
resources.
https://aws.amazon.com/solutions/case-studies/hearst-data-anayltics/
LARGE-SCALE
SIMULATIONS
looking at possible futures
Kellogg Company
Kellogg needed a solution that could accommodate
terabytes of data, scale according to infrastructure
needs, and stay within its budget.
Amazon Web Services (AWS) offered a fully SAP-
certified HANA environment on a public cloud platform.
Kellogg decided to start immediately with test and
development environments for its US operations.
These Amazon EC2 instances process 16 TB of sales
data weekly from promotions in the US, modeling dozens
of data simulations a day.
https://aws.amazon.com/solutions/case-studies/kellogg-company/
AON Benfield
Aon Benfield is the world's leading reinsurance intermediary
and full-service capital advisor.
Aon Benfield Analytics offers industry-leading catastrophe
management, actuarial, rating agency advisory and risk and
capital strategy expertise.
By using AWS GPU instances, Aon Benfield is able to perform
actuarial calculations with greater computing power, in shorter
time frames, and for less cost than on-premise deployments
and CPU cores: “Using AWS helps us reduce a 10-day
process to 10 minutes”
https://aws.amazon.com/solutions/case-studies/aon/
MACHINE LEARNING
using the past to predict the future
Amazon.com
Amazon is a leader in practical machine learning solutions and uses it
in hundreds of services across its various businesses.
Amazon uses Machine Learning for Customer Support, building
models based on recent orders, click-stream, user devices, prime
membership usage, recent cases, recent account changes, etc.
The models are used to provide efficient self-service to our
customers.
Amazon.com
Not all customer interactions can be solved in a self-service mode.
Therefore, Amazon operates large customer support centers where
Customer Service Representatives (CSR) handle customer requests.
The machine learning models described above are used to optimize the
human interactions of these requests.
For example, they are used to route the customer call to the best CSR
before the customer has even started to speak! They are also used again
during the call.
ICAO
The International Civil Aviation Organization (ICAO) is a specialized agency
of the United Nations. ICAO collects accident and incident data every day,
including information about the location, aircraft, operator, flight phase, as
well as narratives around these data points.
Amazon Machine Learning is used to determine the classification and risk
categorization of each event based on patterns contained in the flight phase
information and the narratives.
Amazon ML ingests the raw data and the narratives every day to provide
ICAO a daily list of accidents categorized by risk. This allows ICAO to
provide updated accidents statistics to its customers on a daily basis.
Fraud.net
“We considered five other
platforms, but Amazon Machine
Learning was the best solution.
Amazon keeps the effort and
resources required to build a
model to a minimum.
Using Amazon Machine
Learning, we've quickly created
and trained a number of specific,
targeted models, rather than
building a single algorithm to try
and capture all the different forms
of fraud.”
Whitney Anderson,
CEO
http://aws.amazon.com/fr/solutions/case-studies/fraud-dot-net/
DEEP LEARNING
Finding complex patterns
Amazon.com
Amazon uses product recommendation in many services.
Amazon DSSTNE (aka ‘Destiny’)
•  Deep Scalable Sparse Tensor Network Engine
•  Open source software library for deep neural networks using GPUs :
https://github.com/amznlabs/amazon-dsstne
•  Used by Amazon.com for product recommendation
•  Multi-GPU scale for training and prediction
•  Large Layers: larger networks than are possible with a single GPU
•  Sparse Data: optimized for fast performance on sparse datasets
•  Can run locally, in a Docker container or on AWS (multi-instance possible)
Amazon Destiny vs Google TensorFlow
“DSSTNE on a single virtualized K520 GPU (released in 2012) is
faster than TensorFlow on a bare metal Tesla M40 (released in
2015)”
“TensorFlow does not provide the automagic model parallelism
provided by DSSTNE”
https://medium.com/@scottlegrand/first-dsstne-benchmarks-tldr-almost-15x-faster-than-tensorflow-393dbeb80c0f
Demo: Amazon Destiny
http://grouplens.org/datasets/movielens/
27,000 movies
138,000 users
20 million movie recommendations
(Matrix is 99.5% sparse)
		 Alice	 Bob	 Charlie	 David	 Ernest	
Star	Wars	 1	 1	 1	
Lord	of	the	Rings	 1	 1	 1	
Incep>on	 1	
Bambi	 1	 1	
PreAy	Woman	 1	 1	
		 Alice	 Bob	 Charlie	 David	 Ernest	
Star	Wars	 0.23	 1	 0.12	 1	 1	
Lord	of	the	Rings	 1	 0.34	 1	 0.89	 1	
Incep>on	 0.8	 1	 0.43	 0.76	 0.45	
Bambi	 1	 0.42	 0.5	 1	 0.34	
PreAy	Woman	 1	 0.09	 0.67	 0.04	 1	
à Start a g2.8xlarge instance
4 K520 GPUs: 6144 CUDA cores
à Train a neural network
Input & output layers: 27,000 neurons
3 hidden layers, 128 neurons each
à Recommend 10 movies per user
Training Amazon Destiny on multiple GPUs
Resources
Big Data Architectural Patterns and Best Practices on AWS
https://www.youtube.com/watch?v=K7o5OlRLtvU
Real-World Smart Applications With Amazon Machine Learning
https://www.youtube.com/watch?v=sHJx1KJf8p0
Deep Learning: Going Beyond Machine Learning
https://www.youtube.com/watch?v=Ra6m70d3t0o
AWS Big Data blog: https://blogs.aws.amazon.com/bigdata/
à “Generating Recommendations at Amazon Scale with Apache Spark
and Amazon DSSTNE”
AWS User Groups
Lille
Paris
Rennes
Nantes
Bordeaux
Lyon
Montpellier
Toulouse
facebook.com/groups/AWSFrance/
@aws_actus
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Thank you!
Principal Technical Evangelist, AWS
@julsimon
Julien Simon

Contenu connexe

Tendances

Tendances (20)

Building Your First Big Data Application on AWS
Building Your First Big Data Application on AWSBuilding Your First Big Data Application on AWS
Building Your First Big Data Application on AWS
 
Gaining Operational Insights out of Your Logs
Gaining Operational Insights out of Your LogsGaining Operational Insights out of Your Logs
Gaining Operational Insights out of Your Logs
 
Get the Most Bang for your Buck with #EC2 #Winning
Get the Most Bang for your Buck with #EC2 #WinningGet the Most Bang for your Buck with #EC2 #Winning
Get the Most Bang for your Buck with #EC2 #Winning
 
使用 Amazon Athena 直接分析儲存於 S3 的巨量資料
使用 Amazon Athena 直接分析儲存於 S3 的巨量資料使用 Amazon Athena 直接分析儲存於 S3 的巨量資料
使用 Amazon Athena 直接分析儲存於 S3 的巨量資料
 
AWS CloudFormation (February 2016)
AWS CloudFormation (February 2016)AWS CloudFormation (February 2016)
AWS CloudFormation (February 2016)
 
Trustpilot
TrustpilotTrustpilot
Trustpilot
 
Scale, baby, scale! (June 2016)
Scale, baby, scale! (June 2016)Scale, baby, scale! (June 2016)
Scale, baby, scale! (June 2016)
 
AWS re:Invent 2016 : announcement, technical demos and feedbacks
AWS re:Invent 2016 : announcement, technical demos and feedbacksAWS re:Invent 2016 : announcement, technical demos and feedbacks
AWS re:Invent 2016 : announcement, technical demos and feedbacks
 
Using Amazon Cloudwatch Events, AWS Lambda and Spark Streaming to Process EC2...
Using Amazon Cloudwatch Events, AWS Lambda and Spark Streaming to Process EC2...Using Amazon Cloudwatch Events, AWS Lambda and Spark Streaming to Process EC2...
Using Amazon Cloudwatch Events, AWS Lambda and Spark Streaming to Process EC2...
 
Understanding The Benefits Of Amazon EC2
Understanding The Benefits Of Amazon EC2Understanding The Benefits Of Amazon EC2
Understanding The Benefits Of Amazon EC2
 
How to run your Hadoop Cluster in 10 minutes
How to run your Hadoop Cluster in 10 minutesHow to run your Hadoop Cluster in 10 minutes
How to run your Hadoop Cluster in 10 minutes
 
Announcing AWS Batch - Run Batch Jobs At Scale - December 2016 Monthly Webina...
Announcing AWS Batch - Run Batch Jobs At Scale - December 2016 Monthly Webina...Announcing AWS Batch - Run Batch Jobs At Scale - December 2016 Monthly Webina...
Announcing AWS Batch - Run Batch Jobs At Scale - December 2016 Monthly Webina...
 
Amazon Web Services and Docker: from developing to production
Amazon Web Services and Docker: from developing to productionAmazon Web Services and Docker: from developing to production
Amazon Web Services and Docker: from developing to production
 
CI&CD with AWS - AWS Prague User Group - May 2015
CI&CD with AWS - AWS Prague User Group - May 2015CI&CD with AWS - AWS Prague User Group - May 2015
CI&CD with AWS - AWS Prague User Group - May 2015
 
Introduction to AWS Batch
Introduction to AWS BatchIntroduction to AWS Batch
Introduction to AWS Batch
 
AWS September Webinar Series - Running Microservices with Amazon EC2 Contain...
AWS September Webinar Series -  Running Microservices with Amazon EC2 Contain...AWS September Webinar Series -  Running Microservices with Amazon EC2 Contain...
AWS September Webinar Series - Running Microservices with Amazon EC2 Contain...
 
AWS CloudFormation Best Practices
AWS CloudFormation Best PracticesAWS CloudFormation Best Practices
AWS CloudFormation Best Practices
 
Introduction to Amazon Lightsail
Introduction to Amazon LightsailIntroduction to Amazon Lightsail
Introduction to Amazon Lightsail
 
AWS re:Invent 2016: Busting the Myth of Vendor Lock-In: How D2L Embraced the...
AWS re:Invent 2016: Busting the Myth of Vendor Lock-In:  How D2L Embraced the...AWS re:Invent 2016: Busting the Myth of Vendor Lock-In:  How D2L Embraced the...
AWS re:Invent 2016: Busting the Myth of Vendor Lock-In: How D2L Embraced the...
 
Introduction to Amazon EC2
Introduction to Amazon EC2Introduction to Amazon EC2
Introduction to Amazon EC2
 

En vedette

AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...
AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...
AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...
Amazon Web Services
 

En vedette (6)

FinQLOUD platform for digital banking
FinQLOUD platform for digital bankingFinQLOUD platform for digital banking
FinQLOUD platform for digital banking
 
AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...
AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...
AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...
 
AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)
AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)
AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)
 
AWS re:Invent 2016: Learn How FINRA Aligns Billions of Time Ordered Events wi...
AWS re:Invent 2016: Learn How FINRA Aligns Billions of Time Ordered Events wi...AWS re:Invent 2016: Learn How FINRA Aligns Billions of Time Ordered Events wi...
AWS re:Invent 2016: Learn How FINRA Aligns Billions of Time Ordered Events wi...
 
Big Data and Business Insight
Big Data and Business InsightBig Data and Business Insight
Big Data and Business Insight
 
AWS re:Invent 2016: Securing Enterprise Big Data Workloads on AWS (SEC308)
AWS re:Invent 2016: Securing Enterprise Big Data Workloads on AWS (SEC308)AWS re:Invent 2016: Securing Enterprise Big Data Workloads on AWS (SEC308)
AWS re:Invent 2016: Securing Enterprise Big Data Workloads on AWS (SEC308)
 

Similaire à A Data Journey With AWS

Big data on_aws in korea by abhishek sinha (lunch and learn)
Big data on_aws in korea by abhishek sinha (lunch and learn)Big data on_aws in korea by abhishek sinha (lunch and learn)
Big data on_aws in korea by abhishek sinha (lunch and learn)
Amazon Web Services Korea
 

Similaire à A Data Journey With AWS (20)

AWS Big Data combo
AWS Big Data comboAWS Big Data combo
AWS Big Data combo
 
Moving forward with AI
Moving forward with AIMoving forward with AI
Moving forward with AI
 
Building prediction models with Amazon Redshift and Amazon Machine Learning -...
Building prediction models with Amazon Redshift and Amazon Machine Learning -...Building prediction models with Amazon Redshift and Amazon Machine Learning -...
Building prediction models with Amazon Redshift and Amazon Machine Learning -...
 
Machine Learning for everyone
Machine Learning for everyoneMachine Learning for everyone
Machine Learning for everyone
 
Big data on_aws in korea by abhishek sinha (lunch and learn)
Big data on_aws in korea by abhishek sinha (lunch and learn)Big data on_aws in korea by abhishek sinha (lunch and learn)
Big data on_aws in korea by abhishek sinha (lunch and learn)
 
[Jun AWS 101] Running Lean on AWS
[Jun AWS 101] Running Lean on AWS[Jun AWS 101] Running Lean on AWS
[Jun AWS 101] Running Lean on AWS
 
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
Best Practices for Distributed Machine Learning and Predictive Analytics Usin...
 
Log Analytics with AWS
Log Analytics with AWSLog Analytics with AWS
Log Analytics with AWS
 
Big Data Use Cases and Solutions in the AWS Cloud
Big Data Use Cases and Solutions in the AWS CloudBig Data Use Cases and Solutions in the AWS Cloud
Big Data Use Cases and Solutions in the AWS Cloud
 
Log Analytics with AWS
Log Analytics with AWSLog Analytics with AWS
Log Analytics with AWS
 
Evolving Your Big Data Use Cases from Batch to Real-Time - AWS May 2016 Webi...
Evolving Your Big Data Use Cases from Batch to Real-Time - AWS May 2016  Webi...Evolving Your Big Data Use Cases from Batch to Real-Time - AWS May 2016  Webi...
Evolving Your Big Data Use Cases from Batch to Real-Time - AWS May 2016 Webi...
 
Introduction to Cloud Computing with AWS (Thai Session)
Introduction to Cloud Computing with AWS (Thai Session)Introduction to Cloud Computing with AWS (Thai Session)
Introduction to Cloud Computing with AWS (Thai Session)
 
Build Data Lakes and Analytics on AWS
Build Data Lakes and Analytics on AWS Build Data Lakes and Analytics on AWS
Build Data Lakes and Analytics on AWS
 
GPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital MarketsGPSTEC305-Machine Learning in Capital Markets
GPSTEC305-Machine Learning in Capital Markets
 
Big Data Day LA 2015 - The AWS Big Data Platform by Michael Limcaco of Amazon
Big Data Day LA 2015 - The AWS Big Data Platform by Michael Limcaco of AmazonBig Data Day LA 2015 - The AWS Big Data Platform by Michael Limcaco of Amazon
Big Data Day LA 2015 - The AWS Big Data Platform by Michael Limcaco of Amazon
 
AWS Summit Stockholm 2014 – B4 – Business intelligence on AWS
AWS Summit Stockholm 2014 – B4 – Business intelligence on AWSAWS Summit Stockholm 2014 – B4 – Business intelligence on AWS
AWS Summit Stockholm 2014 – B4 – Business intelligence on AWS
 
An Introduction to the AI services at AWS - AWS Summit Tel Aviv 2017
An Introduction to the AI services at AWS - AWS Summit Tel Aviv 2017An Introduction to the AI services at AWS - AWS Summit Tel Aviv 2017
An Introduction to the AI services at AWS - AWS Summit Tel Aviv 2017
 
Log Analytics with AWS
Log Analytics with AWSLog Analytics with AWS
Log Analytics with AWS
 
A look at the market leader and giant "Amazon AWS
A look at the market leader and giant "Amazon AWSA look at the market leader and giant "Amazon AWS
A look at the market leader and giant "Amazon AWS
 
AWS Summit 2013 | Singapore - Big Data Analytics, Presented by AWS, Intel and...
AWS Summit 2013 | Singapore - Big Data Analytics, Presented by AWS, Intel and...AWS Summit 2013 | Singapore - Big Data Analytics, Presented by AWS, Intel and...
AWS Summit 2013 | Singapore - Big Data Analytics, Presented by AWS, Intel and...
 

Plus de Julien SIMON

Plus de Julien SIMON (20)

An introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging FaceAn introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging Face
 
Reinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face TransformersReinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face Transformers
 
Building NLP applications with Transformers
Building NLP applications with TransformersBuilding NLP applications with Transformers
Building NLP applications with Transformers
 
Building Machine Learning Models Automatically (June 2020)
Building Machine Learning Models Automatically (June 2020)Building Machine Learning Models Automatically (June 2020)
Building Machine Learning Models Automatically (June 2020)
 
Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)
 
Scale Machine Learning from zero to millions of users (April 2020)
Scale Machine Learning from zero to millions of users (April 2020)Scale Machine Learning from zero to millions of users (April 2020)
Scale Machine Learning from zero to millions of users (April 2020)
 
An Introduction to Generative Adversarial Networks (April 2020)
An Introduction to Generative Adversarial Networks (April 2020)An Introduction to Generative Adversarial Networks (April 2020)
An Introduction to Generative Adversarial Networks (April 2020)
 
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...
 
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
 
AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...
AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...
AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...
 
A pragmatic introduction to natural language processing models (October 2019)
A pragmatic introduction to natural language processing models (October 2019)A pragmatic introduction to natural language processing models (October 2019)
A pragmatic introduction to natural language processing models (October 2019)
 
Building smart applications with AWS AI services (October 2019)
Building smart applications with AWS AI services (October 2019)Building smart applications with AWS AI services (October 2019)
Building smart applications with AWS AI services (October 2019)
 
Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)
 
The Future of AI (September 2019)
The Future of AI (September 2019)The Future of AI (September 2019)
The Future of AI (September 2019)
 
Building Machine Learning Inference Pipelines at Scale (July 2019)
Building Machine Learning Inference Pipelines at Scale (July 2019)Building Machine Learning Inference Pipelines at Scale (July 2019)
Building Machine Learning Inference Pipelines at Scale (July 2019)
 
Train and Deploy Machine Learning Workloads with AWS Container Services (July...
Train and Deploy Machine Learning Workloads with AWS Container Services (July...Train and Deploy Machine Learning Workloads with AWS Container Services (July...
Train and Deploy Machine Learning Workloads with AWS Container Services (July...
 
Optimize your Machine Learning Workloads on AWS (July 2019)
Optimize your Machine Learning Workloads on AWS (July 2019)Optimize your Machine Learning Workloads on AWS (July 2019)
Optimize your Machine Learning Workloads on AWS (July 2019)
 
Deep Learning on Amazon Sagemaker (July 2019)
Deep Learning on Amazon Sagemaker (July 2019)Deep Learning on Amazon Sagemaker (July 2019)
Deep Learning on Amazon Sagemaker (July 2019)
 
Automate your Amazon SageMaker Workflows (July 2019)
Automate your Amazon SageMaker Workflows (July 2019)Automate your Amazon SageMaker Workflows (July 2019)
Automate your Amazon SageMaker Workflows (July 2019)
 
Build, train and deploy ML models with Amazon SageMaker (May 2019)
Build, train and deploy ML models with Amazon SageMaker (May 2019)Build, train and deploy ML models with Amazon SageMaker (May 2019)
Build, train and deploy ML models with Amazon SageMaker (May 2019)
 

Dernier

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Dernier (20)

TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

A Data Journey With AWS

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Principal Technical Evangelist, AWS @julsimon Julien Simon A Data Journey with AWS
  • 5. Jeff Immelt, GE Chairman & CEO “If you went to bed last night as an industrial company, you’re going to wake up this morning as a software and analytics company.”
  • 6. Navigating the seven seas of Big Data
  • 8. Over 1 Million Active Customers “Active customer” is defined as a non-Amazon customer with AWS account usage activity in the past month, including the free tier 2008 2009 2010 2011 2012 2013 20152014
  • 9. Big Data: let’s keep it simple! Collect Store Analyze Consume Time to Answer (Latency) Throughput Cost
  • 10. Collect Store Analyze Consume A iOS Android Web Apps Logstash Amazon RDS Amazon DynamoDB Amazon ES Amazon
 S3 Apache Kafka Amazon
 Glacier Amazon
 Kinesis Amazon
 DynamoDB Amazon Redshift Impala Pig Amazon ML Streaming Amazon
 Kinesis AWS Lambda AmazonElasticMapReduce Amazon ElastiCache SearchSQLNoSQLCache StreamProcessing Batch Interactive Logging StreamStorage IoT Applications FileStorage Analysis&Visualization Hot Cold Warm Hot Slow Hot ML Fast Fast Amazon QuickSight Transactional Data File Data Stream Data Notebooks Predictions Apps & APIs Mobile Apps IDE Search Data ETL
  • 12. Nasdaq In 2014, Nasdaq replaced the existing data warehouses for its US equities and options exchanges with Amazon Redshift. On a nightly basis, Nasdaq loads approximately 5 billion rows of data into Redshift within a 4-6 hour window. Amazon Redshift now powers a number of data analytics applications at Nasdaq, including its billing system for US customers. In 2015, Nasdaq is expanding its use of Redshift to its global exchange properties. https://aws.amazon.com/solutions/case-studies/nasdaq-finqloud/
  • 13. MAP REDUCE processing huge amounts of data for fun and (hopefully) profit
  • 14. FINRA FINRA, the primary regulatory agency for broker-dealers in the US, uses AWS extensively in their IT operations and has migrated key portions of its technology stack to AWS including Market Surveillance and Member Regulation. For market surveillance, each night FINRA loads approximately 35 billion rows of data into Amazon S3 and Amazon EMR (up to 10,000 nodes) to monitor trading activity on exchanges and market centers in the US. https://aws.amazon.com/solutions/case-studies/finra/
  • 16. Hearst Hearst Corporation is one of the largest diversified communications company in the world. The company began migrating 10 of its 29 global data centers to AWS to reduce its IT infrastructure footprint. Hearst Corporation monitors trending content on 250+ sites worldwide. To facilitate this, Hearst built a clickstream analytics platform on AWS that transmits and processes over 30 TB of data a day using AWS resources. https://aws.amazon.com/solutions/case-studies/hearst-data-anayltics/
  • 18. Kellogg Company Kellogg needed a solution that could accommodate terabytes of data, scale according to infrastructure needs, and stay within its budget. Amazon Web Services (AWS) offered a fully SAP- certified HANA environment on a public cloud platform. Kellogg decided to start immediately with test and development environments for its US operations. These Amazon EC2 instances process 16 TB of sales data weekly from promotions in the US, modeling dozens of data simulations a day. https://aws.amazon.com/solutions/case-studies/kellogg-company/
  • 19. AON Benfield Aon Benfield is the world's leading reinsurance intermediary and full-service capital advisor. Aon Benfield Analytics offers industry-leading catastrophe management, actuarial, rating agency advisory and risk and capital strategy expertise. By using AWS GPU instances, Aon Benfield is able to perform actuarial calculations with greater computing power, in shorter time frames, and for less cost than on-premise deployments and CPU cores: “Using AWS helps us reduce a 10-day process to 10 minutes” https://aws.amazon.com/solutions/case-studies/aon/
  • 20. MACHINE LEARNING using the past to predict the future
  • 21. Amazon.com Amazon is a leader in practical machine learning solutions and uses it in hundreds of services across its various businesses. Amazon uses Machine Learning for Customer Support, building models based on recent orders, click-stream, user devices, prime membership usage, recent cases, recent account changes, etc. The models are used to provide efficient self-service to our customers.
  • 22.
  • 23. Amazon.com Not all customer interactions can be solved in a self-service mode. Therefore, Amazon operates large customer support centers where Customer Service Representatives (CSR) handle customer requests. The machine learning models described above are used to optimize the human interactions of these requests. For example, they are used to route the customer call to the best CSR before the customer has even started to speak! They are also used again during the call.
  • 24. ICAO The International Civil Aviation Organization (ICAO) is a specialized agency of the United Nations. ICAO collects accident and incident data every day, including information about the location, aircraft, operator, flight phase, as well as narratives around these data points. Amazon Machine Learning is used to determine the classification and risk categorization of each event based on patterns contained in the flight phase information and the narratives. Amazon ML ingests the raw data and the narratives every day to provide ICAO a daily list of accidents categorized by risk. This allows ICAO to provide updated accidents statistics to its customers on a daily basis.
  • 25. Fraud.net “We considered five other platforms, but Amazon Machine Learning was the best solution. Amazon keeps the effort and resources required to build a model to a minimum. Using Amazon Machine Learning, we've quickly created and trained a number of specific, targeted models, rather than building a single algorithm to try and capture all the different forms of fraud.” Whitney Anderson, CEO http://aws.amazon.com/fr/solutions/case-studies/fraud-dot-net/
  • 27. Amazon.com Amazon uses product recommendation in many services.
  • 28. Amazon DSSTNE (aka ‘Destiny’) •  Deep Scalable Sparse Tensor Network Engine •  Open source software library for deep neural networks using GPUs : https://github.com/amznlabs/amazon-dsstne •  Used by Amazon.com for product recommendation •  Multi-GPU scale for training and prediction •  Large Layers: larger networks than are possible with a single GPU •  Sparse Data: optimized for fast performance on sparse datasets •  Can run locally, in a Docker container or on AWS (multi-instance possible)
  • 29. Amazon Destiny vs Google TensorFlow “DSSTNE on a single virtualized K520 GPU (released in 2012) is faster than TensorFlow on a bare metal Tesla M40 (released in 2015)” “TensorFlow does not provide the automagic model parallelism provided by DSSTNE” https://medium.com/@scottlegrand/first-dsstne-benchmarks-tldr-almost-15x-faster-than-tensorflow-393dbeb80c0f
  • 30. Demo: Amazon Destiny http://grouplens.org/datasets/movielens/ 27,000 movies 138,000 users 20 million movie recommendations (Matrix is 99.5% sparse) Alice Bob Charlie David Ernest Star Wars 1 1 1 Lord of the Rings 1 1 1 Incep>on 1 Bambi 1 1 PreAy Woman 1 1 Alice Bob Charlie David Ernest Star Wars 0.23 1 0.12 1 1 Lord of the Rings 1 0.34 1 0.89 1 Incep>on 0.8 1 0.43 0.76 0.45 Bambi 1 0.42 0.5 1 0.34 PreAy Woman 1 0.09 0.67 0.04 1 à Start a g2.8xlarge instance 4 K520 GPUs: 6144 CUDA cores à Train a neural network Input & output layers: 27,000 neurons 3 hidden layers, 128 neurons each à Recommend 10 movies per user
  • 31. Training Amazon Destiny on multiple GPUs
  • 32. Resources Big Data Architectural Patterns and Best Practices on AWS https://www.youtube.com/watch?v=K7o5OlRLtvU Real-World Smart Applications With Amazon Machine Learning https://www.youtube.com/watch?v=sHJx1KJf8p0 Deep Learning: Going Beyond Machine Learning https://www.youtube.com/watch?v=Ra6m70d3t0o AWS Big Data blog: https://blogs.aws.amazon.com/bigdata/ à “Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE”
  • 34. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Thank you! Principal Technical Evangelist, AWS @julsimon Julien Simon