SlideShare a Scribd company logo
1 of 37
© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Rahul Pathak, Amazon EMR
March 30, 2016
Building Big Data Solutions with
Amazon EMR & Amazon Redshift
Agenda
• AWS Big Data Platform Overview
• Amazon EMR & Amazon Redshift
• Building a Big Data Application
• Customer Use Cases
AWS Big Data Platform
EMR EC2
Analyze
Glacier
S3
Store
Import Export
Collect
Kinesis
Direct Connect
Machine
Learning
Redshift
Amazon
QuickSight
DynamoDB
Amazon EMR – Managed Hadoop Clusters in the Cloud
Scalable Hadoop clusters as a service
Hadoop, Hive, Spark, Presto, Hbase, etc.
Easy to use; fully managed
On demand, reserved, spot pricing
HDFS, Amazon EBS, and S3 filesystems
End to end security
Amazon EMR
Easy to deploy
AWS Management Console
or use the EMR API with your favorite SDK
Command Line
Choose your instance types
CPU
C3/C4 family
MACHINE
LEARNING
Memory
R3 family
SPARK AND
INTERACTIVE
Disk/IO
D2/I2 family
LARGE
HDFS
General
M3/M4 family
BATCH
PROCESS
Customize your storage type and size using Amazon EBS
Try different configurations to find your optimal architecture
The Hadoop ecosystem can run in Amazon EMR
Integrated with the AWS Platform
Amazon DynamoDB
EMR-DynamoDB
connector
Amazon RDS
Amazon
Kinesis
Streaming data
connectorsJDBC Data Source
w/ Spark SQL
ElasticSearch
connector
Amazon Redshift
Amazon Redshift Copy
From HDFS
EMR File System
(EMRFS)
Amazon S3
Amazon EMR
Amazon S3 as your persistent data store
Amazon S3
Designed for 99.999999999% durability
Separate compute and storage
Resize and shut down Amazon EMR
clusters with no data loss
Point multiple Amazon EMR clusters
at same data in Amazon S3 using
the EMR File System (EMRFS)
EMRFS makes it easier to leverage Amazon S3
Better performance and error handling options
Transparent to applications – just read/write to “s3://”
Support for Amazon S3 server-side and client-side encryption
Faster listing using EMRFS metadata
HDFS is still available via local instance storage or Amazon EBS
Amazon Redshift
Relational data warehouse
Massively parallel; Petabyte scale
Fully managed
HDD and SSD Platforms
$1,000/TB/Year; start at $0.25/hour
End to end security; built in global DR
Amazon
Redshift
Amazon Redshift dramatically reduces I/O
ID Age State Amount
123 20 CA 500
345 25 WA 250
678 40 FL 125
957 37 WA 375
• With row storage you do
unnecessary I/O
• To get total amount, you have to
read everything
Amazon Redshift dramatically reduces I/O
• With column storage, you only
read the data you need
ID Age State Amount
123 20 CA 500
345 25 WA 250
678 40 FL 125
957 37 WA 375
Amazon Redshift dramatically reduces I/O
Column storage
Data compression
• Columnar compression saves
space & reduces I/O
• Amazon Redshift analyzes and
compresses your data
analyze compression listing;
Table | Column | Encoding
---------+----------------+----------
listing | listid | delta
listing | sellerid | delta32k
listing | eventid | delta32k
listing | dateid | bytedict
listing | numtickets | bytedict
listing | priceperticket | delta32k
listing | totalprice | mostly32
listing | listtime | raw
Amazon Redshift dramatically reduces I/O
Column storage
Data compression
• Track of the minimum and
maximum value for each block
• Skip over blocks that don’t
contain the data needed for a
given query
• Minimize unnecessary I/O
Amazon Redshift dramatically reduces I/O
Column storage
Data compression
Zone maps
Direct-attached storage
Large data block sizes
• Use direct-attached storage to
maximize throughput
• Hardware optimized for high
performance data processing
• Large block sizes to make the
most of each read
• Amazon Redshift manages
durability for you
Amazon Redshift Has Security Built In
SSL to secure data in transit
Encryption to secure data at rest
AES-256; hardware accelerated
All blocks on disks and in Amazon S3 encrypted
HSM Support
No direct access to compute nodes
Audit logging, AWS CloudTrail, AWS KMS
integration
Amazon VPC support
SOC 1/2/3, PCI-DSS Level 1, FedRAMP, HIPAA
10 GigE
(HPC)
Ingestion
Backup
Restore
Customer VPC
Internal
VPC
JDBC/ODBC
Building a Big Data Application
Building a Big Data Application
web clients
mobile clients
DBMS
corporate data center
Getting Started
Building a Big Data Application
web clients
mobile clients
DBMS
Amazon Redshift
Amazon
QuickSight
AWS cloudcorporate data center
Adding a data warehouse
Building a Big Data Application
web clients
mobile clients
DBMS
Raw data
Amazon Redshift
Staging Data
Amazon
QuickSight
AWS cloud
Bringing in Log Data
corporate data center
Building a Big Data Application
web clients
mobile clients
DBMS
Raw data
Amazon Redshift
Staging Data
Orc/Parquet
(Query optimized)
Amazon
QuickSight
AWS cloud
Extending your DW to S3
corporate data center
Building a Big Data Application
web clients
mobile clients
DBMS
Raw data
Amazon Redshift
Staging Data
Orc/Parquet
(Query optimized)
Amazon
QuickSight
KinesisStreams
AWS cloud
Adding a real-time layer
corporate data center
Building a Big Data Application
web clients
mobile clients
DBMS
Raw data
Amazon Redshift
Staging Data
Orc/Parquet
(Query optimized)
Amazon
QuickSight
KinesisStreams
AWS cloud
Adding predictive analytics
corporate data center
Building a Big Data Application
web clients
mobile clients
DBMS
Raw data
Amazon Redshift
Staging Data
Orc/Parquet
(Query optimized)
Amazon
QuickSight
KinesisStreams
AWS cloud
Adding encryption at rest with AWS KMS
corporate data center
AWSKMS
Building a Big Data Application
web clients
mobile clients
DBMS
Raw data
Amazon Redshift
Staging Data
Orc/Parquet
(Query optimized)
Amazon
QuickSight
KinesisStreams
AWS cloud
AWSKMS
VPC subnet
SSL/TLS
SSL/TLS
Protecting Data in Transit & Adding Network Isolation
corporate data center
Security
• Encryption at rest with choice of key management
• Service managed, AWS KMS, CloudHSM, on premise HSM
• Encryption in Transit
• Require SSL, all internal communication over SSL/TLS
• Network isolation using Amazon VPC
• Fine grained permissions and auditing using AWS IAM
and AWS CloudTrail
Compliance
ISO 9001
SOC 3
SOC 2
ISO 27001
ISO 27017
PCI DSS Level 1ISO 27018
SOC 1 / ISAE 3402
GxPHIPAA
ITAR
FERPA
FISMA, RMF, and DIACAP
FedRAMP
Section 508 / VPAT
DoD SRG Levels 2 & 4
FIPS 140-2
CJIS
Cloud Security Alliance
MPAA
NIST
MLPS Level 3
G-Cloud
IT-Grundschutz
MTCS Tier 3
IRAP Cyber Essentials Plus
Disaster Recovery
• Amazon EMR & Amazon Redshift clusters are resilient
and we automatically replace failed nodes/HW
• Data on S3 available in all Availability Zones in a Region
• S3 data can be synced across regions
• Amazon Redshift clusters are continuously backed up to
S3 and snapshots can be synced to a second region
Customer Use Cases
Data Source ET
Direct
Connect
Client
Forwarder
LoaderState Management
SandboxRedshift
S3
Petabytes of data generated
on premise and brought to
Redshift in the cloud for
analysis
High speed connectivity over a
redundant pair of Direct
Connect leased lines
Stringent security requirements
met by leveraging VPC, VPN,
Encryption and Rest and In
Transit, CloudTrail and
database auditing
NTT DOCOMO
Nasdaq – Legacy Warehouse
Expensive ($1.16M annually)
Limited capacity (1 year of data
online)
4-8 billion rows inserted per
trading day, storing:
• Orders
• Trades
• Quotes
• Market Data
• Security Master
• Membership
DW can be used to analyze market
share, client activity, surveillance,
power our billing, and more…
Nasdaq Architecture
On premise AWS Regional (Multi-AZ) Scope AWS (US-East,
primary AZ/VPC)
S3
SNS
Redshift
Database
Cluster
HSM Key
Appliance
Cluster
MySQL
Redshift
Load files/
Manifests
Redshift
Snapshots/
Backups
Data
Loaded
Topic
RMS Input
Sources
(multiple
systems)
Data Ingest
Process
SmartNews
Useful Resources
• AWS Big Data Blog
• Re:Invent 2015 Big Data Sessions
• AWS Marketplace for Big Data Solutions
• Amazon Big Data Partners
Summary
• AWS enables you to build sophisticated big data applications
• Retrospective, Real-time, Predictive
• You can build incrementally, adding use cases and increasing
scale as you go
• AWS provides a broad range of security and auditing features
to enable you to meet your security requirements
• AWS makes it easy to build hybrid applications that span
across your datacenters and the AWS Cloud
Thank you!
@rahulpathak

More Related Content

What's hot

A Data Culture with Embedded Analytics in Action
A Data Culture with Embedded Analytics in ActionA Data Culture with Embedded Analytics in Action
A Data Culture with Embedded Analytics in ActionAmazon Web Services
 
Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...
Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...
Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...Amazon Web Services
 
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Amazon Web Services
 
Build a Real-time Streaming Data Visualization System with Amazon Kinesis Ana...
Build a Real-time Streaming Data Visualization System with Amazon Kinesis Ana...Build a Real-time Streaming Data Visualization System with Amazon Kinesis Ana...
Build a Real-time Streaming Data Visualization System with Amazon Kinesis Ana...Amazon Web Services
 
Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017 Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017 Amazon Web Services
 
Co 4, session 2, aws analytics services
Co 4, session 2, aws analytics servicesCo 4, session 2, aws analytics services
Co 4, session 2, aws analytics servicesm vaishnavi
 
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...Amazon Web Services
 
Big Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWSBig Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWSAmazon Web Services
 
AWS re:Invent 2016: Migrating a Highly Available and Scalable Database from O...
AWS re:Invent 2016: Migrating a Highly Available and Scalable Database from O...AWS re:Invent 2016: Migrating a Highly Available and Scalable Database from O...
AWS re:Invent 2016: Migrating a Highly Available and Scalable Database from O...Amazon Web Services
 
Optimizing Storage for Big Data Analytics Workloads
Optimizing Storage for Big Data Analytics WorkloadsOptimizing Storage for Big Data Analytics Workloads
Optimizing Storage for Big Data Analytics WorkloadsAmazon Web Services
 
AWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
AWS Data Transfer Services: Data Ingest Strategies Into the AWS CloudAWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
AWS Data Transfer Services: Data Ingest Strategies Into the AWS CloudAmazon Web Services
 
BDT201 AWS Data Pipeline - AWS re: Invent 2012
BDT201 AWS Data Pipeline - AWS re: Invent 2012BDT201 AWS Data Pipeline - AWS re: Invent 2012
BDT201 AWS Data Pipeline - AWS re: Invent 2012Amazon Web Services
 
Getting Started with Amazon QuickSight
Getting Started with Amazon QuickSightGetting Started with Amazon QuickSight
Getting Started with Amazon QuickSightAmazon Web Services
 
利用 Amazon QuickSight 視覺化分析服務剖析資料
利用 Amazon QuickSight 視覺化分析服務剖析資料利用 Amazon QuickSight 視覺化分析服務剖析資料
利用 Amazon QuickSight 視覺化分析服務剖析資料Amazon Web Services
 
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv Amazon Web Services
 
AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)
AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)
AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)Amazon Web Services
 
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of Things
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of ThingsDay 4 - Big Data on AWS - RedShift, EMR & the Internet of Things
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of ThingsAmazon Web Services
 
Serverless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsServerless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsAmazon Web Services
 

What's hot (20)

A Data Culture with Embedded Analytics in Action
A Data Culture with Embedded Analytics in ActionA Data Culture with Embedded Analytics in Action
A Data Culture with Embedded Analytics in Action
 
Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...
Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...
Managing Data with Amazon ElastiCache for Redis - August 2016 Monthly Webinar...
 
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
 
Introduction to AWS Glue
Introduction to AWS GlueIntroduction to AWS Glue
Introduction to AWS Glue
 
Build a Real-time Streaming Data Visualization System with Amazon Kinesis Ana...
Build a Real-time Streaming Data Visualization System with Amazon Kinesis Ana...Build a Real-time Streaming Data Visualization System with Amazon Kinesis Ana...
Build a Real-time Streaming Data Visualization System with Amazon Kinesis Ana...
 
Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017 Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017
 
Co 4, session 2, aws analytics services
Co 4, session 2, aws analytics servicesCo 4, session 2, aws analytics services
Co 4, session 2, aws analytics services
 
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
 
Big Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWSBig Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWS
 
AWS re:Invent 2016: Migrating a Highly Available and Scalable Database from O...
AWS re:Invent 2016: Migrating a Highly Available and Scalable Database from O...AWS re:Invent 2016: Migrating a Highly Available and Scalable Database from O...
AWS re:Invent 2016: Migrating a Highly Available and Scalable Database from O...
 
Optimizing Storage for Big Data Analytics Workloads
Optimizing Storage for Big Data Analytics WorkloadsOptimizing Storage for Big Data Analytics Workloads
Optimizing Storage for Big Data Analytics Workloads
 
AWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
AWS Data Transfer Services: Data Ingest Strategies Into the AWS CloudAWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
AWS Data Transfer Services: Data Ingest Strategies Into the AWS Cloud
 
BDT201 AWS Data Pipeline - AWS re: Invent 2012
BDT201 AWS Data Pipeline - AWS re: Invent 2012BDT201 AWS Data Pipeline - AWS re: Invent 2012
BDT201 AWS Data Pipeline - AWS re: Invent 2012
 
Getting Started with Amazon QuickSight
Getting Started with Amazon QuickSightGetting Started with Amazon QuickSight
Getting Started with Amazon QuickSight
 
利用 Amazon QuickSight 視覺化分析服務剖析資料
利用 Amazon QuickSight 視覺化分析服務剖析資料利用 Amazon QuickSight 視覺化分析服務剖析資料
利用 Amazon QuickSight 視覺化分析服務剖析資料
 
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
Streaming data analytics (Kinesis, EMR/Spark) - Pop-up Loft Tel Aviv
 
Introduction to Amazon Athena
Introduction to Amazon AthenaIntroduction to Amazon Athena
Introduction to Amazon Athena
 
AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)
AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)
AWS re:Invent 2016: Automating Workflows for Analytics Pipelines (DEV401)
 
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of Things
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of ThingsDay 4 - Big Data on AWS - RedShift, EMR & the Internet of Things
Day 4 - Big Data on AWS - RedShift, EMR & the Internet of Things
 
Serverless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsServerless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis Analytics
 

Viewers also liked

Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
Getting Started with AWS IoT - September 2016 Webinar Series
Getting Started with AWS IoT - September 2016 Webinar SeriesGetting Started with AWS IoT - September 2016 Webinar Series
Getting Started with AWS IoT - September 2016 Webinar SeriesAmazon Web Services
 
Build A Website on AWS for Your First 10 Million Users
Build A Website on AWS for Your First 10 Million UsersBuild A Website on AWS for Your First 10 Million Users
Build A Website on AWS for Your First 10 Million UsersAmazon Web Services
 
Continuous Delivery with AWS Lambda - AWS April 2016 Webinar Series
Continuous Delivery with AWS Lambda - AWS April 2016 Webinar SeriesContinuous Delivery with AWS Lambda - AWS April 2016 Webinar Series
Continuous Delivery with AWS Lambda - AWS April 2016 Webinar SeriesAmazon Web Services
 
Introducing “Well-Architected” For Developers - Technical 101
Introducing “Well-Architected” For Developers - Technical 101Introducing “Well-Architected” For Developers - Technical 101
Introducing “Well-Architected” For Developers - Technical 101Amazon Web Services
 
Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913
Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913
Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913Michael Bohlig
 
Using Data Science to Transform OpenTable Into Your Local Dining Expert-(Pabl...
Using Data Science to Transform OpenTable Into Your Local Dining Expert-(Pabl...Using Data Science to Transform OpenTable Into Your Local Dining Expert-(Pabl...
Using Data Science to Transform OpenTable Into Your Local Dining Expert-(Pabl...Spark Summit
 
(BDT322) How Redfin & Twitter Leverage Amazon S3 For Big Data
(BDT322) How Redfin & Twitter Leverage Amazon S3 For Big Data(BDT322) How Redfin & Twitter Leverage Amazon S3 For Big Data
(BDT322) How Redfin & Twitter Leverage Amazon S3 For Big DataAmazon Web Services
 
(BDT205) Your First Big Data Application On AWS
(BDT205) Your First Big Data Application On AWS(BDT205) Your First Big Data Application On AWS
(BDT205) Your First Big Data Application On AWSAmazon Web Services
 
Hadoop tools with Examples
Hadoop tools with ExamplesHadoop tools with Examples
Hadoop tools with ExamplesJoe McTee
 
Big data with amazon EMR - Pop-up Loft Tel Aviv
Big data with amazon EMR - Pop-up Loft Tel AvivBig data with amazon EMR - Pop-up Loft Tel Aviv
Big data with amazon EMR - Pop-up Loft Tel AvivAmazon Web Services
 
Implementing analytics? You need decision modeling and business rules
Implementing analytics? You need decision modeling and business rulesImplementing analytics? You need decision modeling and business rules
Implementing analytics? You need decision modeling and business rulesDecision Management Solutions
 
BDT101 Big Data with Amazon Elastic MapReduce - AWS re: Invent 2012
BDT101 Big Data with Amazon Elastic MapReduce - AWS re: Invent 2012BDT101 Big Data with Amazon Elastic MapReduce - AWS re: Invent 2012
BDT101 Big Data with Amazon Elastic MapReduce - AWS re: Invent 2012Amazon Web Services
 
AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)
AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)
AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)Amazon Web Services
 
Big Data and Analytics – End to End on AWS – Russell Nash
Big Data and Analytics – End to End on AWS – Russell NashBig Data and Analytics – End to End on AWS – Russell Nash
Big Data and Analytics – End to End on AWS – Russell NashAmazon Web Services
 
Getting Started with Big Data and HPC in the Cloud - August 2015
Getting Started with Big Data and HPC in the Cloud - August 2015Getting Started with Big Data and HPC in the Cloud - August 2015
Getting Started with Big Data and HPC in the Cloud - August 2015Amazon Web Services
 
Big Data & Analytics: End to End on AWS - Technical 101
Big Data & Analytics: End to End on AWS - Technical 101Big Data & Analytics: End to End on AWS - Technical 101
Big Data & Analytics: End to End on AWS - Technical 101Amazon Web Services
 

Viewers also liked (20)

2016 AWS Big Data Solution Days
2016 AWS Big Data Solution Days2016 AWS Big Data Solution Days
2016 AWS Big Data Solution Days
 
Big Data and Analytics on AWS
Big Data and Analytics on AWS Big Data and Analytics on AWS
Big Data and Analytics on AWS
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Getting Started with AWS IoT - September 2016 Webinar Series
Getting Started with AWS IoT - September 2016 Webinar SeriesGetting Started with AWS IoT - September 2016 Webinar Series
Getting Started with AWS IoT - September 2016 Webinar Series
 
Build A Website on AWS for Your First 10 Million Users
Build A Website on AWS for Your First 10 Million UsersBuild A Website on AWS for Your First 10 Million Users
Build A Website on AWS for Your First 10 Million Users
 
Continuous Delivery with AWS Lambda - AWS April 2016 Webinar Series
Continuous Delivery with AWS Lambda - AWS April 2016 Webinar SeriesContinuous Delivery with AWS Lambda - AWS April 2016 Webinar Series
Continuous Delivery with AWS Lambda - AWS April 2016 Webinar Series
 
Introducing “Well-Architected” For Developers - Technical 101
Introducing “Well-Architected” For Developers - Technical 101Introducing “Well-Architected” For Developers - Technical 101
Introducing “Well-Architected” For Developers - Technical 101
 
Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913
Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913
Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913
 
Using Data Science to Transform OpenTable Into Your Local Dining Expert-(Pabl...
Using Data Science to Transform OpenTable Into Your Local Dining Expert-(Pabl...Using Data Science to Transform OpenTable Into Your Local Dining Expert-(Pabl...
Using Data Science to Transform OpenTable Into Your Local Dining Expert-(Pabl...
 
(BDT322) How Redfin & Twitter Leverage Amazon S3 For Big Data
(BDT322) How Redfin & Twitter Leverage Amazon S3 For Big Data(BDT322) How Redfin & Twitter Leverage Amazon S3 For Big Data
(BDT322) How Redfin & Twitter Leverage Amazon S3 For Big Data
 
Bv asw presentation
Bv asw presentationBv asw presentation
Bv asw presentation
 
(BDT205) Your First Big Data Application On AWS
(BDT205) Your First Big Data Application On AWS(BDT205) Your First Big Data Application On AWS
(BDT205) Your First Big Data Application On AWS
 
Hadoop tools with Examples
Hadoop tools with ExamplesHadoop tools with Examples
Hadoop tools with Examples
 
Big data with amazon EMR - Pop-up Loft Tel Aviv
Big data with amazon EMR - Pop-up Loft Tel AvivBig data with amazon EMR - Pop-up Loft Tel Aviv
Big data with amazon EMR - Pop-up Loft Tel Aviv
 
Implementing analytics? You need decision modeling and business rules
Implementing analytics? You need decision modeling and business rulesImplementing analytics? You need decision modeling and business rules
Implementing analytics? You need decision modeling and business rules
 
BDT101 Big Data with Amazon Elastic MapReduce - AWS re: Invent 2012
BDT101 Big Data with Amazon Elastic MapReduce - AWS re: Invent 2012BDT101 Big Data with Amazon Elastic MapReduce - AWS re: Invent 2012
BDT101 Big Data with Amazon Elastic MapReduce - AWS re: Invent 2012
 
AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)
AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)
AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)
 
Big Data and Analytics – End to End on AWS – Russell Nash
Big Data and Analytics – End to End on AWS – Russell NashBig Data and Analytics – End to End on AWS – Russell Nash
Big Data and Analytics – End to End on AWS – Russell Nash
 
Getting Started with Big Data and HPC in the Cloud - August 2015
Getting Started with Big Data and HPC in the Cloud - August 2015Getting Started with Big Data and HPC in the Cloud - August 2015
Getting Started with Big Data and HPC in the Cloud - August 2015
 
Big Data & Analytics: End to End on AWS - Technical 101
Big Data & Analytics: End to End on AWS - Technical 101Big Data & Analytics: End to End on AWS - Technical 101
Big Data & Analytics: End to End on AWS - Technical 101
 

Similar to AWS March 2016 Webinar Series - Building Big Data Solutions with Amazon EMR and Amazon Redshift

Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon RedshiftAmazon Web Services
 
[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介Amazon Web Services Japan
 
Owning Your Own (Data) Lake House
Owning Your Own (Data) Lake HouseOwning Your Own (Data) Lake House
Owning Your Own (Data) Lake HouseData Con LA
 
Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...
Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...
Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...Amazon Web Services
 
Success has Many Query Engines- Tel Aviv Summit 2018
Success has Many Query Engines- Tel Aviv Summit 2018Success has Many Query Engines- Tel Aviv Summit 2018
Success has Many Query Engines- Tel Aviv Summit 2018Amazon Web Services
 
[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介Amazon Web Services Japan
 
Deploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSDeploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSAmazon Web Services
 
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...Amazon Web Services
 
Module 2 - Datalake
Module 2 - DatalakeModule 2 - Datalake
Module 2 - DatalakeLam Le
 
New Database Migration Services & RDS Updates
New Database Migration Services & RDS UpdatesNew Database Migration Services & RDS Updates
New Database Migration Services & RDS UpdatesAmazon Web Services
 
Loading Data into Redshift: Data Analytics Week SF
Loading Data into Redshift: Data Analytics Week SFLoading Data into Redshift: Data Analytics Week SF
Loading Data into Redshift: Data Analytics Week SFAmazon Web Services
 
Getting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWSGetting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWSAmazon Web Services
 
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Precisely
 
Database Freedom - ADB304 - Santa Clara AWS Summit
Database Freedom - ADB304 - Santa Clara AWS SummitDatabase Freedom - ADB304 - Santa Clara AWS Summit
Database Freedom - ADB304 - Santa Clara AWS SummitAmazon Web Services
 
Loading Data into Amazon Redshift
Loading Data into Amazon RedshiftLoading Data into Amazon Redshift
Loading Data into Amazon RedshiftAmazon Web Services
 
Using data lakes to quench your analytics fire - AWS Summit Cape Town 2018
Using data lakes to quench your analytics fire - AWS Summit Cape Town 2018Using data lakes to quench your analytics fire - AWS Summit Cape Town 2018
Using data lakes to quench your analytics fire - AWS Summit Cape Town 2018Amazon Web Services
 
(DAT209) NEW LAUNCH! Introducing MariaDB on Amazon RDS
(DAT209) NEW LAUNCH! Introducing MariaDB on Amazon RDS(DAT209) NEW LAUNCH! Introducing MariaDB on Amazon RDS
(DAT209) NEW LAUNCH! Introducing MariaDB on Amazon RDSAmazon Web Services
 
Loading Data into Redshift: Data Analytics Week at the SF Loft
Loading Data into Redshift: Data Analytics Week at the SF LoftLoading Data into Redshift: Data Analytics Week at the SF Loft
Loading Data into Redshift: Data Analytics Week at the SF LoftAmazon Web Services
 

Similar to AWS March 2016 Webinar Series - Building Big Data Solutions with Amazon EMR and Amazon Redshift (20)

Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift in 大阪]Amazon Redshift最新情報と導入事例のご紹介
 
Owning Your Own (Data) Lake House
Owning Your Own (Data) Lake HouseOwning Your Own (Data) Lake House
Owning Your Own (Data) Lake House
 
Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...
Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...
Migrating Financial and Accounting Systems from Oracle to Amazon DynamoDB (DA...
 
Success has Many Query Engines- Tel Aviv Summit 2018
Success has Many Query Engines- Tel Aviv Summit 2018Success has Many Query Engines- Tel Aviv Summit 2018
Success has Many Query Engines- Tel Aviv Summit 2018
 
[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介
[よくわかるAmazon Redshift]Amazon Redshift最新情報と導入事例のご紹介
 
Deploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWSDeploying your Data Warehouse on AWS
Deploying your Data Warehouse on AWS
 
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
 
Module 2 - Datalake
Module 2 - DatalakeModule 2 - Datalake
Module 2 - Datalake
 
New Database Migration Services & RDS Updates
New Database Migration Services & RDS UpdatesNew Database Migration Services & RDS Updates
New Database Migration Services & RDS Updates
 
Loading Data into Redshift: Data Analytics Week SF
Loading Data into Redshift: Data Analytics Week SFLoading Data into Redshift: Data Analytics Week SF
Loading Data into Redshift: Data Analytics Week SF
 
Getting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWSGetting Started with Managed Database Services on AWS
Getting Started with Managed Database Services on AWS
 
AWS Analytics
AWS AnalyticsAWS Analytics
AWS Analytics
 
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
 
Database Freedom - ADB304 - Santa Clara AWS Summit
Database Freedom - ADB304 - Santa Clara AWS SummitDatabase Freedom - ADB304 - Santa Clara AWS Summit
Database Freedom - ADB304 - Santa Clara AWS Summit
 
Loading Data into Amazon Redshift
Loading Data into Amazon RedshiftLoading Data into Amazon Redshift
Loading Data into Amazon Redshift
 
Using data lakes to quench your analytics fire - AWS Summit Cape Town 2018
Using data lakes to quench your analytics fire - AWS Summit Cape Town 2018Using data lakes to quench your analytics fire - AWS Summit Cape Town 2018
Using data lakes to quench your analytics fire - AWS Summit Cape Town 2018
 
(DAT209) NEW LAUNCH! Introducing MariaDB on Amazon RDS
(DAT209) NEW LAUNCH! Introducing MariaDB on Amazon RDS(DAT209) NEW LAUNCH! Introducing MariaDB on Amazon RDS
(DAT209) NEW LAUNCH! Introducing MariaDB on Amazon RDS
 
AWS Big Data Platform
AWS Big Data PlatformAWS Big Data Platform
AWS Big Data Platform
 
Loading Data into Redshift: Data Analytics Week at the SF Loft
Loading Data into Redshift: Data Analytics Week at the SF LoftLoading Data into Redshift: Data Analytics Week at the SF Loft
Loading Data into Redshift: Data Analytics Week at the SF Loft
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Recently uploaded

Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 

Recently uploaded (20)

Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 

AWS March 2016 Webinar Series - Building Big Data Solutions with Amazon EMR and Amazon Redshift

  • 1. © 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Rahul Pathak, Amazon EMR March 30, 2016 Building Big Data Solutions with Amazon EMR & Amazon Redshift
  • 2. Agenda • AWS Big Data Platform Overview • Amazon EMR & Amazon Redshift • Building a Big Data Application • Customer Use Cases
  • 3. AWS Big Data Platform EMR EC2 Analyze Glacier S3 Store Import Export Collect Kinesis Direct Connect Machine Learning Redshift Amazon QuickSight DynamoDB
  • 4. Amazon EMR – Managed Hadoop Clusters in the Cloud Scalable Hadoop clusters as a service Hadoop, Hive, Spark, Presto, Hbase, etc. Easy to use; fully managed On demand, reserved, spot pricing HDFS, Amazon EBS, and S3 filesystems End to end security Amazon EMR
  • 5. Easy to deploy AWS Management Console or use the EMR API with your favorite SDK Command Line
  • 6. Choose your instance types CPU C3/C4 family MACHINE LEARNING Memory R3 family SPARK AND INTERACTIVE Disk/IO D2/I2 family LARGE HDFS General M3/M4 family BATCH PROCESS Customize your storage type and size using Amazon EBS Try different configurations to find your optimal architecture
  • 7. The Hadoop ecosystem can run in Amazon EMR
  • 8. Integrated with the AWS Platform Amazon DynamoDB EMR-DynamoDB connector Amazon RDS Amazon Kinesis Streaming data connectorsJDBC Data Source w/ Spark SQL ElasticSearch connector Amazon Redshift Amazon Redshift Copy From HDFS EMR File System (EMRFS) Amazon S3 Amazon EMR
  • 9. Amazon S3 as your persistent data store Amazon S3 Designed for 99.999999999% durability Separate compute and storage Resize and shut down Amazon EMR clusters with no data loss Point multiple Amazon EMR clusters at same data in Amazon S3 using the EMR File System (EMRFS)
  • 10. EMRFS makes it easier to leverage Amazon S3 Better performance and error handling options Transparent to applications – just read/write to “s3://” Support for Amazon S3 server-side and client-side encryption Faster listing using EMRFS metadata HDFS is still available via local instance storage or Amazon EBS
  • 11. Amazon Redshift Relational data warehouse Massively parallel; Petabyte scale Fully managed HDD and SSD Platforms $1,000/TB/Year; start at $0.25/hour End to end security; built in global DR Amazon Redshift
  • 12. Amazon Redshift dramatically reduces I/O ID Age State Amount 123 20 CA 500 345 25 WA 250 678 40 FL 125 957 37 WA 375 • With row storage you do unnecessary I/O • To get total amount, you have to read everything
  • 13. Amazon Redshift dramatically reduces I/O • With column storage, you only read the data you need ID Age State Amount 123 20 CA 500 345 25 WA 250 678 40 FL 125 957 37 WA 375
  • 14. Amazon Redshift dramatically reduces I/O Column storage Data compression • Columnar compression saves space & reduces I/O • Amazon Redshift analyzes and compresses your data analyze compression listing; Table | Column | Encoding ---------+----------------+---------- listing | listid | delta listing | sellerid | delta32k listing | eventid | delta32k listing | dateid | bytedict listing | numtickets | bytedict listing | priceperticket | delta32k listing | totalprice | mostly32 listing | listtime | raw
  • 15. Amazon Redshift dramatically reduces I/O Column storage Data compression • Track of the minimum and maximum value for each block • Skip over blocks that don’t contain the data needed for a given query • Minimize unnecessary I/O
  • 16. Amazon Redshift dramatically reduces I/O Column storage Data compression Zone maps Direct-attached storage Large data block sizes • Use direct-attached storage to maximize throughput • Hardware optimized for high performance data processing • Large block sizes to make the most of each read • Amazon Redshift manages durability for you
  • 17. Amazon Redshift Has Security Built In SSL to secure data in transit Encryption to secure data at rest AES-256; hardware accelerated All blocks on disks and in Amazon S3 encrypted HSM Support No direct access to compute nodes Audit logging, AWS CloudTrail, AWS KMS integration Amazon VPC support SOC 1/2/3, PCI-DSS Level 1, FedRAMP, HIPAA 10 GigE (HPC) Ingestion Backup Restore Customer VPC Internal VPC JDBC/ODBC
  • 18. Building a Big Data Application
  • 19. Building a Big Data Application web clients mobile clients DBMS corporate data center Getting Started
  • 20. Building a Big Data Application web clients mobile clients DBMS Amazon Redshift Amazon QuickSight AWS cloudcorporate data center Adding a data warehouse
  • 21. Building a Big Data Application web clients mobile clients DBMS Raw data Amazon Redshift Staging Data Amazon QuickSight AWS cloud Bringing in Log Data corporate data center
  • 22. Building a Big Data Application web clients mobile clients DBMS Raw data Amazon Redshift Staging Data Orc/Parquet (Query optimized) Amazon QuickSight AWS cloud Extending your DW to S3 corporate data center
  • 23. Building a Big Data Application web clients mobile clients DBMS Raw data Amazon Redshift Staging Data Orc/Parquet (Query optimized) Amazon QuickSight KinesisStreams AWS cloud Adding a real-time layer corporate data center
  • 24. Building a Big Data Application web clients mobile clients DBMS Raw data Amazon Redshift Staging Data Orc/Parquet (Query optimized) Amazon QuickSight KinesisStreams AWS cloud Adding predictive analytics corporate data center
  • 25. Building a Big Data Application web clients mobile clients DBMS Raw data Amazon Redshift Staging Data Orc/Parquet (Query optimized) Amazon QuickSight KinesisStreams AWS cloud Adding encryption at rest with AWS KMS corporate data center AWSKMS
  • 26. Building a Big Data Application web clients mobile clients DBMS Raw data Amazon Redshift Staging Data Orc/Parquet (Query optimized) Amazon QuickSight KinesisStreams AWS cloud AWSKMS VPC subnet SSL/TLS SSL/TLS Protecting Data in Transit & Adding Network Isolation corporate data center
  • 27. Security • Encryption at rest with choice of key management • Service managed, AWS KMS, CloudHSM, on premise HSM • Encryption in Transit • Require SSL, all internal communication over SSL/TLS • Network isolation using Amazon VPC • Fine grained permissions and auditing using AWS IAM and AWS CloudTrail
  • 28. Compliance ISO 9001 SOC 3 SOC 2 ISO 27001 ISO 27017 PCI DSS Level 1ISO 27018 SOC 1 / ISAE 3402 GxPHIPAA ITAR FERPA FISMA, RMF, and DIACAP FedRAMP Section 508 / VPAT DoD SRG Levels 2 & 4 FIPS 140-2 CJIS Cloud Security Alliance MPAA NIST MLPS Level 3 G-Cloud IT-Grundschutz MTCS Tier 3 IRAP Cyber Essentials Plus
  • 29. Disaster Recovery • Amazon EMR & Amazon Redshift clusters are resilient and we automatically replace failed nodes/HW • Data on S3 available in all Availability Zones in a Region • S3 data can be synced across regions • Amazon Redshift clusters are continuously backed up to S3 and snapshots can be synced to a second region
  • 31. Data Source ET Direct Connect Client Forwarder LoaderState Management SandboxRedshift S3 Petabytes of data generated on premise and brought to Redshift in the cloud for analysis High speed connectivity over a redundant pair of Direct Connect leased lines Stringent security requirements met by leveraging VPC, VPN, Encryption and Rest and In Transit, CloudTrail and database auditing NTT DOCOMO
  • 32. Nasdaq – Legacy Warehouse Expensive ($1.16M annually) Limited capacity (1 year of data online) 4-8 billion rows inserted per trading day, storing: • Orders • Trades • Quotes • Market Data • Security Master • Membership DW can be used to analyze market share, client activity, surveillance, power our billing, and more…
  • 33. Nasdaq Architecture On premise AWS Regional (Multi-AZ) Scope AWS (US-East, primary AZ/VPC) S3 SNS Redshift Database Cluster HSM Key Appliance Cluster MySQL Redshift Load files/ Manifests Redshift Snapshots/ Backups Data Loaded Topic RMS Input Sources (multiple systems) Data Ingest Process
  • 35. Useful Resources • AWS Big Data Blog • Re:Invent 2015 Big Data Sessions • AWS Marketplace for Big Data Solutions • Amazon Big Data Partners
  • 36. Summary • AWS enables you to build sophisticated big data applications • Retrospective, Real-time, Predictive • You can build incrementally, adding use cases and increasing scale as you go • AWS provides a broad range of security and auditing features to enable you to meet your security requirements • AWS makes it easy to build hybrid applications that span across your datacenters and the AWS Cloud