© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Michalis Petropoulos
Engineering Manager, Amazon Redshift
Greg Rokita
Executive Director, Edmunds
BDA306
Building a Modern Data Warehouse:
Deep Dive on Amazon Redshift
AWS Analytics Portfolio
Collect, Store, Analyze
Amazon Kinesis Data Firehose
AWS Direct Connect
Amazon Snowball
Amazon Kinesis Data Analytics
Amazon Kinesis Data Streams
Amazon S3
Amazon Glacier
Amazon CloudSearch
Amazon RDS, Amazon Aurora
Amazon DynamoDB
Amazon ES
Amazon EMR
Amazon Redshift
Amazon QuickSight
AWS Database Migration Service
AWS Glue
Amazon Athena
Amazon AI
Amazon Redshift
10x faster at 1/10th the cost
Fast
Delivers fast results for all types of workloads
Cost-effective
No upfront costs, start small, and pay as you go
Secure
Audit everything; encrypt data end-to-end; extensive certification and compliance
Integrated
Integrated with Amazon S3 data lakes, AWS services, and third-party tools
Simple
Create and start using a data warehouse in minutes
Scalable
Gigabytes to petabytes to exabytes
Redshift Spectrum
Extend the data warehouse to your Amazon S3 data lake
Scale compute and storage separately
Join data across Amazon Redshift and S3
Exabyte-scale Amazon Redshift SQL queries against S3
Stable query performance and unlimited concurrency
Parquet, ORC, JSON, Grok, Avro, & CSV formats
Pay only for the amount of data scanned
[Diagram: Amazon Redshift data, Redshift Spectrum query engine, S3 data lake]
Amazon Redshift Architecture
Query: SELECT COUNT(*) FROM S3.EXT_TABLE GROUP BY …
JDBC/ODBC
Amazon Redshift
Redshift Spectrum: scale-out serverless compute (1 2 3 4 ... N)
AWS Glue Data Catalog
Amazon S3: exabyte-scale object storage
Thousands of Companies Run Mission Critical
Workloads on Amazon Redshift
The Forrester Wave™ is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave™ are trademarks of Forrester Research, Inc. The Forrester Wave™ is a graphical
representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor,
product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.
“Amazon Redshift has the largest adoption
of BDW in the cloud.”
“With more than 5,000 deployments, Amazon
Redshift has the largest data warehouse
deployments in the cloud – some over 10
petabytes in size.”
AWS received a score of 5/5 (the highest
score possible) in the: customer base,
market awareness, ability to execute, road
map, support, and partners criteria
Forrester Wave Big Data Warehouse Q2 2017
Amazon Redshift is Widely Available
Ireland
Frankfurt
London
Paris
Beijing
Mumbai
Seoul
Singapore
Sydney
Tokyo
Osaka
Sao Paulo
US East – N Virginia
US East – Ohio
US West – Oregon
US West – N California
AWS GovCloud (US)
Canada – Central, Montreal
Selected Amazon Redshift Partners
Data Integration, Business Intelligence, Systems Integrators
Recently Released Features
Customer Comments
“We have terabytes of event data coming from our websites and applications to Amazon S3 and then to Amazon Redshift in near real-time. Redshift is at the core of our operations and used by our marketing automation tools,” said Jarno Kartela, Head of Analytics and Chief Data Scientist, DNA (Finnish telecom service provider). “We can now run queries in half the time.”
“Redshift allows us to quickly spin up clusters and provide our data scientists with a fast and easy method to access data and generate insights,” said Bradley Todd, Liberty Mutual’s Technology Architect. “We saw a 9x reduction in month-end reporting time with Redshift DC2 nodes as compared to DC1.”
Dense Compute Nodes (DC2)
2x performance at the same price as DC1
3x more I/O with 30% better storage utilization than DC1
“Amazon Redshift’s new DC2 node is giving us a 100 percent performance increase, allowing us to provide faster insights for our retailers, more cost effectively, to drive incremental revenue.”
NVMe SSD DDR4 memory
Intel E5-2686 v4 (Broadwell)
Short Query Acceleration
Express Lane for Short Queries
• Short queries do not get stuck behind long running queries
• Higher throughput – Less variability
• Adapts to your workload
• Transparent – it just works!
Average Queue Time for Short Queries (<1sec)
Short Query Acceleration
Express Lane for Short Queries
How it works:
• Machine learning predicts the runtime of queries
• Short queries are routed to an express queue
• Resources are dynamically dedicated to short queries
• Enable it today from your AWS Management Console
• Coming soon: Dynamic timeout based on workload
[Diagram: Analytics and BI / Dashboard tools, Amazon Redshift, Machine Learning Classifier]
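The routing idea on this slide can be sketched in a few lines of Python. This is only an illustration, not how Amazon Redshift implements Short Query Acceleration: `QueryRouter`, `naive_predictor`, and the 1-second threshold are hypothetical names and values, and the predictor is a crude stand-in for the machine learning classifier.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque


@dataclass
class QueryRouter:
    """Toy router: sends queries predicted to be short to an express queue."""
    predict_runtime_s: Callable[[str], float]  # hypothetical stand-in for the ML classifier
    short_threshold_s: float = 1.0             # assumed cut-off, not a Redshift setting
    express: Deque[str] = field(default_factory=deque)
    standard: Deque[str] = field(default_factory=deque)

    def submit(self, sql: str) -> str:
        """Queue the query and return the name of the queue it landed in."""
        if self.predict_runtime_s(sql) < self.short_threshold_s:
            self.express.append(sql)
            return "express"
        self.standard.append(sql)
        return "standard"


def naive_predictor(sql: str) -> float:
    """Placeholder heuristic: treat joins as slow, everything else as fast."""
    return 30.0 if " join " in sql.lower() else 0.2


router = QueryRouter(predict_runtime_s=naive_predictor)
short_lane = router.submit("SELECT 1")
long_lane = router.submit("SELECT * FROM a JOIN b ON a.id = b.id")
```

Keeping the predictor as an injected function mirrors the slide's separation between the classifier and the queues: the routing rule stays the same while the prediction model can change.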
Result-set Caching
Subsecond repeat queries
How it works:
1. Queries go to the leader node
2. If the cache contains the query result, it is returned with no processing
3. If the query result is not in cache, it is executed, and the result is cached
[Diagram: Analytics and BI / Dashboard tools, Amazon Redshift, result cache (QUERY_ID, RESULT)]
Caching frees up the Amazon Redshift cluster, increasing performance for all queries
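The three steps can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not Redshift's actual cache: `ResultCache` and `fake_execute` are hypothetical, the cache key is simply the query text, and invalidation on data change is reduced to a manual `invalidate()` call.

```python
from typing import Callable, Dict, List, Tuple

Rows = List[Tuple]


class ResultCache:
    """Toy leader-node cache keyed by the exact query text."""

    def __init__(self, execute: Callable[[str], Rows]) -> None:
        self._execute = execute            # hypothetical function that runs the query on the cluster
        self._cache: Dict[str, Rows] = {}
        self.hits = 0
        self.misses = 0

    def query(self, sql: str) -> Rows:
        key = sql.strip()
        if key in self._cache:             # step 2: cached result, no processing
            self.hits += 1
            return self._cache[key]
        self.misses += 1                   # step 3: execute, then cache the result
        rows = self._execute(sql)
        self._cache[key] = rows
        return rows

    def invalidate(self) -> None:
        """Drop all entries, for example after the underlying data changes (assumed policy)."""
        self._cache.clear()


executed = []                              # records which queries actually reached the "cluster"


def fake_execute(sql: str) -> Rows:
    executed.append(sql)
    return [(42,)]


cache = ResultCache(fake_execute)
first = cache.query("SELECT COUNT(*) FROM t")
second = cache.query("SELECT COUNT(*) FROM t")   # served from the cache
```

The second call never reaches `fake_execute`, which is the effect the slide describes as freeing up the cluster for other queries.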
Result-set Caching
Subsecond repeat queries
• Amazon Redshift customers can now serve 35% more queries on average,
using the same compute resources
• Tens of thousands of compute hours are freed up daily to serve the
remaining queries and data ingestion
• Transparent – it just works!
Commit Enhancements
50% faster data commits for busy clusters
16% faster data ingestion and insertion
[Chart: Commit Duration Per Transaction for Busy Clusters; Total Commit Time by Month (Nov, Jan, Mar); ds2.8xlarge, cluster size 10 and up, us-west-2; clusters with more than 90 backups a day; series p99, p95, p90, p50, Linear (p99)]
Query Performance Improvements
• Faster hash joins
• Improvements to hash algorithm (Jan '18)
• Significant improvement in memory utilization (Feb '18)
• Cache line prefetching to improve join performance (Mar '18)
• Join-intensive workloads like TPC-H and TPC-DS show a performance
improvement ranging from 28% to 2x for several queries
• 64x reduction of memory footprint fleet wide for hash joins and aggregations.
Significant improvement to overall throughput
• Read and write queries can now hop WLM queues without restarting
Redshift Spectrum Enhancements
• Available in 14 AWS Regions
• Added support for processing scalar JSON and Ion file formats in S3, in addition to Parquet, ORC, Avro, CSV, Grok, RCFile, RegexSerDe, OpenCSV, SequenceFile, TextFile, and TSV
• Support for DATE data type
• Support for IAM role-chaining to assume cross-account roles
• Coming Soon: COPY from Parquet, ORC, RCFile, and Sequence files
Coming soon: Nested Data Support
• Analyze nested and semi-structured data in Amazon S3 with Spectrum
• Allows easy ETL of nested data into Amazon Redshift using CTAS
• Support for open file formats: Parquet, ORC, JSON, and Ion
• Uses dot notation to extend your existing SQL
Example: Find click frequency for links on "/home":
s3data.clickStream: <<
{ "session_time": "20171013 14:05:00",
  "clicks": [ {"page": "/home", "referrer": ""},
              {"page": "/products", "referrer": "/home"} ]
},
{ "session_time": "20171013 14:06:00",
  "clicks": [ {"page": "/contact", "referrer": "/home"} ]
} >>
SELECT c.page,
       COUNT(*) AS count
FROM s3data.clickStream s,
     s.clicks c
WHERE s.session_time > '2017-10-01 00:00:00'
  AND c.referrer = '/home'
GROUP BY c.page;
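To make the semantics of the dot-notation query concrete, here is a rough Python equivalent over the same two sample sessions. It is a sketch only: `clicks_from` is a hypothetical helper, and parsing `session_time` with the `%Y%m%d %H:%M:%S` format is an assumption about how the sample timestamps should be read.

```python
from collections import Counter
from datetime import datetime

# The two sample sessions from the slide.
click_stream = [
    {"session_time": "20171013 14:05:00",
     "clicks": [{"page": "/home", "referrer": ""},
                {"page": "/products", "referrer": "/home"}]},
    {"session_time": "20171013 14:06:00",
     "clicks": [{"page": "/contact", "referrer": "/home"}]},
]


def clicks_from(sessions, referrer, since):
    """Count clicks per page that came from `referrer` in sessions after `since`."""
    counts = Counter()
    for s in sessions:                                   # FROM s3data.clickStream s
        when = datetime.strptime(s["session_time"], "%Y%m%d %H:%M:%S")
        if when <= since:                                # WHERE s.session_time > ...
            continue
        for c in s["clicks"]:                            # , s.clicks c (unnest the array)
            if c["referrer"] == referrer:                # AND c.referrer = '/home'
                counts[c["page"]] += 1                   # GROUP BY c.page, COUNT(*)
    return dict(counts)


page_counts = clicks_from(click_stream, "/home", datetime(2017, 10, 1))
```

The inner loop plays the role of `s.clicks c` in the FROM clause: each session row is expanded into one row per click before filtering and grouping.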
Coming soon: Nested Data Support
Improve query performance by analyzing nested data
Orders
OrderID CustomerID OrderTime ShipMode
OrderItems
OrderID ItemID Quantity Price
5 23 10.00 12.50
8 32 1.00 5.60
5 16 1.00 1.99
8 24 5.00 26.50
OrdersWithItems
OrderID CustomerID OrderTime ShipMode OrderItems
Nested OrderItems column:
ItemID Quantity Price
23 10.00 12.50
16 1.00 1.99
32 1.00 5.60
24 5.00 26.50
To improve query performance, the new Orders table includes the OrdersWithItems as a nested column, eliminating join processing
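A small Python sketch of the nesting step, using only the four OrderItems rows shown on the slide. The helper `nest_items` and the dictionary layout are illustrative assumptions; the point is that once items are stored inside their order, a per-order calculation no longer needs a join.

```python
from collections import defaultdict

# (OrderID, ItemID, Quantity, Price) rows from the flat OrderItems table.
order_items = [
    (5, 23, 10.00, 12.50),
    (8, 32, 1.00, 5.60),
    (5, 16, 1.00, 1.99),
    (8, 24, 5.00, 26.50),
]


def nest_items(rows):
    """Group flat item rows under their OrderID, like a nested column."""
    nested = defaultdict(list)
    for order_id, item_id, quantity, price in rows:
        nested[order_id].append(
            {"ItemID": item_id, "Quantity": quantity, "Price": price}
        )
    return dict(nested)


orders_with_items = nest_items(order_items)

# With items nested inside the order, the order total is a local computation.
order_5_total = sum(i["Quantity"] * i["Price"] for i in orders_with_items[5])
```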
Amazon Redshift is Self-Healing
Machine-learning based prediction and
remediation of degraded disks, nodes and
network
Ensure overall cluster and query performance
becoming data-driven
who is edmunds?
greg rokita
Exec Director, Technology | M.S. in Computer Science | Founder
our history
1966: Edmunds is incorporated after being founded by Louis Arons and Michael Mayor, publishers of New Car Prices and Used Car Prices.
1988: Peter Steinlauf buys the company with AJA Holding Corp. Edmunds has 9 publications (7 automotive-related) selling for $3.95 each.
1995: Edmunds is first to publish car info on the internet. It evolves into the very first automotive information website, before any carmaker has one.
2007: Edmunds launches Dealer Ratings & Reviews.
2014: Edmunds introduces a proprietary messaging platform. Now car shoppers can text dealers directly. We call it CarCode.
2017: Edmunds unveils its new brand ecosystem and voice. Launches a re-imagined customer-centered site with new content. And forms new partnerships with a suite of new digital marketing services.
transformations
becoming
data driven
batch data access
Now - Data Warehouse
• Structured, Processed
• Schema-on-write
• Expensive for large volume of data
• Less agile
• Used by business
Future - Data Lake
• Structured, Semi-structured, raw
• Schema-on-read
• Designed for low cost storage
• Highly agile
• Used by data scientists
Trends
• Processing power and storage getting cheaper
• More use of data by Data Scientists
• Volume of unstructured data is increasing
evolution of data warehousing — 2008
• High development cost
• Cannot scale easily / costly
• No separation of processing and access
• Hard to find talent
evolution of data warehousing — 2011
Processing
• Scalable and inexpensive
• Talent issue somehow resolved
Data Access
• Expensive
• High maintenance
evolution of data warehousing — 2014
Wins
• Cost efficient data access
• Low maintenance
Challenges
• Data and storage tightly coupled
• No differential SLAs
• Low flexibility
Amazon
Redshift
IT-centric view of data
[Diagram: Website/Apps, Data Services/APIs, Third-Party Data, Data Warehouse, Analytics, EAS; flows labeled Feed Import, Load, Read, Track, Feed]
data-driven approach
[Diagram: Website/Apps, Analytics, APIs, Data Engineering, Third-Party Data]
blueprint
Use Case Layer: Reporting, Last Mile Processing, Visual Analytics, Transformations
Query/Processing Engine Layer: Engine 1 Cluster A, Engine 1 Cluster B, Engine 2 Cluster, Engine 3 Cluster
Data Layer: S3 Data (Parquet)
Edmunds: AWS Beta Customer
Amazon Redshift Copy from Parquet
wins
39
8xlarge instances
Performance for mission-critical workloads
Result-set Caching: 20% of queries under 1 sec
Commit Performance Optimizations: +50% speedup overall, 1000s of hourly jobs
Workflow Management: Manage priorities within workloads
Short Query Acceleration: Automated prioritization for ad-hoc queries
[Chart: commit performance optimizations, month over month]
data warehouse vs data engineering
Use Case Layer: Ad Hoc Analytics, Reporting, Real-Time Apps, Metadata Management, Data Science, Legacy Map-Reduce
Query/Processing Engine Layer: Redshift, Redshift, Spark (Streaming) Scala/Java, AWS Glue, PySpark, EMR
Data Layer: S3 Data (Parquet)
41
thank you
Submit Session Feedback
Please complete the session survey in the summit mobile app.
1. Tap the Schedule icon.
2. Select the session you attended.
3. Tap Session Evaluation to submit your feedback.
Get Started With Amazon Redshift
Find out more: https://aws.amazon.com/redshift/
Try Amazon Redshift
Get help with your Proof-of-Concept
Read Amazon Redshift blog articles: https://aws.amazon.com/redshift/blog-posts/

Contenu connexe

Tendances

Snowflake essentials
Snowflake essentialsSnowflake essentials
Snowflake essentialsqureshihamid
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeKent Graziano
 
Mapping Data Flows Training deck Q1 CY22
Mapping Data Flows Training deck Q1 CY22Mapping Data Flows Training deck Q1 CY22
Mapping Data Flows Training deck Q1 CY22Mark Kromer
 
Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesHudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesNishith Agarwal
 
글로벌 기업들의 효과적인 데이터 분석을 위한 Data Lake 구축 및 분석 사례 - 김준형 (AWS 솔루션즈 아키텍트)
글로벌 기업들의 효과적인 데이터 분석을 위한 Data Lake 구축 및 분석 사례 - 김준형 (AWS 솔루션즈 아키텍트)글로벌 기업들의 효과적인 데이터 분석을 위한 Data Lake 구축 및 분석 사례 - 김준형 (AWS 솔루션즈 아키텍트)
글로벌 기업들의 효과적인 데이터 분석을 위한 Data Lake 구축 및 분석 사례 - 김준형 (AWS 솔루션즈 아키텍트)Amazon Web Services Korea
 
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...Amazon Web Services
 
Amazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best PracticesAmazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best PracticesAmazon Web Services
 
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...Amazon Web Services
 
Best Practices for Building Your Data Lake on AWS
Best Practices for Building Your Data Lake on AWSBest Practices for Building Your Data Lake on AWS
Best Practices for Building Your Data Lake on AWSAmazon Web Services
 
Introduction to Azure Data Factory
Introduction to Azure Data FactoryIntroduction to Azure Data Factory
Introduction to Azure Data FactorySlava Kokaev
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake OverviewJames Serra
 
Demystifying Data Warehouse as a Service
Demystifying Data Warehouse as a ServiceDemystifying Data Warehouse as a Service
Demystifying Data Warehouse as a ServiceSnowflake Computing
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)James Serra
 
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...Amazon Web Services
 
Snowflake free trial_lab_guide
Snowflake free trial_lab_guideSnowflake free trial_lab_guide
Snowflake free trial_lab_guideslidedown1
 
データ分析基盤を支えるエンジニアリング
データ分析基盤を支えるエンジニアリングデータ分析基盤を支えるエンジニアリング
データ分析基盤を支えるエンジニアリングRecruit Lifestyle Co., Ltd.
 

Tendances (20)

Implementing a Data Lake
Implementing a Data LakeImplementing a Data Lake
Implementing a Data Lake
 
Snowflake essentials
Snowflake essentialsSnowflake essentials
Snowflake essentials
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on Snowflake
 
Mapping Data Flows Training deck Q1 CY22
Mapping Data Flows Training deck Q1 CY22Mapping Data Flows Training deck Q1 CY22
Mapping Data Flows Training deck Q1 CY22
 
Hudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilitiesHudi architecture, fundamentals and capabilities
Hudi architecture, fundamentals and capabilities
 
글로벌 기업들의 효과적인 데이터 분석을 위한 Data Lake 구축 및 분석 사례 - 김준형 (AWS 솔루션즈 아키텍트)
글로벌 기업들의 효과적인 데이터 분석을 위한 Data Lake 구축 및 분석 사례 - 김준형 (AWS 솔루션즈 아키텍트)글로벌 기업들의 효과적인 데이터 분석을 위한 Data Lake 구축 및 분석 사례 - 김준형 (AWS 솔루션즈 아키텍트)
글로벌 기업들의 효과적인 데이터 분석을 위한 Data Lake 구축 및 분석 사례 - 김준형 (AWS 솔루션즈 아키텍트)
 
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
 
Amazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best PracticesAmazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best Practices
 
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
 
Best Practices for Building Your Data Lake on AWS
Best Practices for Building Your Data Lake on AWSBest Practices for Building Your Data Lake on AWS
Best Practices for Building Your Data Lake on AWS
 
Introduction to Azure Data Factory
Introduction to Azure Data FactoryIntroduction to Azure Data Factory
Introduction to Azure Data Factory
 
Snowflake Datawarehouse Architecturing
Snowflake Datawarehouse ArchitecturingSnowflake Datawarehouse Architecturing
Snowflake Datawarehouse Architecturing
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake Overview
 
Demystifying Data Warehouse as a Service
Demystifying Data Warehouse as a ServiceDemystifying Data Warehouse as a Service
Demystifying Data Warehouse as a Service
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
 
Modern Data Platform on AWS
Modern Data Platform on AWSModern Data Platform on AWS
Modern Data Platform on AWS
 
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
 
Snowflake free trial_lab_guide
Snowflake free trial_lab_guideSnowflake free trial_lab_guide
Snowflake free trial_lab_guide
 
BDA311 Introduction to AWS Glue
BDA311 Introduction to AWS GlueBDA311 Introduction to AWS Glue
BDA311 Introduction to AWS Glue
 
データ分析基盤を支えるエンジニアリング
データ分析基盤を支えるエンジニアリングデータ分析基盤を支えるエンジニアリング
データ分析基盤を支えるエンジニアリング
 

Similaire à BDA306 Building a Modern Data Warehouse: Deep Dive on Amazon Redshift

Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...
Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...
Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...Amazon Web Services
 
Leadership Session: AWS Database and Analytics (DAT206-L) - AWS re:Invent 2018
Leadership Session: AWS Database and Analytics (DAT206-L) - AWS re:Invent 2018Leadership Session: AWS Database and Analytics (DAT206-L) - AWS re:Invent 2018
Leadership Session: AWS Database and Analytics (DAT206-L) - AWS re:Invent 2018Amazon Web Services
 
Building a Modern Data Warehouse - Deep Dive on Amazon Redshift
Building a Modern Data Warehouse - Deep Dive on Amazon RedshiftBuilding a Modern Data Warehouse - Deep Dive on Amazon Redshift
Building a Modern Data Warehouse - Deep Dive on Amazon RedshiftAmazon Web Services
 
Implementazione di una soluzione Data Lake.pdf
Implementazione di una soluzione Data Lake.pdfImplementazione di una soluzione Data Lake.pdf
Implementazione di una soluzione Data Lake.pdfAmazon Web Services
 
Building a Modern Data Warehouse: Deep Dive on Amazon Redshift - SRV337 - Chi...
Building a Modern Data Warehouse: Deep Dive on Amazon Redshift - SRV337 - Chi...Building a Modern Data Warehouse: Deep Dive on Amazon Redshift - SRV337 - Chi...
Building a Modern Data Warehouse: Deep Dive on Amazon Redshift - SRV337 - Chi...Amazon Web Services
 
Big Data@Scale_AWSPSSummit_Singapore
Big Data@Scale_AWSPSSummit_SingaporeBig Data@Scale_AWSPSSummit_Singapore
Big Data@Scale_AWSPSSummit_SingaporeAmazon Web Services
 
Analyze your Data Lake, Fast @ Any Scale - AWS Online Tech Talks
Analyze your Data Lake, Fast @ Any Scale - AWS Online Tech TalksAnalyze your Data Lake, Fast @ Any Scale - AWS Online Tech Talks
Analyze your Data Lake, Fast @ Any Scale - AWS Online Tech TalksAmazon Web Services
 
Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
 Citrix Moves Data to Amazon Redshift Fast with Matillion ETL Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
Citrix Moves Data to Amazon Redshift Fast with Matillion ETLAmazon Web Services
 
Migrating your traditional Data Warehouse to a Modern Data Lake
Migrating your traditional Data Warehouse to a Modern Data LakeMigrating your traditional Data Warehouse to a Modern Data Lake
Migrating your traditional Data Warehouse to a Modern Data LakeAmazon Web Services
 
Modern Cloud Data Warehousing ft. Intuit: Optimize Analytics Practices (ANT20...
Modern Cloud Data Warehousing ft. Intuit: Optimize Analytics Practices (ANT20...Modern Cloud Data Warehousing ft. Intuit: Optimize Analytics Practices (ANT20...
Modern Cloud Data Warehousing ft. Intuit: Optimize Analytics Practices (ANT20...Amazon Web Services
 
Database Freedom. Database migration approaches to get to the Cloud - Marcus ...
Database Freedom. Database migration approaches to get to the Cloud - Marcus ...Database Freedom. Database migration approaches to get to the Cloud - Marcus ...
Database Freedom. Database migration approaches to get to the Cloud - Marcus ...Amazon Web Services
 
Immersion Day - Como simplificar o acesso ao seu ambiente analítico
Immersion Day - Como simplificar o acesso ao seu ambiente analíticoImmersion Day - Como simplificar o acesso ao seu ambiente analítico
Immersion Day - Como simplificar o acesso ao seu ambiente analíticoAmazon Web Services LATAM
 
Modern Cloud Data Warehousing ft. Equinox Fitness Clubs: Optimize Analytics P...
Modern Cloud Data Warehousing ft. Equinox Fitness Clubs: Optimize Analytics P...Modern Cloud Data Warehousing ft. Equinox Fitness Clubs: Optimize Analytics P...
Modern Cloud Data Warehousing ft. Equinox Fitness Clubs: Optimize Analytics P...Amazon Web Services
 
AWS Data Lake: data analysis @ scale
AWS Data Lake: data analysis @ scaleAWS Data Lake: data analysis @ scale
AWS Data Lake: data analysis @ scaleAmazon Web Services
 
NetApp Cloud Data Services & AWS Empower Your Cloud Champions
NetApp Cloud Data Services & AWS Empower Your Cloud ChampionsNetApp Cloud Data Services & AWS Empower Your Cloud Champions
NetApp Cloud Data Services & AWS Empower Your Cloud ChampionsAmazon Web Services
 
BDA308 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA308 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA308 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA308 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceAmazon Web Services
 
ABD327_Migrating Your Traditional Data Warehouse to a Modern Data Lake
ABD327_Migrating Your Traditional Data Warehouse to a Modern Data LakeABD327_Migrating Your Traditional Data Warehouse to a Modern Data Lake
ABD327_Migrating Your Traditional Data Warehouse to a Modern Data LakeAmazon Web Services
 
Data Warehousing in the Cloud - AWS Summit Sydney
Data Warehousing in the Cloud - AWS Summit SydneyData Warehousing in the Cloud - AWS Summit Sydney
Data Warehousing in the Cloud - AWS Summit SydneyAmazon Web Services
 
What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...
What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...
What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...Amazon Web Services
 

Similaire à BDA306 Building a Modern Data Warehouse: Deep Dive on Amazon Redshift (20)

Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...
Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...
Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...
 
Leadership Session: AWS Database and Analytics (DAT206-L) - AWS re:Invent 2018
Leadership Session: AWS Database and Analytics (DAT206-L) - AWS re:Invent 2018Leadership Session: AWS Database and Analytics (DAT206-L) - AWS re:Invent 2018
Leadership Session: AWS Database and Analytics (DAT206-L) - AWS re:Invent 2018
 
Building a Modern Data Warehouse - Deep Dive on Amazon Redshift
Building a Modern Data Warehouse - Deep Dive on Amazon RedshiftBuilding a Modern Data Warehouse - Deep Dive on Amazon Redshift
Building a Modern Data Warehouse - Deep Dive on Amazon Redshift
 
Implementazione di una soluzione Data Lake.pdf
Implementazione di una soluzione Data Lake.pdfImplementazione di una soluzione Data Lake.pdf
Implementazione di una soluzione Data Lake.pdf
 
Building a Modern Data Warehouse: Deep Dive on Amazon Redshift - SRV337 - Chi...
Building a Modern Data Warehouse: Deep Dive on Amazon Redshift - SRV337 - Chi...Building a Modern Data Warehouse: Deep Dive on Amazon Redshift - SRV337 - Chi...
Building a Modern Data Warehouse: Deep Dive on Amazon Redshift - SRV337 - Chi...
 
Big Data@Scale_AWSPSSummit_Singapore
Big Data@Scale_AWSPSSummit_SingaporeBig Data@Scale_AWSPSSummit_Singapore
Big Data@Scale_AWSPSSummit_Singapore
 
Analyze your Data Lake, Fast @ Any Scale - AWS Online Tech Talks
Analyze your Data Lake, Fast @ Any Scale - AWS Online Tech TalksAnalyze your Data Lake, Fast @ Any Scale - AWS Online Tech Talks
Analyze your Data Lake, Fast @ Any Scale - AWS Online Tech Talks
 
Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
 Citrix Moves Data to Amazon Redshift Fast with Matillion ETL Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
Citrix Moves Data to Amazon Redshift Fast with Matillion ETL
 
Migrating your traditional Data Warehouse to a Modern Data Lake
Migrating your traditional Data Warehouse to a Modern Data LakeMigrating your traditional Data Warehouse to a Modern Data Lake
Migrating your traditional Data Warehouse to a Modern Data Lake
 
Modern Cloud Data Warehousing ft. Intuit: Optimize Analytics Practices (ANT20...
Modern Cloud Data Warehousing ft. Intuit: Optimize Analytics Practices (ANT20...Modern Cloud Data Warehousing ft. Intuit: Optimize Analytics Practices (ANT20...
Modern Cloud Data Warehousing ft. Intuit: Optimize Analytics Practices (ANT20...
 
Database Freedom. Database migration approaches to get to the Cloud - Marcus ...
Database Freedom. Database migration approaches to get to the Cloud - Marcus ...Database Freedom. Database migration approaches to get to the Cloud - Marcus ...
Database Freedom. Database migration approaches to get to the Cloud - Marcus ...
 
Immersion Day - Como simplificar o acesso ao seu ambiente analítico
BDA306 Building a Modern Data Warehouse: Deep Dive on Amazon Redshift

  • 1. BDA306: Building a Modern Data Warehouse: Deep Dive on Amazon Redshift. Michalis Petropoulos, Engineering Manager, Amazon Redshift; Greg Rokita, Executive Director, Edmunds. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 2. AWS Analytics Portfolio (Collect, Store, Analyze): Amazon Kinesis Data Firehose, AWS Direct Connect, Amazon Snowball, Amazon Kinesis Data Analytics, Amazon Kinesis Data Streams, Amazon S3, Amazon Glacier, Amazon CloudSearch, Amazon RDS, Amazon Aurora, Amazon DynamoDB, Amazon ES, Amazon EMR, Amazon Redshift, Amazon QuickSight, AWS Database Migration Service, AWS Glue, Amazon Athena, Amazon AI
  • 3. Amazon Redshift: 10x faster at 1/10th the cost
      • Fast: delivers fast results for all types of workloads
      • Cost-effective: no upfront costs, start small, and pay as you go
      • Integrated: integrated with Amazon S3 data lakes, AWS services, and third-party tools
      • Secure: audit everything; encrypt data end-to-end; extensive certification and compliance
      • Simple: create and start using a data warehouse in minutes
      • Scalable: gigabytes to petabytes to exabytes
  • 4. Redshift Spectrum: Extend the data warehouse to your Amazon S3 data lake
      • Scale compute and storage separately
      • Join data across Amazon Redshift and S3
      • Exabyte-scale Amazon Redshift SQL queries against S3
      • Stable query performance and unlimited concurrency
      • Parquet, ORC, JSON, Grok, Avro, & CSV formats
      • Pay only for the amount of data scanned
      Diagram: Amazon Redshift data, S3 data lake, Redshift Spectrum query engine
  • 5. Amazon Redshift Architecture
      • JDBC/ODBC clients connect to Amazon Redshift
      • Redshift Spectrum: scale-out serverless compute (nodes 1, 2, 3, 4, ... N)
      • AWS Glue Data Catalog
      • Amazon S3: exabyte-scale object storage
      Query: SELECT COUNT(*) FROM S3.EXT_TABLE GROUP BY …
  • 6. Thousands of Companies Run Mission Critical Workloads on Amazon Redshift
  • 7. Forrester Wave Big Data Warehouse Q2 2017
      • “Amazon Redshift has the largest adoption of BDW in the cloud.”
      • “With more than 5,000 deployments, Amazon Redshift has the largest data warehouse deployments in the cloud – some over 10 petabytes in size.”
      • AWS received a score of 5/5 (the highest score possible) in the customer base, market awareness, ability to execute, road map, support, and partners criteria.
      • The Forrester Wave™ is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave™ are trademarks of Forrester Research, Inc. The Forrester Wave™ is a graphical representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.
  • 8. Amazon Redshift is Widely Available: Ireland, Frankfurt, London, Paris, Beijing, Mumbai, Seoul, Singapore, Sydney, Tokyo, Osaka, Sao Paulo, US East (N. Virginia), US East (Ohio), US West (Oregon), US West (N. California), AWS GovCloud (US), Canada (Central, Montreal)
  • 9. Selected Amazon Redshift Partners: Data Integration, Systems Integrators, Business Intelligence
  • 10. Recently Released Features
  • 11. Customer Comments
      • DNA (Finnish telecom service provider): “We have terabytes of event data coming from our websites and applications to Amazon S3 and then to Amazon Redshift in near real-time. Redshift is at the core of our operations and used by our marketing automation tools,” said Jarno Kartela, Head of Analytics and Chief Data Scientist, DNA. “We can now run queries in half the time.”
      • Liberty Mutual: “Redshift allows us to quickly spin up clusters and provide our data scientists with a fast and easy method to access data and generate insights,” said Bradley Todd, Liberty Mutual’s Technology Architect. “We saw a 9x reduction in month-end reporting time with Redshift DC2 nodes as compared to DC1.”
  • 12. Dense Compute Nodes (DC2)
      • 2x performance at the same price as DC1
      • 3x more I/O with 30% better storage utilization than DC1
      • Hardware: NVMe SSD, DDR4 memory, Intel E5-2686 v4 (Broadwell)
      • “Amazon Redshift’s new DC2 node is giving us a 100 percent performance increase, allowing us to provide faster insights for our retailers, more cost effectively, to drive incremental revenue.”
  • 13. Short Query Acceleration: Express Lane for Short Queries
      • Short queries do not get stuck behind long-running queries
      • Higher throughput, less variability
      • Adapts to your workload
      • Transparent: it just works!
      Chart: Average Queue Time for Short Queries (<1 sec)
  • 14. Short Query Acceleration: Express Lane for Short Queries. How it works:
      • Machine learning predicts the runtime of queries
      • Short queries are routed to an express queue
      • Resources are dynamically dedicated to short queries
      • Enable it today from your AWS Management Console
      • Coming soon: Dynamic timeout based on workload
      Diagram: Analytics and BI / Dashboard tools, Amazon Redshift, Machine Learning Classifier
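
The routing idea on slide 14 can be illustrated with a minimal Python sketch. It is not Redshift's implementation: the `predicted_runtime_s` field stands in for the machine learning classifier's output, and the one-second cutoff and the names `Query` and `route` are assumptions made up for this illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Query:
    sql: str
    # Hypothetical stand-in for the runtime predicted by an ML classifier.
    predicted_runtime_s: float


# Assumed cutoff for what counts as a "short" query in this sketch.
SHORT_QUERY_THRESHOLD_S = 1.0


def route(queries, threshold_s=SHORT_QUERY_THRESHOLD_S):
    """Split queries into an express queue and a default queue."""
    express, default = deque(), deque()
    for q in queries:
        if q.predicted_runtime_s < threshold_s:
            express.append(q)   # short queries skip ahead of long ones
        else:
            default.append(q)   # long queries wait in the regular queue
    return express, default


workload = [
    Query("monthly_report", 120.0),
    Query("dashboard_tile", 0.4),
    Query("health_check", 0.05),
]
express, default = route(workload)
print([q.sql for q in express])
print([q.sql for q in default])
```

With this split, the two short queries never queue behind the long report, which is the effect the slides describe as an express lane.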
  • 15. Result-set Caching: Subsecond repeat queries. How it works:
      1. Queries from analytics and BI / dashboard tools go to the leader node
      2. If the cache contains the query result, it is returned with no processing
      3. If the query result is not in cache, it is executed, and the result is cached
      Diagram: results cache mapping QUERY_ID to RESULT
      Caching frees up the Amazon Redshift cluster, increasing performance for all queries
  • 16. Result-set Caching: Subsecond repeat queries
      • Amazon Redshift customers can now serve 35% more queries on average, using the same compute resources
      • Tens of thousands of compute hours are freed up daily to serve the remaining queries and data ingestion
      • Transparent: it just works!
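
The three result-set caching steps on slide 15 can be sketched in Python as a lookup in front of the executor. This is only an illustration under stated assumptions: the `LeaderNode` class, the use of whitespace-normalized query text as the cache key, and the `execute` callback are all hypothetical, and the sketch does not model invalidating cached results when the underlying data changes.

```python
class LeaderNode:
    """Toy leader node that caches query results (illustrative only)."""

    def __init__(self, execute):
        self._execute = execute   # callable that actually runs a query
        self._cache = {}          # cache key -> result
        self.executions = 0       # how many queries reached the cluster

    @staticmethod
    def _key(sql):
        # Assumption: the normalized query text identifies a repeat query.
        return " ".join(sql.split())

    def run(self, sql):
        key = self._key(sql)
        if key in self._cache:
            # Step 2: cached result is returned with no processing.
            return self._cache[key]
        # Step 3: execute on the cluster, then cache the result.
        self.executions += 1
        result = self._execute(sql)
        self._cache[key] = result
        return result


def fake_cluster(sql):
    # Hypothetical executor used only to make the sketch runnable.
    return f"rows for: {' '.join(sql.split())}"


leader = LeaderNode(fake_cluster)
first = leader.run("SELECT COUNT(*) FROM sales")
repeat = leader.run("SELECT  COUNT(*)  FROM sales")  # same query, extra spaces
print(leader.executions)
```

Only the first call reaches the executor; the repeat is answered from the cache, which is how caching frees cluster resources for other queries.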
  • 17. Commit Enhancements
      • 50% faster data commits for busy clusters
      • 16% faster data ingestion and insertion
      Chart: Commit Duration Per Transaction for Busy Clusters, Total Commit Time by Month (Nov, Jan, Mar); ds2.8xlarge, cluster size 10 and up, us-west-2; clusters with more than 90 backups a day; series p99, p95, p90, p50, and a linear trend of p99
  • 18. Query Performance Improvements
      • Faster hash joins
          • Improvements to hash algorithm (Jan '18)
          • Significant improvement in memory utilization (Feb '18)
          • Cache line prefetching to improve join performance (Mar '18)
      • Join-intensive workloads like TPC-H and TPC-DS show a performance improvement ranging from 28% to 2x for several queries
      • 64x reduction of memory footprint fleet-wide for hash joins and aggregations, with a significant improvement to overall throughput
      • Read and write queries can now hop WLM queues without restarting
  • 19. Redshift Spectrum Enhancements
      • Available in 14 AWS Regions
      • Added support for processing scalar JSON and ION file formats in S3, in addition to Parquet, ORC, Avro, CSV, Grok, RCFile, RegexSerDe, OpenCSV, SequenceFile, TextFile, and TSV
      • Support for DATE data type
      • Support for IAM role-chaining to assume cross-account roles
      • Coming soon: COPY from Parquet, ORC, RCFile, and Sequence files
  • 20. Coming soon: Nested Data Support
      • Analyze nested and semi-structured data in Amazon S3 with Spectrum
      • Allows easy ETL of nested data into Amazon Redshift using CTAS
      • Support for open file formats: Parquet, ORC, JSON, and Ion
      • Uses dot notation to extend your existing SQL
      Example: find click frequency for links on "/home".
      s3data.clickStream: << { "session_time": "20171013 14:05:00", "clicks": [ {"page": "/home", "referrer": ""}, {"page": "/products", "referrer": "/home"} ] }, { "session_time": "20171013 14:06:00", "clicks": [ {"page": "/contact", "referrer": "/home"} ] } >>
      SELECT c.page, COUNT(*) AS count FROM s3data.clickStream s, s.clicks c WHERE s.session_time > '2017-10-01 00:00:00' AND c.referrer = '/home' GROUP BY c.page;
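
To make the semantics of the slide 20 query concrete, the following Python sketch evaluates the same logic over the two sample sessions from the slide: unnest each session's `clicks`, keep clicks whose referrer is "/home", and count by page. The function name `click_frequency` and the timestamp parsing format are assumptions for this illustration; it says nothing about how Redshift Spectrum executes the query.

```python
from collections import Counter
from datetime import datetime

# Sample data copied from the slide's s3data.clickStream example.
click_stream = [
    {
        "session_time": "20171013 14:05:00",
        "clicks": [
            {"page": "/home", "referrer": ""},
            {"page": "/products", "referrer": "/home"},
        ],
    },
    {
        "session_time": "20171013 14:06:00",
        "clicks": [{"page": "/contact", "referrer": "/home"}],
    },
]


def click_frequency(sessions, referrer, since):
    """Count clicks per page for a given referrer in sessions after `since`."""
    counts = Counter()
    for session in sessions:
        # Assumed timestamp layout, matching the sample data above.
        session_time = datetime.strptime(session["session_time"], "%Y%m%d %H:%M:%S")
        if session_time <= since:
            continue
        # "FROM clickStream s, s.clicks c" unnests the nested clicks array.
        for click in session["clicks"]:
            if click["referrer"] == referrer:
                counts[click["page"]] += 1
    return dict(counts)


result = click_frequency(click_stream, "/home", datetime(2017, 10, 1))
print(result)  # {'/products': 1, '/contact': 1}
```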
  • 21. Coming soon: Nested Data Support. Improve query performance by analyzing nested data

      Orders
      OrderID  CustomerID  OrderTime  ShipMode
      5        23          10.00      12.50
      8        32          1.00       5.60

      OrderItems
      OrderID  ItemID  Quantity  Price
      5        23      10.00     12.50
      8        32      1.00      5.60
      5        16      1.00      1.99
      8        24      5.00      26.50

      OrdersWithItems (OrderItems as a nested column)
      OrderID  CustomerID  OrderTime  ShipMode  OrderItems (ItemID, Quantity, Price)
      5        23          10.00      12.50     (23, 10.00, 12.50), (16, 1.00, 1.99)
      8        32          1.00       5.60      (32, 1.00, 5.60), (24, 5.00, 26.50)

      To improve query performance, the new Orders table includes the OrdersWithItems as a nested column, eliminating join processing
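
The point of slide 21, that nesting the items inside each order removes the join at query time, can be sketched in Python. The item rows are taken from the slide's OrderItems table; the helper names `nest` and `order_totals`, the dictionary layout, and the "total per order" query are assumptions chosen for the illustration.

```python
# Rows from the slide's OrderItems table.
order_items = [
    {"order_id": 5, "item_id": 23, "quantity": 10.00, "price": 12.50},
    {"order_id": 8, "item_id": 32, "quantity": 1.00, "price": 5.60},
    {"order_id": 5, "item_id": 16, "quantity": 1.00, "price": 1.99},
    {"order_id": 8, "item_id": 24, "quantity": 5.00, "price": 26.50},
]
orders = [{"order_id": 5}, {"order_id": 8}]


def nest(orders, order_items):
    """One-time ETL step: attach each order's items as a nested list."""
    items_by_order = {}
    for item in order_items:
        nested_item = {k: v for k, v in item.items() if k != "order_id"}
        items_by_order.setdefault(item["order_id"], []).append(nested_item)
    return [{**o, "items": items_by_order.get(o["order_id"], [])} for o in orders]


def order_totals(nested_orders):
    """Query over the nested table: no join, just a walk over each order's items."""
    return {
        o["order_id"]: sum(i["quantity"] * i["price"] for i in o["items"])
        for o in nested_orders
    }


orders_with_items = nest(orders, order_items)
totals = order_totals(orders_with_items)
print(totals)
```

The grouping work happens once in `nest`; every later query reads each order together with its items instead of matching rows across two tables.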
  • 22. Amazon Redshift is Self-Healing
      • Machine-learning based prediction and remediation of degraded disks, nodes, and network
      • Ensure overall cluster and query performance
  • 25. Greg Rokita: Exec Director, Technology | M.S. in Computer Science | Founder
  • 26. our history
      • 1966: Edmunds is incorporated after being founded by Louis Arons and Michael Mayor, publishers of New Car Prices and Used Car Prices.
      • 1988: Peter Steinlauf buys the company with AJA Holding Corp. Edmunds has 9 publications (7 automotive-related) selling for $3.95 each.
      • 1995: Edmunds is first to publish car info on the internet. It evolves into the very first automotive information website, before any carmaker has one.
      • 2007: Edmunds launches Dealer Ratings & Reviews.
      • 2014: Edmunds introduces a proprietary messaging platform. Now car shoppers can text dealers directly. We call it CarCode.
      • 2017: Edmunds unveils its new brand ecosystem and voice, launches a re-imagined customer-centered site with new content, and forms new partnerships with a suite of new digital marketing services.
  • 27. transformations
  • 29. batch data access
      Now: Data Warehouse
          • Structured, processed
          • Schema-on-write
          • Expensive for large volume of data
          • Less agile
          • Used by business
      Future: Data Lake
          • Structured, semi-structured, raw
          • Schema-on-read
          • Designed for low-cost storage
          • Highly agile
          • Used by data scientists
      Trends
          • Processing power and storage getting cheaper
          • More use of data by data scientists
          • Volume of unstructured data is increasing
  • 30. evolution of data warehousing: 2008
      • High development cost
      • Cannot scale easily / costly
      • No separation of processing and access
      • Hard to find talent
  • 31. evolution of data warehousing: 2011
      Processing
          • Can scale and inexpensive
          • Talent issue somewhat resolved
      Data Access
          • Expensive
          • High maintenance
  • 32. evolution of data warehousing: 2014 (Amazon Redshift)
      Wins
          • Cost-efficient data access
          • Low maintenance
      Challenges
          • Data and storage tightly coupled
          • No differential SLAs
          • Low flexibility
  • 33. IT-centric view of data. Diagram components: Website/Apps, Data Services/APIs, Third-Party Data, Data Warehouse, Third-Party Data, Analytics, EAS. Flow labels: Feed, Import, Load, Load, Read, Track, Feed
  • 34. data-driven approach. Diagram components: Website/Apps, Analytics, APIs, Data Engineering, Third-Party Data
  • 35. blueprint
      • Use Case Layer: Reporting, Last Mile Processing, Visual Analytics, Transformations
      • Query/Processing Engine Layer: Engine 1 Cluster A, Engine 1 Cluster B, Engine 2 Cluster, Engine 3 Cluster
      • Data Layer: S3 Data (Parquet)
  • 36. Edmunds: AWS Beta Customer. Amazon Redshift Copy from Parquet
  • 37. commit performance optimizations, month over month
  • 38. wins (Amazon Redshift)
      • 39 8xlarge instances: performance for mission-critical workloads
      • Result-set Caching: 20% of queries under 1 sec
      • Commit Performance Optimizations: +50% speedup overall, 1000s of hourly jobs
      • Workload Management: manage priorities within workloads
      • Short Query Acceleration: automated prioritization for ad-hoc queries
  • 39. data warehouse vs data engineering
      • Use Case Layer: Ad Hoc Analytics, Reporting, Real-Time Apps, Metadata Management, Data Science, Legacy Map-Reduce
      • Query/Processing Engine Layer: Redshift, Redshift, Spark (Streaming) Scala/Java, AWS Glue, PySpark, EMR
      • Data Layer: S3 Data (Parquet)
  • 41. thank you
  • 42. Please complete the session survey in the summit mobile app.
  • 43. Submit Session Feedback
      1. Tap the Schedule icon.
      2. Select the session you attended.
      3. Tap Session Evaluation to submit your feedback.
  • 44. Get Started With Amazon Redshift
      • Find out more: https://aws.amazon.com/redshift/
      • Try Amazon Redshift
      • Get help with your Proof-of-Concept
      • Read Amazon Redshift blog articles: https://aws.amazon.com/redshift/blog-posts/