© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Best practices for Running Spark jobs on Amazon EMR with Spot Instances
DAT303
Ran Sheinberg, Specialist SA – EC2 Spot, Amazon Web Services
Daniel Haviv, Specialist SA - Analytics, Amazon Web Services
Eyal Lanxner, Chief Technology Officer, Feedvisor
Anatoli Atamanov, VP Operations & IT, Feedvisor
Agenda
• Amazon EC2 Spot Instances
• Amazon EMR recap
• Spark best practices
• EMR Instance Fleets with Spot Instances
• Customer story - Feedvisor
Amazon EC2 purchase options
EC2 Spot pools - instance type flexibility
Each instance family, each instance size, and each Availability Zone (61) in every region (20) is a separate Spot pool.
Example: C4 Spot prices per Availability Zone, compared with On-Demand
Size   1a      1b      1c      On-Demand
8XL    $0.50   $0.27   $0.29   $1.76
4XL    $0.21   $0.30   $0.16   $0.88
2XL    $0.08   $0.07   $0.08   $0.44
XL     $0.04   $0.05   $0.04   $0.22
L      $0.01   $0.01   $0.04   $0.11
Instance families shown: R5, M4, C5, I3, M5d, R4, D2, C4, R5d
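To make the pool arithmetic concrete, here is a minimal Scala sketch. It enumerates the Spot pools for one instance family across the three Availability Zones labelled on the slide, and computes how much a Spot price saves against the On-Demand price. The names `SpotPool`, `pools`, and `savingsPercent` are illustrative and not part of any AWS API; the prices are the C4 4XL figures from the slide.

```scala
object SpotPoolExample {
  // One pool per (instance type, Availability Zone) pair.
  final case class SpotPool(instanceType: String, zone: String)

  // Enumerate every pool for one instance family (hypothetical helper).
  def pools(family: String, sizes: Seq[String], zones: Seq[String]): Seq[SpotPool] =
    for {
      size <- sizes
      zone <- zones
    } yield SpotPool(s"$family.$size", zone)

  // Percentage saved by paying the Spot price instead of On-Demand, rounded.
  def savingsPercent(spot: Double, onDemand: Double): Int =
    math.round((onDemand - spot) / onDemand * 100).toInt

  def main(args: Array[String]): Unit = {
    val sizes = Seq("large", "xlarge", "2xlarge", "4xlarge", "8xlarge")
    val zones = Seq("1a", "1b", "1c")
    val c4Pools = pools("c4", sizes, zones)
    println(s"C4 pools across three zones: ${c4Pools.size}")

    // C4 4XL on the slide: On-Demand $0.88, Spot prices $0.30, $0.16, $0.21.
    val onDemand = 0.88
    Seq(0.30, 0.16, 0.21).foreach { spot =>
      println(s"Spot price $spot saves ${savingsPercent(spot, onDemand)} percent")
    }
  }
}
```

Five sizes across three zones already give fifteen separate pools for a single family, which is why a workload that accepts several families and sizes has many more pools to draw capacity from.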
Running applications at extreme scale
single HPC cluster of 1 million vCPUs
Amazon EC2 Spot integrations
• Auto Scaling
• AWS Batch
• Amazon EMR
• AWS Data Pipeline
• Amazon Elastic Container Service
• AWS CloudFormation
• Amazon Elastic Container Service for Kubernetes
• AWS Thinkbox Deadline
Spot is easy
• No bidding
• Minimal interruptions (<5%)
• Low, predictable prices
Pricing Model
New smooth pricing
November 2017
Spot Instance Advisor
https://aws.amazon.com/ec2/spot/instance-advisor/
Main takeaways for Spot Instances
• Build instance-type agnostic workloads
• No bidding, no price spikes
• New instance families generally have higher interruption rates; check the Spot Instance Advisor
• Architect for fault-tolerance to be Spot ready
Amazon EMR: Enterprise-grade Hadoop & Spark
Scale to any size
• Scale compute (EMR) and storage (S3) independently
• Store and process any amount of data, from PBs to EBs
• Provision one, hundreds, or thousands of nodes
• Auto-scaling
[Diagram: Data Lake on AWS, Amazon EMR]
Enterprise-grade Hadoop & Spark
Highly available and durable
• S3 is designed to deliver 99.999999999% durability
• EMR monitors your cluster, replacing poorly performing and failed nodes and restarting services
• Monitor your clusters using Amazon CloudWatch
• Built-in console to view job history & browse logs
• EMR has on-cluster HDFS for data persistence
Enterprise-grade Hadoop & Spark
Highly secure
• Encryption of data at rest and in transit
• ML-powered security with Amazon Macie
• Network isolation using Amazon VPC
• Access and permissions control with IAM policies
• Log and audit activity with AWS CloudTrail
• Microsoft AD integration with Kerberos support
Amazon EMR node types
Master node: The node that manages the cluster. The master node tracks the status of tasks and monitors the health of the cluster.
Core nodes: Nodes that run tasks and store data in the Hadoop Distributed File System (HDFS) on your cluster.
Task nodes: Nodes that only run tasks and do not store data in HDFS. Task nodes are optional.
[Diagram: an Amazon EMR cluster with a master instance fleet, a core instance fleet (HDFS), and a task instance fleet]
Parallelization
[Two charts of the number of parallelized nodes over time: job running time 10 hours vs. job running time 1 hour]
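The point of this slide is presumably that the same job can finish ten times sooner on ten times as many nodes for roughly the same compute bill. A minimal sketch of that arithmetic follows; it assumes the job scales linearly and that cost is proportional to instance time. The node counts are hypothetical, since the slide only gives the two running times.

```scala
object Parallelization {
  // Instance-hours consumed by a job: nodes multiplied by wall-clock hours.
  def nodeHours(nodes: Int, hours: Int): Int = nodes * hours

  def main(args: Array[String]): Unit = {
    // Hypothetical node counts; only the running times come from the slide.
    val narrow = nodeHours(nodes = 10, hours = 10)
    val wide = nodeHours(nodes = 100, hours = 1)
    println(s"10 nodes for 10 hours: $narrow node-hours")
    println(s"100 nodes for 1 hour: $wide node-hours")
  }
}
```

Real jobs rarely scale perfectly, so the wide cluster usually costs somewhat more than this idealized figure suggests.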
Breaking the monolith
Reducing shuffle
10x longer
Reducing shuffle | Explode → Group
user_id   visits_array
2121123   ["28/01/2018", "29/01/2018", "01/01/2019"]
2323434   ["01/11/2017", "01/12/2017"]
9959594   ["01/01/2017", "02/01/2017", "03/01/2017", "04/01/2017", "05/01/2017", "06/01/2017"]
Reducing shuffle | Explode → Group
# user_id visits_array
1 2121123 28/01/2018
2 2121123 29/01/2018
3 2121123 01/01/2019
4 2323434 01/11/2017
5 2323434 01/12/2017
6 9959594 01/01/2017
7 9959594 02/01/2017
8 9959594 03/01/2017
9 9959594 04/01/2017
10 9959594 05/01/2017
11 9959594 06/01/2017
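The explode-then-group pattern shown in these two tables might look like the following in Spark. This is a sketch under stated assumptions: it builds the sample rows in memory, takes a `SparkSession` as a parameter, and uses the column names from the slide; the object name, the method name, and the `visit` alias are made up for illustration. Exploding turns 3 rows into 11, and the `groupBy` then has to shuffle all of them to count visits per user.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{count, explode}

object ExplodeThenGroup {
  // Returns visits per user via explode followed by groupBy (the costly path).
  def visitCounts(spark: SparkSession): Map[Long, Long] = {
    import spark.implicits._

    // Sample rows copied from the slide.
    val visits = Seq(
      (2121123L, Seq("28/01/2018", "29/01/2018", "01/01/2019")),
      (2323434L, Seq("01/11/2017", "01/12/2017")),
      (9959594L, Seq("01/01/2017", "02/01/2017", "03/01/2017",
                     "04/01/2017", "05/01/2017", "06/01/2017"))
    ).toDF("user_id", "visits_array")

    visits
      .select($"user_id", explode($"visits_array").as("visit")) // one row per visit
      .groupBy("user_id")                                       // triggers a shuffle
      .agg(count("visit").as("visits"))
      .as[(Long, Long)]
      .collect()
      .toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("explode-then-group")
      .master("local[*]") // local mode, for illustration only
      .getOrCreate()
    visitCounts(spark).foreach { case (user, n) => println(s"$user: $n visits") }
    spark.stop()
  }
}
```

Running it requires the spark-sql library on the classpath. The same counts can be obtained without exploding at all by working on the array column directly.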
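Reducing shuffle | Explode → Group
// Assumes a table `tab` with an array column `arr`.
// Count the visits on the array itself, with no explode.
val countVisitsUDF = (array: Seq[String]) => {
  array.length
}
// Register the function so it can be called from Spark SQL.
spark.udf.register("countVisits", countVisitsUDF)
spark.sql("""SELECT user_id, countVisits(arr)
             FROM tab""").show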
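Reducing shuffle | Explode → Group
// Same count with the built-in higher-order function aggregate (Spark 2.4+) instead of a UDF.
// The query spans several lines, so it needs a triple-quoted string.
spark.sql("""SELECT user_id,
             sum(aggregate(arr, 0, (acc, x) -> acc + 1)) summary
             FROM tab
             GROUP BY user_id""").show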
Sizing Executors
Sizing Executors Example
spark-submit --executor-cores 15 --executor-memory 90G
Cores Memory (GB)
15 90
2 12
3 18
4 24
5 30
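Every row of the table keeps the same ratio of 6 GB of memory per core (90 / 15). The sketch below, offered as an interpretation rather than the presenters' own calculation, shows why smaller executors at that ratio fit more instance types. The names `Executor`, `Node`, and `executorsPerNode` are hypothetical. The node sizes are the published vCPU and memory figures for m4.4xlarge and r4.4xlarge, treating GiB as GB, and the calculation ignores memory reserved for the operating system, YARN, and Spark overhead.

```scala
object ExecutorSizing {
  // Ratio implied by the table: 90 GB / 15 cores = 6 GB per core.
  val gbPerCore: Int = 90 / 15

  final case class Executor(cores: Int, memoryGb: Int)
  final case class Node(name: String, vcpus: Int, memoryGb: Int)

  // Executor with the table's memory-per-core ratio.
  def executorFor(cores: Int): Executor = Executor(cores, cores * gbPerCore)

  // How many executors fit on a node, limited by whichever resource runs out first.
  // Simplification: no allowance for OS, YARN, or Spark memory overhead.
  def executorsPerNode(node: Node, e: Executor): Int =
    math.min(node.vcpus / e.cores, node.memoryGb / e.memoryGb)

  def main(args: Array[String]): Unit = {
    val nodes = Seq(Node("m4.4xlarge", 16, 64), Node("r4.4xlarge", 16, 122))
    for (cores <- Seq(15, 5, 2); node <- nodes) {
      val e = executorFor(cores)
      println(s"${e.cores} cores / ${e.memoryGb} GB on ${node.name}: " +
        s"${executorsPerNode(node, e)} executor(s)")
    }
  }
}
```

Under these assumptions a 15-core, 90 GB executor fits once on an r4.4xlarge and not at all on an m4.4xlarge, while a 2-core, 12 GB executor fits on both, which keeps the job flexible across the instance types in a fleet.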
EMR Instance Fleets
• Specify target capacity as a mix of instance types and families (up to 5)
• Amazon EMR will attempt to fulfill capacity from the most suitable pools
• Amazon EMR automatically replaces interrupted or failed instances with one of the instance types that you specified
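As a rough illustration of why listing several instance types helps, the toy model below fills a fleet's target capacity from whichever of the listed types currently have capacity. It is not the EMR API and does not reproduce EMR's selection logic, which the slides do not describe. `InstanceFleet`, `fulfill`, and the availability numbers are all hypothetical, and the in-order fill is a deliberately simple stand-in.

```scala
object InstanceFleetSketch {
  // Limit on instance types per fleet, as stated on the slide.
  val MaxInstanceTypes = 5

  final case class InstanceFleet(targetCapacity: Int, instanceTypes: Seq[String]) {
    require(targetCapacity > 0, "target capacity must be positive")
    require(instanceTypes.nonEmpty && instanceTypes.size <= MaxInstanceTypes,
      s"specify between 1 and $MaxInstanceTypes instance types")
  }

  // Toy allocation: walk the listed types in order and take what is available
  // until the target is met. EMR's real choice of pools is more sophisticated.
  def fulfill(fleet: InstanceFleet, available: Map[String, Int]): Map[String, Int] = {
    val (allocation, _) =
      fleet.instanceTypes.foldLeft((Map.empty[String, Int], fleet.targetCapacity)) {
        case ((acc, remaining), instanceType) =>
          val take = math.min(remaining, available.getOrElse(instanceType, 0))
          if (take > 0) (acc + (instanceType -> take), remaining - take)
          else (acc, remaining)
      }
    allocation
  }

  def main(args: Array[String]): Unit = {
    val fleet = InstanceFleet(10, Seq("m4.4xlarge", "r4.4xlarge", "r3.4xlarge"))
    // Hypothetical availability: the first pool can only supply 4 instances.
    val allocation = fulfill(fleet, Map("m4.4xlarge" -> 4, "r4.4xlarge" -> 20))
    println(allocation)
  }
}
```

With a single instance type the fleet above would stall at 4 of 10 instances; with three types listed, the remaining 6 come from another pool.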
EMR Instance Fleets: Choosing instances
Instance Fleets: Spot Block
• Spot Instances with a specified uninterrupted duration (1-6 hours)
• Ideal for jobs that take a known time to complete and must meet an SLA
• Lower discount than regular Spot Instances
Running Spark Jobs on Amazon EMR With Spot Instances
Eyal Lanxner, CTO and Co-Founder
Anatoli Atamanov, VP Operations & IT
Technology: Big Data, Machine Learning
Solutions: Brand Optimization, Marketplace Intelligence, Pricing Optimization, Advertising Optimization
Customers: Sellers, Retailers, Brands
Problem Complexity (Example):
Pricing Optimization
Considerations / Speed
• Cost structure
• Marketplace fees
• Product attributes & rating
• Product goals & constraints
• Competing listings & products
• Competitive pricing
• Marketplace ranking
• Orders & sales
• …
Data Sizing
Datalake on S3: 927 TB
2.5 TB
2.5 TB
2.5 TB
Old EMR Architecture
Application Platform Scheduler
24/7 running cluster:
MASTER x1 (On-Demand):
- m4.xlarge
CORE x10 (On-Demand):
- m4.4xlarge
S3 Datalake
New EMR Architecture
Apache Airflow
Transient dedicated clusters:
MASTER x1 (On-Demand)
• m4.xlarge
CORE x10 (Spot Instances, EMR Instance Fleets)
• m4.4xlarge
• r4.4xlarge
• r3.4xlarge
S3 Datalake
Job 1 Job 2 Job N
EMR Cost Optimization
EMR Cost Structure
Thank You!
Get in touch with us at info@feedvisor.com
We are hiring!
Apply on https://feedvisor.com/about/careers/tel-aviv/
eyal.lanxner@feedvisor.com
anatoli.atamanov@feedvisor.com
Thank you!
Please complete the survey
http://bit.ly/2SAOf2t
Blog post
http://bit.ly/EMRSparkSpot

Best practices for Running Spark jobs on Amazon EMR with Spot Instances | AWS Summit Tel Aviv 2019
