SlideShare une entreprise Scribd logo
1  sur  37
Télécharger pour lire hors ligne
PetaMongo:
A Petabyte Database for as Little as $200
Chris Biow, MongoDB
Miles Ward, AWS
November 13, 2013

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Agenda
• MongoDB on AWS review
– Guidance, Storage, Architecture

• MongoDB at PetaScale on AWS
Tools to simplify your design
• Whitepaper
• AWS Marketplace
• AWS
CloudFormation

http://media.amazonwebservices.com/AWS_NoSQL_MongoDB.pdf
• Easy to start a
single node
• Correctly configured
PIOPS EBS Storage
• No extra cost
https://aws.amazon.com/marketplace/pp/B00COAAEH8/ref=srh_res_product_title?ie=UTF8&sr=0-6&qid=1383897659043
AWS CloudFormation
• Nested Templates
• Nodes and Storage
• Configurable Scale
• AWS CloudFormation: Your
Infrastructure belongs in your
source control
mongodb.org/display/DOCS/Automating+Deployment+with+CloudFormation
AWS Storage Options

EBS
PIOPS

SSD

• Amazon EBS – Provisioned IOPS volumes
•
Deliver predictable, high performance for I/O intensive workloads
•
Specify IOPS required upfront, and EBS provisions for lifetime of volume
– 4000 IOPS per volume, can stripe to get thousands of IOPS to an EC2 instance

• High IO Instances – hi1.4xlarge
•
•

For some applications that require tens of thousands of IOPS
Eliminates network latency/bandwidth as a performance constraint to storage
AWS Storage Options
Testing: random 4k reads
EBS

+

One Volume: ~200 MongoOPS with some variability, <1mb/s
Loaded instance: ~ 1000 MongoOPS with some variability <10mb/s
One Volume: 200
0 MongoOPS with <1% variability, 16mb/s
Loaded Instance: 16,000 MongoOPS with <1% variability, 64mb/s

PIOPS

Loaded Cluster Instance:

SSD

MongoOPS, 320mb/s

Hi1.4xlarge ephemeral: ~64,000 MongoOPS with low variability, ~245mb/s
Testing: random 4k reads

+

PIOPS

Stable

EBS

SSD
Stability Tips
• Ext4 or XFS, nodiratime, noatime
• Raise file descriptor limits
• Set disk read-ahead
• No large virtual memory pages
• SNAPSHOT SNAPSHOT SNAPSHOT
• Retain a PIOPS EBS
node for snapshot
backups
• Snapshots allow crossAZ and cross-region
recovery
• SSD hosts as primary
• Shard for scale
Another option…

244gb cr1.8xlarge
So, about that Petabyte
v.cheap
• Spot Market
• m1.small
• 1024 shards
• 1TB EBS from snapshot
• PowerBench reader
• Aggregation queries

v.fast
•
•
•
•
•
•

Auto Scaling On-Demand
m2.4xlarge
50 shards
20TB PIOPS indexed
PowerBench loader
Aggregation queries
The naming of parts
Amazon Terms
• Provisioned IOPS
• Elastic Compute Cloud
• EC2 Spot Instances
• Auto Scaling groups

Nicks
• PIOPS
• EC2
• Here, Spot!
• ASG
Players
MongoDB
• Document-model,
NoSQL database
• Dev adoption is
STRONG
• MongoDB Inc.
trending toward
zero h/w

• Scale-up with commodity h/w
• Scale-out with sharding
• Scale-around with replication
AWS
•
•
•
•

PIOPS for an IO-hungry client
40% of MongoDB customer usage
90% of MongoDB internal usage
More ports :2701[79] than :[15]521
PB & Chocolate
Differentiators for mutual customers
•
•
•
•
•

Fast time-to-solution
Easy global distribution
Secondary index
Geo, text, security
Fast analytic aggregation
Challenge
Motivation: IWBCI…
•
•
•
•
•

Test scale-out of MongoDB beyond typical
Learn massive scale-out on AWS
Do it as cheaply as possible
Apply customer data
Break the petabarrier
m1.small us-east1 Spot Market
m1.small us-east1d Spot Market
Proposal
Item

Units

Time

Unit Cost

Net Cost

m1.small Spot 1050

3hr

$0.007/hr

$22.05

m1.large

3

48hrs

$0.056/hr

$8.07

S3

1TB

1wk

$95/TB/mo

23.75

EBS

1024 x 1TB

1hr

$100/TB/mo

142.22

S3  EBS

1PB

??

$0/TB

Total

0.00
$196.09

http://calculator.s3.amazonaws.com/G77798SS77SH72
Initial Directions
• Spot Instance requests
– m1.small market, mostly us-east-1 (my zone “d”)
– Net: $0.007 / hour = $7 / hr / K-shard

• Perl
– use Net::Amazon::EC2;
– gaps: parse EC2 command-line API

•
•
•
•

Defer Chef, Puppet, AWS CloudFormation
YCSB
userdata.sh
t1.micro / m1.small / cr1.8xlarge
MongoDB Architecture
• 3x Config Servers
– mongod --configsvr

• Routing
– mongos --configdb a,b,c

• Replica sets (not used)
• Shards
– mongod

• Client load
– java -cp [] com.yahoo.ycsb.Client
PetaMongo: A Petabyte Database for as Little as $200 (BDT307) | AWS re:Invent 2013
Range-based sharding
Hash-based sharding
Process Flow
Spot Instance Request (sir-)

• rejected
• Awaiting evaluation
• Awaiting fulfillment
– Partial
– Launch intervals

• Fulfilled

Instances (i-)
• Requested
• Initializing (i)
• Config running (C)
• MongoS starting (s)
• MongoS running (S)
• MongoD starting (D)
• Failed/slow response (X)
Config
sir-

Sharded
Shard

MongoD
MongoS
Spot Instance Lifecycle
Progress
Scaling Experience
•
•
•
•
•
•

4, 16, 64, 256, 1024
4: minimum magnitude for 3x Config
16: startup variation, process flow
64: full speed ahead!
256: chunk distribution time
1024: market dependence, client wire saturation
Lessons Learned
• Code defensively
• Monitor: MongoDB Mgt Svc, top, iftop,
mongostat
• Avoid sentimental attachment
• Prototype / refactor
• Make the instances do the work
• Mitigate chunk migration
Refactor
•
•
•
•
•

BenchPress YCSB
Auto Scaling groups request-spot-instances
use VM::EC2; Net::Amazon::EC2
gsh monolithic Perl
serf polling
Dee-Luxe
Docs Loaded, 512 shards
1.8E+09
1.6E+09
1.4E+09
1.2E+09
1E+09
800000000
600000000

^ 1X
RAM

400000000
200000000
0
5:16:48

5:45:36

6:14:24

6:43:12

7:12:00

7:40:48
Further Work
•
•
•
•
•

Replication
Self-healing
MongoDB-appropriate benchmarks
Customer data
Self-hosting cluster
Please give us your feedback on this
presentation

BDT307
As a thank you, we will select prize
winners daily for completed surveys!

Contenu connexe

En vedette

Getting Started with Real-Time Analytics
Getting Started with Real-Time AnalyticsGetting Started with Real-Time Analytics
Getting Started with Real-Time AnalyticsAmazon Web Services
 
Andy Jassy Keynote Sydney Customer Appreciation Day
Andy Jassy Keynote Sydney Customer Appreciation DayAndy Jassy Keynote Sydney Customer Appreciation Day
Andy Jassy Keynote Sydney Customer Appreciation DayAmazon Web Services
 
RMG204 Optimizing Costs with AWS - AWS re: Invent 2012
RMG204 Optimizing Costs with AWS - AWS re: Invent 2012RMG204 Optimizing Costs with AWS - AWS re: Invent 2012
RMG204 Optimizing Costs with AWS - AWS re: Invent 2012Amazon Web Services
 
Webinar: Delivering Static and Dynamic Content Using CloudFront
Webinar: Delivering Static and Dynamic Content Using CloudFrontWebinar: Delivering Static and Dynamic Content Using CloudFront
Webinar: Delivering Static and Dynamic Content Using CloudFrontAmazon Web Services
 
BDT305 Transforming Big Data with Spark and Shark - AWS re: Invent 2012
BDT305 Transforming Big Data with Spark and Shark - AWS re: Invent 2012BDT305 Transforming Big Data with Spark and Shark - AWS re: Invent 2012
BDT305 Transforming Big Data with Spark and Shark - AWS re: Invent 2012Amazon Web Services
 
AWS Webcast - Accelerating Application Performance Using In-Memory Caching in...
AWS Webcast - Accelerating Application Performance Using In-Memory Caching in...AWS Webcast - Accelerating Application Performance Using In-Memory Caching in...
AWS Webcast - Accelerating Application Performance Using In-Memory Caching in...Amazon Web Services
 
REA Sydney Customer Appreciation Day
REA Sydney Customer Appreciation DayREA Sydney Customer Appreciation Day
REA Sydney Customer Appreciation DayAmazon Web Services
 
AWS July Webinar Series - Getting Started with Amazon DynamoDB
AWS July Webinar Series - Getting Started with Amazon DynamoDBAWS July Webinar Series - Getting Started with Amazon DynamoDB
AWS July Webinar Series - Getting Started with Amazon DynamoDBAmazon Web Services
 
Relational Databases Redefined with AWS
Relational Databases Redefined with AWSRelational Databases Redefined with AWS
Relational Databases Redefined with AWSAmazon Web Services
 
AWS Partner Webcast - Make Decisions Faster with AWS and SAP on HANA
AWS Partner Webcast - Make Decisions Faster with AWS and SAP on HANAAWS Partner Webcast - Make Decisions Faster with AWS and SAP on HANA
AWS Partner Webcast - Make Decisions Faster with AWS and SAP on HANAAmazon Web Services
 
Globus Genomics: How Science-as-a-Service is Accelerating Discovery (BDT310) ...
Globus Genomics: How Science-as-a-Service is Accelerating Discovery (BDT310) ...Globus Genomics: How Science-as-a-Service is Accelerating Discovery (BDT310) ...
Globus Genomics: How Science-as-a-Service is Accelerating Discovery (BDT310) ...Amazon Web Services
 
Canonical AWS Summit London 2011
Canonical AWS Summit London 2011Canonical AWS Summit London 2011
Canonical AWS Summit London 2011Amazon Web Services
 
Time to Science, Time to Results: Accelerating Research with AWS - AWS Sympos...
Time to Science, Time to Results: Accelerating Research with AWS - AWS Sympos...Time to Science, Time to Results: Accelerating Research with AWS - AWS Sympos...
Time to Science, Time to Results: Accelerating Research with AWS - AWS Sympos...Amazon Web Services
 

En vedette (20)

Into the Cloud
Into the CloudInto the Cloud
Into the Cloud
 
Getting Started with Real-Time Analytics
Getting Started with Real-Time AnalyticsGetting Started with Real-Time Analytics
Getting Started with Real-Time Analytics
 
0. series overview
0. series overview0. series overview
0. series overview
 
Andy Jassy Keynote Sydney Customer Appreciation Day
Andy Jassy Keynote Sydney Customer Appreciation DayAndy Jassy Keynote Sydney Customer Appreciation Day
Andy Jassy Keynote Sydney Customer Appreciation Day
 
RMG204 Optimizing Costs with AWS - AWS re: Invent 2012
RMG204 Optimizing Costs with AWS - AWS re: Invent 2012RMG204 Optimizing Costs with AWS - AWS re: Invent 2012
RMG204 Optimizing Costs with AWS - AWS re: Invent 2012
 
AWS Blackbelt NINJA Dojo
AWS Blackbelt NINJA DojoAWS Blackbelt NINJA Dojo
AWS Blackbelt NINJA Dojo
 
Go Global Right Now (Yes Now!)
Go Global Right Now (Yes Now!)Go Global Right Now (Yes Now!)
Go Global Right Now (Yes Now!)
 
Webinar: Delivering Static and Dynamic Content Using CloudFront
Webinar: Delivering Static and Dynamic Content Using CloudFrontWebinar: Delivering Static and Dynamic Content Using CloudFront
Webinar: Delivering Static and Dynamic Content Using CloudFront
 
AWS Customer Service - Sonian
AWS Customer Service - Sonian AWS Customer Service - Sonian
AWS Customer Service - Sonian
 
BDT305 Transforming Big Data with Spark and Shark - AWS re: Invent 2012
BDT305 Transforming Big Data with Spark and Shark - AWS re: Invent 2012BDT305 Transforming Big Data with Spark and Shark - AWS re: Invent 2012
BDT305 Transforming Big Data with Spark and Shark - AWS re: Invent 2012
 
AWS Webcast - Accelerating Application Performance Using In-Memory Caching in...
AWS Webcast - Accelerating Application Performance Using In-Memory Caching in...AWS Webcast - Accelerating Application Performance Using In-Memory Caching in...
AWS Webcast - Accelerating Application Performance Using In-Memory Caching in...
 
Analytics in the Cloud
Analytics in the CloudAnalytics in the Cloud
Analytics in the Cloud
 
REA Sydney Customer Appreciation Day
REA Sydney Customer Appreciation DayREA Sydney Customer Appreciation Day
REA Sydney Customer Appreciation Day
 
AWS July Webinar Series - Getting Started with Amazon DynamoDB
AWS July Webinar Series - Getting Started with Amazon DynamoDBAWS July Webinar Series - Getting Started with Amazon DynamoDB
AWS July Webinar Series - Getting Started with Amazon DynamoDB
 
Relational Databases Redefined with AWS
Relational Databases Redefined with AWSRelational Databases Redefined with AWS
Relational Databases Redefined with AWS
 
Masterclass Live: Amazon EC2
Masterclass Live: Amazon EC2 Masterclass Live: Amazon EC2
Masterclass Live: Amazon EC2
 
AWS Partner Webcast - Make Decisions Faster with AWS and SAP on HANA
AWS Partner Webcast - Make Decisions Faster with AWS and SAP on HANAAWS Partner Webcast - Make Decisions Faster with AWS and SAP on HANA
AWS Partner Webcast - Make Decisions Faster with AWS and SAP on HANA
 
Globus Genomics: How Science-as-a-Service is Accelerating Discovery (BDT310) ...
Globus Genomics: How Science-as-a-Service is Accelerating Discovery (BDT310) ...Globus Genomics: How Science-as-a-Service is Accelerating Discovery (BDT310) ...
Globus Genomics: How Science-as-a-Service is Accelerating Discovery (BDT310) ...
 
Canonical AWS Summit London 2011
Canonical AWS Summit London 2011Canonical AWS Summit London 2011
Canonical AWS Summit London 2011
 
Time to Science, Time to Results: Accelerating Research with AWS - AWS Sympos...
Time to Science, Time to Results: Accelerating Research with AWS - AWS Sympos...Time to Science, Time to Results: Accelerating Research with AWS - AWS Sympos...
Time to Science, Time to Results: Accelerating Research with AWS - AWS Sympos...
 

Plus de Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Plus de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Dernier

Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URLRuncy Oommen
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDELiveplex
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?IES VE
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXTarek Kalaji
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IES VE
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationIES VE
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 

Dernier (20)

Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDEADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBX
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 

PetaMongo: A Petabyte Database for as Little as $200 (BDT307) | AWS re:Invent 2013

  • 1. PetaMongo: A Petabyte Database for as Little as $200 Chris Biow, MongoDB Miles Ward, AWS November 13, 2013 © 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
  • 2. Agenda • MongoDB on AWS review – Guidance, Storage, Architecture • MongoDB at PetaScale on AWS
  • 3. Tools to simplify your design • Whitepaper • AWS Marketplace • AWS CloudFormation http://media.amazonwebservices.com/AWS_NoSQL_MongoDB.pdf
  • 4. • Easy to start a single node • Correctly configured PIOPS EBS Storage • No extra cost https://aws.amazon.com/marketplace/pp/B00COAAEH8/ref=srh_res_product_title?ie=UTF8&sr=0-6&qid=1383897659043
  • 5. AWS CloudFormation • Nested Templates • Nodes and Storage • Configurable Scale • AWS CloudFormation: Your Infrastructure belongs in your source control mongodb.org/display/DOCS/Automating+Deployment+with+CloudFormation
  • 6. AWS Storage Options EBS PIOPS SSD • Amazon EBS – Provisioned IOPS volumes • Deliver predictable, high performance for I/O intensive workloads • Specify IOPS required upfront, and EBS provisions for lifetime of volume – 4000 IOPS per volume, can stripe to get thousands of IOPS to an EC2 instance • High IO Instances – hi1.4xlarge • • For some applications that require tens of thousands of IOPS Eliminates network latency/bandwidth as a performance constraint to storage
  • 7. AWS Storage Options Testing: random 4k reads EBS + One Volume: ~200 MongoOPS with some variability, <1mb/s Loaded instance: ~ 1000 MongoOPS with some variability <10mb/s One Volume: 200 0 MongoOPS with <1% variability, 16mb/s Loaded Instance: 16,000 MongoOPS with <1% variability, 64mb/s PIOPS Loaded Cluster Instance: SSD MongoOPS, 320mb/s Hi1.4xlarge ephemeral: ~64,000 MongoOPS with low variability, ~245mb/s
  • 8. Testing: random 4k reads + PIOPS Stable EBS SSD
  • 9. Stability Tips • Ext4 or XFS, nodiratime, noatime • Raise file descriptor limits • Set disk read-ahead • No large virtual memory pages • SNAPSHOT SNAPSHOT SNAPSHOT
  • 10. • Retain a PIOPS EBS node for snapshot backups • Snapshots allow crossAZ and cross-region recovery • SSD hosts as primary • Shard for scale
  • 12. So, about that Petabyte v.cheap • Spot Market • m1.small • 1024 shards • 1TB EBS from snapshot • PowerBench reader • Aggregation queries v.fast • • • • • • Auto Scaling On-Demand m2.4xlarge 50 shards 20TB PIOPS indexed PowerBench loader Aggregation queries
  • 13. The naming of parts Amazon Terms • Provisioned IOPS • Elastic Compute Cloud • EC2 Spot Instances • Auto Scaling groups Nicks • PIOPS • EC2 • Here, Spot! • ASG
  • 15. MongoDB • Document-model, NoSQL database • Dev adoption is STRONG • MongoDB Inc. trending toward zero h/w • Scale-up with commodity h/w • Scale-out with sharding • Scale-around with replication
  • 16. AWS • • • • PIOPS for an IO-hungry client 40% of MongoDB customer usage 90% of MongoDB internal usage More ports :2701[79] than :[15]521
  • 17. PB & Chocolate Differentiators for mutual customers • • • • • Fast time-to-solution Easy global distribution Secondary index Geo, text, security Fast analytic aggregation
  • 19. Motivation: IWBCI… • • • • • Test scale-out of MongoDB beyond typical Learn massive scale-out on AWS Do it as cheaply as possible Apply customer data Break the petabarrier
  • 22. Proposal Item Units Time Unit Cost Net Cost m1.small Spot 1050 3hr $0.007/hr $22.05 m1.large 3 48hrs $0.056/hr $8.07 S3 1TB 1wk $95/TB/mo 23.75 EBS 1024 x 1TB 1hr $100/TB/mo 142.22 S3  EBS 1PB ?? $0/TB Total 0.00 $196.09 http://calculator.s3.amazonaws.com/G77798SS77SH72
  • 23. Initial Directions • Spot Instance requests – m1.small market, mostly us-east-1 (my zone “d”) – Net: $0.007 / hour = $7 / hr / K-shard • Perl – use Net::Amazon::EC2; – gaps: parse EC2 command-line API • • • • Defer Chef, Puppet, AWS CloudFormation YCSB userdata.sh t1.micro / m1.small / cr1.8xlarge
  • 24. MongoDB Architecture • 3x Config Servers – mongod --configsvr • Routing – mongos --configdb a,b,c • Replica sets (not used) • Shards – mongod • Client load – java -cp [] com.yahoo.ycsb.Client
  • 28. Process Flow Spot Instance Request (sir-) • rejected • Awaiting evaluation • Awaiting fulfillment – Partial – Launch intervals • Fulfilled Instances (i-) • Requested • Initializing (i) • Config running (C) • MongoS starting (s) • MongoS running (S) • MongoD starting (D) • Failed/slow response (X)
  • 31. Scaling Experience • • • • • • 4, 16, 64, 256, 1024 4: minimum magnitude for 3x Config 16: startup variation, process flow 64: full speed ahead! 256: chunk distribution time 1024: market dependence, client wire saturation
  • 32. Lessons Learned • Code defensively • Monitor: MongoDB Mgt Svc, top, iftop, mongostat • Avoid sentimental attachment • Prototype / refactor • Make the instances do the work • Mitigate chunk migration
  • 33. Refactor • • • • • BenchPress YCSB Auto Scaling groups request-spot-instances use VM::EC2; Net::Amazon::EC2 gsh monolithic Perl serf polling
  • 35. Docs Loaded, 512 shards 1.8E+09 1.6E+09 1.4E+09 1.2E+09 1E+09 800000000 600000000 ^ 1X RAM 400000000 200000000 0 5:16:48 5:45:36 6:14:24 6:43:12 7:12:00 7:40:48
  • 37. Please give us your feedback on this presentation BDT307 As a thank you, we will select prize winners daily for completed surveys!