Scaling horizontally on AWS
Bozhidar Bozhanov, LogSentinel
About me
• Senior software engineer and architect
• Founder & CEO @ LogSentinel
• Blog: techblog.bozho.net
• Twitter: @bozhobg
• Stack Overflow top 50
Why?
• Why high availability?
• Why scalability?
• To account for increased load
• If you have decent HA, you’re likely scalable
• Don’t overdesign
• Why AWS (or any cloud provider)?
AWS
• IaaS (Infrastructure as a service) originally (EC2)
• Virtual machines
• Load balancers
• Security groups
• PaaS services on top
• Multiple regions – US, EU, Asia, etc.
• Each region has multiple availability zones (roughly equal to “data centers”)
• Cross-availability zone is easy
• Cross-region is harder
• Similar to Azure, Google Cloud, etc.
Rule of thumb: stateless applications
• No persistent state on the application nodes
• Caches and temporary files are okay
• Distributed vs local cache
• Session state: distributed vs no session state (e.g. JWT)
• Makes the application layer horizontally scalable
• Application nodes are disposable
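To illustrate the "no session state" option above, here is a minimal sketch of a signed, self-contained session token. It is a simplified JWT-like format and not a real JWT implementation; the `SECRET` value and the function names `issue_token` and `verify_token` are hypothetical. Any node that knows the secret can verify the token, so no node has to store the session.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical shared secret; in practice it would come from configuration or a key service.
SECRET = b"demo-secret-shared-by-all-nodes"


def _b64(data: bytes) -> str:
    """URL-safe base64 without padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    """Reverse of _b64: restore the padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(payload: str) -> str:
    return _b64(hmac.new(SECRET, payload.encode("ascii"), hashlib.sha256).digest())


def issue_token(user_id: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return a signed token that carries the session claims itself."""
    now = time.time() if now is None else now
    claims = {"sub": user_id, "exp": now + ttl_seconds}
    payload = _b64(json.dumps(claims).encode("utf-8"))
    return f"{payload}.{_sign(payload)}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user id if the signature is valid and the token has not expired."""
    now = time.time() if now is None else now
    try:
        payload, signature = token.split(".")
    except ValueError:
        return None  # not of the form payload.signature
    expected = _sign(payload)
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None  # tampered with, or signed with another secret
    claims = json.loads(_unb64(payload))
    return claims["sub"] if claims["exp"] > now else None
```

Because verification needs only the secret, a request can land on any application node, including one launched a minute ago.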
Executing only once in a cluster
• Sometimes you need to execute a scheduled piece of code only once in a cluster
• Database-backed scheduled job management
• Distributed locks (Hazelcast)
• Using queues (SQS, AmazonMQ, RabbitMQ)?
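A database-backed approach could look roughly like the sketch below. It uses SQLite only so that the example is self-contained; the `job_lock` table, its columns, and the `try_acquire` function are assumptions made for illustration, not something the slides prescribe. Each node calls `try_acquire` before running the job, and only the node that gets `True` executes it.

```python
import sqlite3


def init_schema(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) lock table: one row per scheduled job."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_lock ("
        " job_name TEXT PRIMARY KEY,"
        " owner TEXT NOT NULL,"
        " expires_at REAL NOT NULL)"
    )
    conn.commit()


def try_acquire(conn: sqlite3.Connection, job_name: str, owner: str,
                now: float, lease_seconds: float) -> bool:
    """Return True if this node holds the lease and may run the job."""
    with conn:  # runs both statements in one transaction and commits on success
        # Take over the lease if the previous holder let it expire (e.g. it crashed).
        cur = conn.execute(
            "UPDATE job_lock SET owner = ?, expires_at = ? "
            "WHERE job_name = ? AND expires_at <= ?",
            (owner, now + lease_seconds, job_name, now),
        )
        if cur.rowcount == 1:
            return True
        # Otherwise try to create the row; the primary key lets only one insert win.
        cur = conn.execute(
            "INSERT OR IGNORE INTO job_lock (job_name, owner, expires_at) "
            "VALUES (?, ?, ?)",
            (job_name, owner, now + lease_seconds),
        )
        return cur.rowcount == 1
```

The lease expiry is a design choice: it keeps a crashed node from blocking the job forever, at the cost of having to pick a lease longer than the job's run time.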
Scaling
• Autoscaling groups
• Groups of virtual machines (instances) with identical configuration
• Scale-up - configure criteria for launching new virtual machines – e.g. “more than 5
minutes of CPU utilization over 80%”
• Scale-down – configure criteria for destroying virtual machines
• Allows for handling spikes, or gradual increase of load
• Spot instances
• Cheap instances you “bid” for. Can be reclaimed at any time
• Useful for heavy background processes.
• Useful for test environments.
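The scaling criteria above can be sketched as plain logic. On AWS the evaluation is done by the platform from monitoring metrics, so this is only an illustration of the rule "more than 5 minutes of CPU utilization over 80%"; the function names, the one-sample-per-minute input, and the 30% scale-down threshold are assumptions.

```python
from typing import Sequence


def should_scale_up(cpu_per_minute: Sequence[float],
                    threshold: float = 80.0, minutes: int = 5) -> bool:
    """True if each of the last `minutes` samples is above `threshold`."""
    if len(cpu_per_minute) < minutes:
        return False  # not enough data to decide yet
    return all(sample > threshold for sample in cpu_per_minute[-minutes:])


def desired_capacity(current: int, cpu_per_minute: Sequence[float],
                     min_size: int, max_size: int,
                     scale_down_below: float = 30.0, minutes: int = 5) -> int:
    """Return the new instance count, kept within the group's min and max size."""
    if should_scale_up(cpu_per_minute, minutes=minutes):
        return min(current + 1, max_size)
    recent = cpu_per_minute[-minutes:]
    if len(recent) == minutes and all(s < scale_down_below for s in recent):
        return max(current - 1, min_size)
    return current
```

The gap between the two thresholds matters: if scale-up and scale-down used the same value, the group would keep launching and destroying instances around it.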
Data stores
• Managed
• RDBMS (AWS RDS) – MySQL, MariaDB, Postgres, Oracle, MS SQL
• Search engines – Elasticsearch
• Caches – ElastiCache (Redis and memcached)
• Custom:
• Amazon Aurora
• CloudSearch
• S3, SimpleDB, DynamoDB
• Own installation: spin up VMs, install anything you like (e.g. Cassandra, HBase, own
Postgres, own Elasticsearch, own caching solution)
Scaling data stores
• The custom ones are automatically scaled (S3, SimpleDB)
• The managed ones are scaled by configuration
• Own deployments are scaled via auto-scaling groups
• Data sharding vs replication with consistent hashing
• Resharding is not trivial
• Replication with consistent hashing can handle scaling up automatically *
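A minimal sketch of a consistent hash ring shows why adding a node does not force a full reshard. The `HashRing` class, the use of SHA-256, and the 100 virtual nodes per server are illustrative choices, not taken from any particular data store.

```python
import bisect
import hashlib
from typing import Dict, Iterable, List


class HashRing:
    """Maps keys to nodes so that adding a node moves only a fraction of the keys."""

    def __init__(self, nodes: Iterable[str] = (), vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._positions: List[int] = []    # sorted positions on the ring
        self._owners: Dict[int, str] = {}  # ring position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value: str) -> int:
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node: str) -> None:
        """Place `vnodes` virtual points for the node on the ring."""
        for i in range(self._vnodes):
            pos = self._hash(f"{node}#{i}")
            if pos in self._owners:
                continue  # extremely unlikely collision; keep the existing owner
            self._owners[pos] = node
            bisect.insort(self._positions, pos)

    def node_for(self, key: str) -> str:
        """Return the node owning the first ring position after the key's hash."""
        if not self._positions:
            raise LookupError("the ring has no nodes")
        idx = bisect.bisect(self._positions, self._hash(key)) % len(self._positions)
        return self._owners[self._positions[idx]]
```

When a fourth node joins a three-node ring, the only keys that change owner are the ones the new node takes over; everything else stays where it was, which is what makes scaling up manageable.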
Elastic load balancer
• AWS-provided software load balancer
• Points to specified target machines or group of machines (roughly ASGs)
• Configurable: protocols, ports, healthcheck, monitoring metrics
• TLS termination
• AWS-managed certificates
• Load balancer in front of application nodes
• Load balancer in front of data store nodes
• vs application-level load-balancing (configuration vs fetching db nodes dynamically)
Things to automate
• Hardware and network resources (CloudFormation)
• Application and database configuration (OpsWorks: Puppet, Chef, S3+bash, Capistrano)
• Instances
• launch configurations + bash
• Docker containers + bash (Elastic Container Service vs Fargate, Kubernetes)
• Why automate?
• because autoscaling benefits from automated instance creation
Scripted stacks
• You can create all instances, load balancers, auto-scaling groups, launch configurations,
security groups, domains, Elasticsearch domains, etc. manually
• But CloudFormation is way better
• JSON or YAML
• CloudFormation manages upgrades
• Stack parameters (instance types, number of nodes, domains used, s3 buckets, etc.)
"DatabaseLaunchConfiguration": {
  "Type": "AWS::AutoScaling::LaunchConfiguration",
  "Properties": {
    "AssociatePublicIpAddress": true,
    "IamInstanceProfile": { "Ref": "InstanceRoleInstanceProfile" },
    "ImageId": {
      "Fn::FindInMap": [
        { "Ref": "DatabaseStorageType" },
        { "Ref": "AWS::Region" },
        "Linux"
      ]
    },
    "InstanceType": { "Ref": "DatabaseInstanceType" },
    "SecurityGroups": [
      { "Ref": "DatabaseSecurityGroup" }
    ]
  }
},
"WebAppLoadBalancer": {
  "Type": "AWS::ElasticLoadBalancingV2::LoadBalancer",
  "Properties": {
    "Scheme": "internet-facing",
    "Type": "application",
    "Subnets": [
      { "Ref": "PublicSubnetA" },
      { "Ref": "PublicSubnetB" },
      { "Ref": "PublicSubnetC" }
    ],
    "SecurityGroups": [
      { "Ref": "WebAppLoadBalancerSecurityGroup" }
    ]
  }
},
"WebAppTargetGroup": {
  "Type": "AWS::ElasticLoadBalancingV2::TargetGroup",
  "Properties": {
    "HealthCheckIntervalSeconds": 30,
    "HealthCheckProtocol": "HTTP",
    "HealthCheckTimeoutSeconds": 10,
    "HealthyThresholdCount": 2,
    "HealthCheckPath": "/healthcheck",
    "Matcher": { "HttpCode": "200" },
    "Port": 8080,
    "Protocol": "HTTP",
    "TargetGroupAttributes": [
      {
        "Key": "deregistration_delay.timeout_seconds",
        "Value": "20"
      }
    ],
    "UnhealthyThresholdCount": 3,
    "VpcId": { "Ref": "VPC" }
  }
},
Why CloudFormation?
• Replicable stacks
• Used for different customers
• Used for different environments
• Used for disaster recovery
• Clear documentation of your entire infrastructure
• DevOps friendly
• Not that hard to learn
• Drawbacks: slow change-and-test cycles, proprietary
• Alternatives: Terraform
• Tries to abstract stack creation independent of provider, but you still depend on
proprietary concepts like ELB, security groups, etc.
Configuration provisioning
• OpsWorks – hosted Puppet or Chef
• Capistrano – “log in to all machines and do x, y, z”
• S3 – simple, no learning curve
• Instance launch configuration includes files to fetch from S3 (app.properties,
db.properties, cassandra.conf, mysql.conf, etc.)
• CloudFormation can write dynamic values to conf files (e.g. ELB address)
"UserData": {
  "Fn::Base64": {
    "Fn::Join": [
      "",
      [
        "#!/bin/bash -x\n",
        "yum update -y aws-cfn-bootstrap\n",
        "yum install -y aws-cli\n",
        "cat <<EOF >> /var/app/app.properties\n",
        {
          "Fn::Join": [
            "",
            [
              "\n",
              "db.host=",
              { "Ref": "DatabaseELBAddress" },
              "\n",
              "elasticsearch.url=https://",
              { "Ref": "ElasticSearchDomainName" },
              "\n",
              "root.url=https://",
              { "Ref": "DomainName" },
              "\n"
            ]
          ]
        },
        "EOF\n"
      ]
    ]
  }
}
Automated instance setup
• Elastic Container Service (ECS)
• Deploy Docker containers on EC2 instances
• Fargate removes the need to manage the underlying EC2 instances
• Kubernetes – vendor-independent
• But don’t rush into using Kubernetes (or Docker for that matter).
• Packer – creates images
• Manual
• Launch configuration to fetch and execute setup.sh
• Allows for easy zero downtime blue-green deployment
• Instance setup changed? Destroy the instance and launch a new one
• Simple. Simple is good.
Blue-green deployment
• Two S3 “folders” – blue and green
• Shared database
• Two autoscaling groups – blue (currently active) and green (currently passive)
• Upload new release artifact (e.g. fat jar) to s3://setup-bucket/green
• Activate the green ASG (increase required number of instances)
• Wait for nodes to launch
• Execute acceptance tests
• Switch DNS record (Route53) from blue ELB to green ELB
• Turquoise (intermediate deployment in case of breaking database changes)
• Can be automated via script that uses AWS CLI or APIs
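The sequence above can be sketched as a small orchestration function. To keep it runnable without an AWS account, the sketch works on an in-memory stand-in: the `Stack` class and `deploy` function are hypothetical, and a real script would replace each assignment with the corresponding AWS CLI or API call. The final scale-down of the old colour is an assumption; the slides do not say when it happens.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Stack:
    """In-memory stand-in for the resources involved in a blue-green switch."""
    asg_desired: Dict[str, int] = field(default_factory=lambda: {"blue": 2, "green": 0})
    artifacts: Dict[str, str] = field(default_factory=dict)  # S3 "folder" -> release
    dns_target: str = "blue"  # colour whose load balancer the DNS record points to


def deploy(stack: Stack, release: str, acceptance_tests: Callable[[str], bool]) -> bool:
    """Roll `release` out to the passive colour; switch DNS only if tests pass."""
    active = stack.dns_target
    passive = "green" if active == "blue" else "blue"

    stack.artifacts[passive] = release                      # upload the new artifact
    stack.asg_desired[passive] = stack.asg_desired[active]  # activate the passive ASG
    # A real script would poll here until the new nodes are healthy.

    if not acceptance_tests(passive):
        stack.asg_desired[passive] = 0  # roll back; live traffic was never touched
        return False

    stack.dns_target = passive      # switch the DNS record to the new colour
    stack.asg_desired[active] = 0   # assumption: scale the old colour down afterwards
    return True
```

A failed acceptance run leaves the active colour untouched, which is the main appeal of the approach: the rollback is simply not switching.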
Other useful services
• IAM – user and role management (each instance knows its role, no need for passwords)
• S3 – distributed storage / key-value store / universally applicable
• CloudTrail – audit trail of all infrastructure changes
• CloudWatch – monitoring of resources
• KMS – key management
• Glacier – cold storage
• Lambda – “serverless” a.k.a. function execution
General best practices
• Security groups
• Only open ports that you need
• Bastion host – entry point to the stack via SSH
• VPC (virtual private cloud)
• your own virtual network, private address space, subnets (per e.g. availability zone),
etc.
• Multi-factor authentication
Conclusion
• Scalability is a function of your application first and infrastructure second
• AWS is pretty straightforward to learn
• You can have scalable, scripted infrastructure without big investments
• New services appear often – check them out
• Vendor lock-in is almost inevitable
• But concepts are (almost) identical across cloud providers
• If something can be done easily without an AWS-specific service, prefer that
• Bash is inevitable
Thank you!

 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 

Scaling horizontally on AWS

  • 1. Scaling horizontally on AWS Bozhidar Bozhanov, LogSentinel
  • 2. About me • Senior software engineer and architect • Founder & CEO @ LogSentinel • Blog: techblog.bozho.net • Twitter: @bozhobg • Stack Overflow top 50
  • 3. Why? • Why high availability? • Why scalability? • To account for increased load • If you have decent HA, you’re likely scalable • Don’t overdesign • Why AWS (or any cloud provider)?
  • 4. AWS • IaaS (Infrastructure as a service) originally (EC2) • Virtual machines • Load balancers • Security groups • PaaS services on top • Multiple regions – US, EU, Asia, etc. • Each region has multiple availability zones (roughly equal to “data centers”) • Cross-availability zone is easy • Cross-region is harder • Similar to Azure, Google Cloud, etc.
  • 5. Rule of thumb: stateless applications • No persistent state on the application nodes • Caches and temporary files are okay • Distributed vs local cache • Session state: distributed vs no session state (e.g. JWT) • Makes the application layer horizontally scalable • Application nodes are disposable
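Slide 5 mentions JWT as a way to keep session state off the application nodes. A minimal sketch of that idea, using only the Python standard library, might look like the following. The shared secret, the claim names and the helpers `issue_token` and `verify_token` are illustrative assumptions, not part of the talk; a production system would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical secret shared by every application node (e.g. loaded from config or KMS).
SECRET = b"demo-secret-shared-by-all-nodes"


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in compact JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> str:
    """HMAC-SHA256 signature over 'header.payload'."""
    digest = hmac.new(SECRET, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(user_id: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Create a signed token; no server-side session record is stored."""
    now = time.time() if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    claims = {"sub": user_id, "exp": int(now) + ttl_seconds}
    payload = _b64url(json.dumps(claims).encode("utf-8"))
    return f"{header}.{payload}.{_sign(header + '.' + payload)}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature is valid and the token is not expired."""
    now = time.time() if now is None else now
    parts = token.split(".")
    if len(parts) != 3:
        return None
    header, payload, signature = parts
    expected = _sign(f"{header}.{payload}")
    # Constant-time comparison of the presented and expected signatures.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    claims = json.loads(_b64url_decode(payload))
    if claims["exp"] < now:
        return None
    return claims
```

Because any node holding the secret can verify the token, a request can land on any instance behind the load balancer, which is what makes the nodes disposable.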
  • 6. Executing only once in a cluster • Sometimes you need to execute a scheduled piece of code only once in a cluster • Database-backed scheduled job management • Distributed locks (Hazelcast) • Using queues (SQS, AmazonMQ, RabbitMQ)?
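One common way to implement the database-backed option from slide 6 is to let every node try to insert a row keyed by the job name and the scheduled slot, and to run the job only on the node whose insert succeeds. The sketch below assumes this approach; SQLite stands in for the shared database, and the table name `job_runs` and the function `try_claim` are hypothetical.

```python
import sqlite3


def init_schema(conn: sqlite3.Connection) -> None:
    """Create the claim table; the primary key enforces one claim per job and slot."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS job_runs (
            job_name TEXT NOT NULL,
            run_slot TEXT NOT NULL,
            node_id  TEXT NOT NULL,
            PRIMARY KEY (job_name, run_slot)
        )
        """
    )
    conn.commit()


def try_claim(conn: sqlite3.Connection, job_name: str, run_slot: str, node_id: str) -> bool:
    """Return True only for the first node that claims this job and slot."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO job_runs (job_name, run_slot, node_id) VALUES (?, ?, ?)",
                (job_name, run_slot, node_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Another node already inserted the row for this slot.
        return False
```

Each node would call `try_claim` when its local scheduler fires and skip the job when the call returns False. In a real cluster the connection would point at the shared RDS instance rather than a local file.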
  • 7. Scaling • Autoscaling groups • Groups of virtual machines (instances) with identical configuration • Scale-up – configure criteria for launching new virtual machines – e.g. “more than 5 minutes of CPU utilization over 80%” • Scale-down – configure criteria for destroying virtual machines • Allows for handling spikes, or gradual increase of load • Spot instances • Cheap instances you “bid” for. Can be reclaimed at any time • Useful for heavy background processes. • Useful for test environments.
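The scale-up and scale-down criteria on slide 7 can be pictured as a rule evaluated over recent CPU samples. The sketch below is a simplified model of such a rule, not the AWS implementation: the function name `desired_capacity`, the one-sample-per-minute assumption, the 20% scale-down threshold and the group size limits are all assumptions chosen for illustration.

```python
from typing import Sequence


def desired_capacity(
    current: int,
    cpu_samples: Sequence[float],
    min_size: int = 2,
    max_size: int = 10,
    high: float = 80.0,
    low: float = 20.0,
    window: int = 5,
) -> int:
    """Return the instance count after applying one scaling decision.

    cpu_samples holds one average CPU percentage per minute, oldest first.
    """
    recent = list(cpu_samples[-window:])
    if len(recent) < window:
        # Not enough data yet to call the load sustained.
        return current
    if all(sample > high for sample in recent):
        # Sustained high load: add one instance, capped at the group maximum.
        return min(current + 1, max_size)
    if all(sample < low for sample in recent):
        # Sustained low load: remove one instance, but keep the group minimum.
        return max(current - 1, min_size)
    return current
```

Requiring the whole window to breach the threshold is what keeps a single short spike from launching a new instance.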
  • 8. Data stores • Managed • RDBMS (AWS RDS) – MySQL, MariaDB, Postgres, Oracle, MS SQL • Search engines – Elasticsearch • Caches – ElastiCache (Redis and memcached) • Custom: • Amazon Aurora • CloudSearch • S3, SimpleDB, DynamoDB • Own installation: spin up VMs, install anything you like (e.g. Cassandra, HBase, own Postgres, own Elasticsearch, own caching solution)
  • 9. Scaling data stores • The custom ones are automatically scaled (S3, SimpleDB) • The managed ones are scaled by configuration • Own deployments are scaled via auto-scaling groups • Data sharding vs replication with consistent hashing • Resharding is not trivial • Replication with consistent hashing can handle scaling up automatically *
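Slide 9 contrasts resharding with consistent hashing. A minimal sketch of a consistent hash ring shows why adding a node moves only part of the keys. The class name `HashRing`, the use of SHA-256 and the choice of 100 virtual nodes per physical node are assumptions made for illustration; real data stores use their own partitioners and also replicate each key to several nodes.

```python
import bisect
import hashlib
from typing import Iterable, List


class HashRing:
    """Consistent hash ring with virtual nodes."""

    def __init__(self, nodes: Iterable[str] = (), vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._hashes: List[int] = []   # sorted positions on the ring
        self._owners: List[str] = []   # owner of the position at the same index
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key: str) -> int:
        """Map a string to a 64-bit position on the ring."""
        return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")

    def add_node(self, node: str) -> None:
        """Place the node at several pseudo-random positions on the ring."""
        for i in range(self._vnodes):
            position = self._hash(f"{node}#{i}")
            index = bisect.bisect(self._hashes, position)
            self._hashes.insert(index, position)
            self._owners.insert(index, node)

    def remove_node(self, node: str) -> None:
        """Drop every position owned by the node."""
        kept = [(h, o) for h, o in zip(self._hashes, self._owners) if o != node]
        self._hashes = [h for h, _ in kept]
        self._owners = [o for _, o in kept]

    def node_for(self, key: str) -> str:
        """Return the owner of the first ring position at or after the key's hash."""
        if not self._hashes:
            raise LookupError("ring is empty")
        index = bisect.bisect(self._hashes, self._hash(key)) % len(self._hashes)
        return self._owners[index]
```

When a node is added, the only keys that change owner are the ones that now fall on the new node's positions; every other key stays where it was. That property is what makes scaling up far cheaper than resharding the whole data set.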
  • 10. Elastic load balancer • AWS-provided software load balancer • Points to specified target machines or group of machines (roughly ASGs) • Configurable: protocols, ports, healthcheck, monitoring metrics • TLS termination • AWS-managed certificates • Load balancer in front of application nodes • Load balancer in front of data store nodes • vs application-level load-balancing (configuration vs fetching db nodes dynamically)
  • 11. Things to automate • Hardware and network resources (CloudFormation) • Application and database configuration (OpsWorks: Puppet, Chef, S3+bash, Capistrano) • Instances • launch configurations + bash • Docker containers + bash (Elastic Container Service vs Fargate, Kubernetes) • Why automate? • because autoscaling benefits from automated instance creation
  • 12. Scripted stacks • You can create all instances, load balancers, auto-scaling groups, launch configurations, security groups, domains, Elasticsearch domains, etc. manually • But CloudFormation is way better • JSON or YAML • CloudFormation manages upgrades • Stack parameters (instance types, number of nodes, domains used, S3 buckets, etc.)
  • 13.
      "DatabaseLaunchConfiguration": {
        "Type": "AWS::AutoScaling::LaunchConfiguration",
        "Properties": {
          "AssociatePublicIpAddress": true,
          "IamInstanceProfile": { "Ref": "InstanceRoleInstanceProfile" },
          "ImageId": {
            "Fn::FindInMap": [
              { "Ref": "DatabaseStorageType" },
              { "Ref": "AWS::Region" },
              "Linux"
            ]
          },
          "InstanceType": { "Ref": "DatabaseInstanceType" },
          "SecurityGroups": [ { "Ref": "DatabaseSecurityGroup" } ]
        }
      }
  • 14.
      "WebAppLoadBalancer": {
        "Type": "AWS::ElasticLoadBalancingV2::LoadBalancer",
        "Properties": {
          "Scheme": "internet-facing",
          "Type": "application",
          "Subnets": [
            { "Ref": "PublicSubnetA" },
            { "Ref": "PublicSubnetB" },
            { "Ref": "PublicSubnetC" }
          ],
          "SecurityGroups": [ { "Ref": "WebAppLoadBalancerSecurityGroup" } ]
        }
      },
  • 15.
      "WebAppTargetGroup": {
        "Type": "AWS::ElasticLoadBalancingV2::TargetGroup",
        "Properties": {
          "HealthCheckIntervalSeconds": 30,
          "HealthCheckProtocol": "HTTP",
          "HealthCheckTimeoutSeconds": 10,
          "HealthyThresholdCount": 2,
          "HealthCheckPath": "/healthcheck",
          "Matcher": { "HttpCode": "200" },
          "Port": 8080,
          "Protocol": "HTTP",
          "TargetGroupAttributes": [
            { "Key": "deregistration_delay.timeout_seconds", "Value": "20" }
          ],
          "UnhealthyThresholdCount": 3,
          "VpcId": { "Ref": "VPC" }
        }
      },
  • 16. Why CloudFormation? • Replicable stacks • Used for different customers • Used for different environments • Used for disaster recovery • Having clear documentation of your entire infrastructure • DevOps friendly • Not that hard to learn • Drawbacks: slow change-and-test cycles, proprietary • Alternatives: Terraform • Tries to abstract stack creation independent of provider, but you still depend on proprietary concepts like ELB, security groups, etc.
  • 17. Configuration provisioning • OpsWorks – hosted Puppet or Chef • Capistrano – “log in to all machines and do x, y, z” • S3 – simple, no learning curve • Instance launch configuration includes files to fetch from S3 (app.properties, db.properties, cassandra.conf, mysql.conf, etc.) • CloudFormation can write dynamic values to conf files (e.g. ELB address)
  • 18.
      "UserData": {
        "Fn::Base64": {
          "Fn::Join": [
            "",
            [
              "#!/bin/bash -x\n",
              "yum update -y aws-cfn-bootstrap\n",
              "yum install -y aws-cli\n",
              "cat <<EOF >> /var/app/app.properties\n",
              {
                "Fn::Join": [
                  "",
                  [
                    "\n",
                    "db.host=", { "Ref": "DatabaseELBAddress" }, "\n",
                    "elasticsearch.url=https://", { "Ref": "ElasticSearchDomainName" }, "\n",
                    "root.url=https://", { "Ref": "DomainName" }, "\n"
                  ]
                ]
              },
              "EOF\n"
            ]
          ]
        }
      }
  • 19. Automated instance setup • Elastic Container Service • Deploy Docker containers on EC2 instances • Fargate abstracts the need to manage the underlying EC2 instance • Kubernetes – vendor-independent • But don’t rush into using Kubernetes (or Docker for that matter). • Packer – creates images • Manual • Launch configuration to fetch and execute setup.sh • Allows for easy zero-downtime blue-green deployment • Instance setup changed? Destroy the instance and launch a new one • Simple. Simple is good.
  • 20. Blue-green deployment • Two S3 “folders” – blue and green • Shared database • Two autoscaling groups – blue (currently active) and green (currently passive) • Upload new release artifact (e.g. fat jar) to s3://setup-bucket/green • Activate the green ASG (increase required number of instances) • Wait for nodes to launch • Execute acceptance tests • Switch DNS record (Route53) from blue ELB to green ELB • Turquoise (intermediate deployment in case of breaking database changes) • Can be automated via script that uses AWS CLI or APIs
  • 21. Other useful services • IAM – user and role management (each instance knows its role, no need for passwords) • S3 – distributed storage / key-value store / universally applicable • CloudTrail – audit trail of all infrastructure changes • CloudWatch – monitoring of resources • KMS – key management • Glacier – cold storage • Lambda – “serverless” a.k.a. function execution
  • 22. General best practices • Security groups • Only open ports that you need • Bastion host – entry point to the stack via SSH • VPC (virtual private cloud) • your own virtual network, private address space, subnets (e.g. one per availability zone), etc. • Multi-factor authentication
  • 23. Conclusion • Scalability is a function of your application first and infrastructure second • AWS is pretty straightforward to learn • You can have scalable, scripted infrastructure without big investments • New services appear often – check them out • Vendor lock-in is almost inevitable • But concepts are (almost) identical across cloud providers • If something can be done easily without an AWS-specific service, prefer that • Bash is inevitable