Several Nines Cluster Control.pptx

Our guest speaker will be Riaan Nolan of Foodpanda/Hellofood,
Rocket Internet’s global online food delivery marketplace, operating in over 40
countries.

● eCommerce infrastructure challenges - AIA Case Study (i)
● Provisioning highly available environments across multi-server and multi-AZs
● Building and maintaining configuration management systems such as Puppet
● Enabling self-service infrastructure services to internal dev teams
● Health and performance monitoring
● Elastic scaling & Automating failure handling
● Disaster recovery
http://www.severalnines.com/sites/default/files/AIA_Case_Study.pdf (i)

eCommerce infrastructure challenges - AIA Case Study
Before Cluster Control
What have I learned
● Split brain o.0
- One node started with an empty gcomm:// address and bootstrapped its own cluster, thank you Puppet :)
● Track back your data.
- Use anything you have (call-centre logs, emails sent via AWS SES) to recreate your data.
● Disasters are expensive in time & money for EVERYONE!
- Everything was gone: time, orders, changes on staging, new products, QA tests, the whole kit.
● Be agile and move quickly NOW!
- Restore to a trusted backup point, isolate the faulty data, and extract everything you can (product SKUs etc.).
[Diagram: Workflow Before Cluster Control]
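
The split brain above came from a node starting with an empty gcomm:// address, which in Galera tells the node to bootstrap a new cluster instead of joining an existing one. A minimal sketch of a pre-start guard, assuming the setting lives in a plain my.cnf-style file. The function name and the example path are hypothetical and not part of the original deck:

```sh
#!/bin/sh
# Hypothetical guard: succeed (exit 0) when the given config file sets
# wsrep_cluster_address to a bare "gcomm://" with no peer addresses.
has_empty_gcomm() {
    conf_file=$1
    pattern="^[[:space:]]*wsrep[_-]cluster[_-]address[[:space:]]*=[[:space:]]*[\"']?gcomm://[\"']?[[:space:]]*\$"
    grep -Eq "$pattern" "$conf_file"
}

# Example use before (re)starting a node; the path is an assumption:
# if has_empty_gcomm /etc/mysql/my.cnf; then
#     echo "refusing to start: empty gcomm:// would bootstrap a new cluster" >&2
#     exit 1
# fi
```

A check like this could run from the same Puppet code that writes the config, so a bad template fails loudly instead of silently forming a second cluster.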

eCommerce infrastructure challenges - AIA Case Study
After Cluster Control
What have I learned
● Look for help!
● If a 20% sacrifice can fix 100% of your problems, go!
- Puppet does not have to control Cluster Control 100%, and why should it?
- Getting a commercial license vs. actually hiring a DBA who knows Puppet & Galera.
● Fix bugs and refine your processes.
- Experience is never a bad thing; it’s how you play the cards you were dealt.
- Re-create disasters in a sandbox.
[Diagram: Workflow After Cluster Control]

Provisioning highly available environments across multi-server and multi-AZs
http://aws.amazon.com/cloudformation
http://aws.amazon.com/cloudformation/aws-cloudformation-templates
https://docs.puppetlabs.com
https://forge.puppetlabs.com
[Diagram: Hiera & Facter, Puppet manifests, hardware stack]

Building and maintaining configuration management systems such as Puppet
[Diagram: Hiera & Facter, Puppet manifests, hardware stack]
https://help.github.com
https://raw.githubusercontent.com/nerdgirl/git-cheatsheet-visual/master/gitcheatsheet.png (Wallpaper)
https://docs.puppetlabs.com
https://forge.puppetlabs.com

Enabling self-service infrastructure services to internal dev teams
WTF!

Enabling self-service infrastructure services to internal dev teams - Seriously o.0
Create End-Points in your Workflow
● Devs and PMs are your most demanding customers. SALE NOW ON!
- And they are also your most prized possession. Keep them happy!
● They want everything protected, but with little holes.
- They will ask you to protect staging with .htaccess, but to let requests from the payment gateways through without it.
● Keeping them happy could be as simple as an SSH tunnel >>>
- In your SSH config add something like this:
    LocalForward 33306 MyRDSInstanceReplica.mydomain.com:3306
● A Vagrant or Docker environment.
- Use your existing Puppet code to spin up Docker or Vagrant instances.
● Create environments for everyone to play in.
- So what if you already have staging? Create dev, dev2, qa, beta, whatever they need.
● Version control
- Keep everything under version control, duh!
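
The LocalForward line from the SSH tunnel bullet needs a Host entry to live in. A sketch of what that entry could look like, written as a small shell function that prints the stanza. The alias rds-tunnel and the jump host bastion.mydomain.com are assumptions for illustration; only the LocalForward values come from the slide:

```sh
#!/bin/sh
# Print an ssh_config stanza that forwards a local port to a remote database.
# Arguments: alias, jump host, local port, database host, database port.
tunnel_stanza() {
    printf 'Host %s\n' "$1"
    printf '    HostName %s\n' "$2"
    printf '    LocalForward %s %s:%s\n' "$3" "$4" "$5"
}

# Hypothetical alias and jump host; the forward values are from the slide.
# tunnel_stanza rds-tunnel bastion.mydomain.com 33306 \
#     MyRDSInstanceReplica.mydomain.com 3306 >> ~/.ssh/config
```

With the stanza in place, `ssh -N rds-tunnel` holds the tunnel open, and a developer can point a MySQL client at 127.0.0.1 port 33306.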
THE END.

Health and performance monitoring
http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/WhatIsCloudWatch.html
http://www.elasticsearch.org/overview/kibana
http://newrelic.com
https://www.icinga.org
http://www.severalnines.com/clustercontrol

Elastic scaling & Automating failure handling
"MyFleetAutoScalingGroup" : {
  "Type" : "AWS::AutoScaling::AutoScalingGroup",
  "Properties" : {
    "AvailabilityZones" : { "Fn::GetAZs" : { "Ref" : "AWS::Region" } },
    "LaunchConfigurationName" : { "Ref" : "MyFleetLaunchConfig" },
    "MinSize" : { "Ref" : "InstanceCountMin" },
    "MaxSize" : { "Ref" : "InstanceCountMax" },
    "DesiredCapacity" : { "Ref" : "InstanceCountDesired" },
    "LoadBalancerNames" : [
      { "Ref" : "ProductionPublicElasticLoadBalancer" },
      { "Ref" : "StagingPublicElasticLoadBalancer" }
    ]
  }
}

"MyFleetPublicElasticLoadBalancer" : {
  "Type" : "AWS::ElasticLoadBalancing::LoadBalancer",
  "Properties" : {
    "CrossZone" : true
  }
}

"MyFleetLaunchConfig" : {
  "Type" : "AWS::AutoScaling::LaunchConfiguration",
  "Properties" : {
    "UserData" : {
      "Fn::Base64" : {
        "Fn::Join" : ["", [
          "#!/bin/bash\n",
          "# e.g. fleetserver-1104eb3a.mydomain.com\n",
          "HOSTNAME=\"fleetserver-$(curl -s http://169.254.169.254/latest/meta-data/instance-id | cut -d '-' -f2).mydomain.com\"\n",
          "echo \"${HOSTNAME}\" > /etc/hostname\n",
          "hostname \"${HOSTNAME}\"\n",
          "apt-get update\n",
          "apt-get install -y puppet knockd\n",
          "knock puppet.mydomain.com 7777 3333\n",
          "puppet agent -tv --server puppet.mydomain.com --waitforcert 300 --configtimeout 300\n"
        ] ]
      }
    }
  }
}

node /^fleetserver/ {
  class { '::myfleet': }
}

service { 'myservice':
  ensure => running,
}
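
The hostname step in the user-data script is easy to get wrong, so it helps to try it without touching the metadata service. A minimal sketch, assuming the instance ID has the usual i-<hex> form. fleet_hostname is a hypothetical helper name; the prefix and domain are the ones used in the script above:

```sh
#!/bin/sh
# Build the fleet hostname from an EC2 instance ID by keeping the part
# after the first "-", the same cut used in the user-data script.
fleet_hostname() {
    id_suffix=$(printf '%s\n' "$1" | cut -d '-' -f2)
    printf 'fleetserver-%s.mydomain.com\n' "$id_suffix"
}

# fleet_hostname i-1104eb3a
# prints fleetserver-1104eb3a.mydomain.com
```

The resulting name matches the node /^fleetserver/ pattern in the Puppet manifest, which is how a freshly scaled instance picks up the ::myfleet class.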

Disaster recovery
"MyRDSInstance" : {
  "Type" : "AWS::RDS::DBInstance",
  "Properties" : {
    "MultiAZ" : "true"
  },
  "DeletionPolicy" : "Snapshot"
}

"Conditions" : {
  "CreateReadReplica" : { "Fn::Equals" : [ { "Ref" : "RDSReadReplica" }, "true" ] }
}

"MyRDSInstanceReplica" : {
  "Type" : "AWS::RDS::DBInstance",
  "Condition" : "CreateReadReplica",
  "Properties" : {
    "SourceDBInstanceIdentifier" : { "Ref" : "MyRDSInstance" }
  }
}

Re-cap (verb, /rēˈkap/): state again as a summary; recapitulate. "a way of recapping the story"
● Disasters are expensive in time & money for EVERYONE!
- Everything was gone: time, orders, changes on staging, new products, QA tests, the whole kit.
● Be agile and move quickly NOW!
- Restore to a trusted backup point, isolate the faulty data, and extract everything you can (product SKUs etc.). Never import bad data!
● If a 20% sacrifice can fix 100% of your problems, go! Go now!
- Puppet does not have to control Cluster Control 100%, and why should it?
- Getting a commercial license vs. actually hiring a DBA who knows Puppet & Galera.
● Log everything. Seriously, EVERYTHING!
● Devs and PMs are your most demanding customers. SALE NOW ON!
- And they are also your most prized possession. Keep them happy!
● Use Puppet and, if you can, CloudFormation. Everything must be in code! Living the dream .. <3
- If your infrastructure, applications and configurations are in code, you are only 1 commit away from a fix.
● Back up all the things. One must back up! xXx
- Backup, BACKUP, BAAACCCKKKUUUPPP!!!
THANK YOU!
