Our guest speaker will be Riaan Nolan of Foodpanda/Hellofood,
Rocket Internet’s global online food delivery marketplace, operating in over 40
countries.
● eCommerce infrastructure challenges - AIA Case Study (i)
● Provisioning highly available environments across multi-server and multi-AZs
● Building and maintaining configuration management systems such as Puppet
● Enabling self-service infrastructure services to internal dev teams
● Health and performance monitoring
● Elastic scaling & Automating failure handling
● Disaster recovery
http://www.severalnines.com/sites/default/files/AIA_Case_Study.pdf (i)
eCommerce infrastructure challenges
- AIA Case Study
Before Cluster Control
● Split Brain o.0
- One node started with an empty gcomm:// address, thank you Puppet :)
● Track back your data.
- Use anything (call centre logs, emails sent via AWS SES) to recreate your data.
● Disasters are expensive in time & money for EVERYONE!
- Everything was gone: time, orders, changes on staging, new products, QA tests, the whole kit.
● Be agile and move quickly NOW!
- Restore to a trusted backup point, isolate the faulty data, extract everything you can, product SKUs etc.
What have I learned
Workflow Before Cluster Control
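The split brain described above came from a node starting with an empty gcomm:// address, which tells Galera to bootstrap a new cluster instead of joining the existing one. A minimal sketch of a pre-deploy check that could catch this in a rendered config file; the function name and the idea of running it before Puppet applies the file are assumptions, not part of the original talk:

```python
import re

# Matches a wsrep_cluster_address setting whose gcomm:// URL lists no peers.
# Commented-out settings are ignored because the line must start with the key.
EMPTY_GCOMM = re.compile(
    r"""^\s*wsrep_cluster_address\s*=\s*["']?gcomm://["']?\s*(?:#.*)?$""",
    re.MULTILINE,
)


def has_empty_gcomm(config_text: str) -> bool:
    """Return True if the config would make the node bootstrap a new cluster."""
    return EMPTY_GCOMM.search(config_text) is not None


if __name__ == "__main__":
    rendered = "[mysqld]\nwsrep_cluster_address=gcomm://\n"
    if has_empty_gcomm(rendered):
        print("Refusing to deploy: empty gcomm:// address found")
```

A check like this could run in CI against the templates Puppet renders, so that a bad address fails a build rather than a production cluster.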
After Cluster Control
eCommerce infrastructure challenges
- AIA Case Study
● Look for help!
● If 20% sacrifice can fix 100% of your problems, go!
- Puppet does not have to control Cluster Control 100% and why should it?
- Getting a commercial license vs. actually hiring a DBA who knows Puppet & Galera.
● Fix bugs and refine your processes.
- Experience is never a bad thing, and it’s how you play the cards that you were dealt.
- Re-Create Disasters in a Sandbox
What have I learned
Workflow After Cluster Control
Provisioning highly available environments
across multi-server and multi-AZs
http://aws.amazon.com/cloudformation
http://aws.amazon.com/cloudformation/aws-cloudformation-templates
https://docs.puppetlabs.com
https://forge.puppetlabs.com
[Diagram: Hiera & Facter, Puppet Manifests, Hardware Stack]
Building and maintaining configuration
management systems such as Puppet
[Diagram: Hiera & Facter, Puppet Manifests, Hardware Stack]
https://help.github.com
https://raw.githubusercontent.com/nerdgirl/git-cheatsheet-visual/master/gitcheatsheet.png (Wallpaper)
https://docs.puppetlabs.com
https://forge.puppetlabs.com
Enabling self-service infrastructure services to
internal dev teams
WTF!
Enabling self-service infrastructure services to
internal dev teams - Seriously o.0
Create End-Points in your Workflow
● Devs and PMs are your most demanding customers. SALE NOW ON!
- And they are also your most prized possession. Keep them happy!
● They want everything protected, but with little holes.
- They will ask you to protect staging with .htaccess, but to let requests from the Payment Gateways through without it.
● Keeping them happy could be as simple as an SSH tunnel >>>
- In your SSH config add something like this:
LocalForward 33306 MyRDSInstanceReplica.mydomain.com:3306
● A Vagrant or Docker Environment.
- Use your existing Puppet code to spin up Docker or Vagrant instances.
● Create Environments for everyone to play in.
- So what if you already have staging? Create dev, dev2, qa, beta, whatever they need.
● Version Control
- Keep everything under Version Control, Duh!
THE END.
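The LocalForward line above only works inside a Host block of the SSH client config. A small sketch that renders such a block; the alias `staging-db` and the jump host `staging.mydomain.com` are hypothetical names chosen for illustration, while the forward rule itself is the one from the slide:

```python
def ssh_tunnel_block(alias: str, jump_host: str, local_port: int,
                     remote_host: str, remote_port: int) -> str:
    """Render an OpenSSH client config block with one LocalForward rule."""
    return "\n".join([
        f"Host {alias}",
        f"    HostName {jump_host}",
        f"    LocalForward {local_port} {remote_host}:{remote_port}",
    ]) + "\n"


if __name__ == "__main__":
    # Hypothetical alias and jump host; the forward target is from the slide.
    print(ssh_tunnel_block("staging-db", "staging.mydomain.com", 33306,
                           "MyRDSInstanceReplica.mydomain.com", 3306))
```

With a block like this in `~/.ssh/config`, a developer could run `ssh -N staging-db` and point a MySQL client at 127.0.0.1:33306.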
Health and performance monitoring
http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/WhatIsCloudWatch.html
http://www.elasticsearch.org/overview/kibana
http://newrelic.com
https://www.icinga.org
http://www.severalnines.com/clustercontrol
Elastic scaling & Automating failure handling
"MyFleetAutoScalingGroup" : {
  "Type" : "AWS::AutoScaling::AutoScalingGroup",
  "Properties" : {
    "AvailabilityZones" : { "Fn::GetAZs" : { "Ref" : "AWS::Region" } },
    "LaunchConfigurationName" : { "Ref" : "MyFleetLaunchConfig" },
    "MinSize" : { "Ref" : "InstanceCountMin" },
    "MaxSize" : { "Ref" : "InstanceCountMax" },
    "DesiredCapacity" : { "Ref" : "InstanceCountDesired" },
    "LoadBalancerNames" : [
      { "Ref" : "ProductionPublicElasticLoadBalancer" },
      { "Ref" : "StagingPublicElasticLoadBalancer" }
    ]
  }
},
"MyFleetPublicElasticLoadBalancer" : {
  "Type" : "AWS::ElasticLoadBalancing::LoadBalancer",
  "Properties" : {
    "CrossZone" : true
  }
},
"MyFleetLaunchConfig" : {
  "Type" : "AWS::AutoScaling::LaunchConfiguration",
  "Properties" : {
    "UserData" : {
      "Fn::Base64" : {
        "Fn::Join" : ["", [
          "#!/bin/bash\n",
          "# e.g. fleetserver-1104eb3a.mydomain.com\n",
          "HOSTNAME=\"fleetserver-$(curl -s http://169.254.169.254/latest/meta-data/instance-id | cut -d '-' -f2).mydomain.com\"\n",
          "echo \"${HOSTNAME}\" > /etc/hostname\n",
          "hostname \"${HOSTNAME}\"\n",
          "apt-get update\n",
          "apt-get install -y puppet knockd\n",
          "knock puppet.mydomain.com 7777 3333\n",
          "puppet agent -tv --server puppet.mydomain.com --waitforcert 300 --configtimeout 300\n"
        ] ]
      }
    }
  }
}
node /^fleetserver/ {
  class { '::myfleet': }
}
service { 'myservice':
  ensure => running,
}
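The UserData script above names each instance after its EC2 instance ID so that the Puppet node regex /^fleetserver/ picks it up automatically. A minimal Python sketch of the same naming rule, useful for checking what hostname a given instance will get; the function name is hypothetical and, unlike `cut`, it raises an error when the ID contains no hyphen:

```python
import re


def fleet_hostname(instance_id: str, domain: str = "mydomain.com") -> str:
    """Build the hostname from the second '-'-separated field of the ID."""
    fields = instance_id.split("-")
    if len(fields) < 2 or not fields[1]:
        raise ValueError(f"unexpected instance ID: {instance_id!r}")
    return f"fleetserver-{fields[1]}.{domain}"


if __name__ == "__main__":
    name = fleet_hostname("i-1104eb3a")
    print(name)
    # The Puppet node definition matches on this prefix.
    print(bool(re.match(r"^fleetserver", name)))
```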
Disaster recovery
"MyRDSInstance" : {
  "Type" : "AWS::RDS::DBInstance",
  "Properties" : {
    "MultiAZ" : "true"
  },
  "DeletionPolicy" : "Snapshot"
}
"Conditions" : {
  "CreateReadReplica" : { "Fn::Equals" : [ { "Ref" : "RDSReadReplica" }, "true" ] }
}
"MyRDSInstanceReplica" : {
  "Type" : "AWS::RDS::DBInstance",
  "Condition" : "CreateReadReplica",
  "Properties" : {
    "SourceDBInstanceIdentifier" : {
      "Ref" : "MyRDSInstance"
    }
  }
}
Re-cap: verb: rēˈkap/ - state again as a summary; recapitulate. "a way of recapping the story"
● Disasters are expensive in time & money for EVERYONE!
- Everything was gone: time, orders, changes on staging, new products, QA tests, the whole kit.
● Be agile and move quickly NOW!
- Restore to a trusted backup point, isolate the faulty data, extract everything you can, product SKUs etc. Never import bad data!
● If 20% sacrifice can fix 100% of your problems, go! Go Now!
- Puppet does not have to control Cluster Control 100% and why should it?
- Getting a commercial license vs. actually hiring a DBA who knows Puppet & Galera.
● Log everything. Seriously EVERYTHING!
● Devs and PMs are your most demanding customers. SALE NOW ON!
- And they are also your most prized possession. Keep them happy!
● Use Puppet and, if you can, CloudFormation. Everything must be in code! Living the dream .. <3
- If your infrastructure, applications and configurations are in code, you are only 1 commit away from a fix.
● Back up all the things. One must back up xXx
- Backup, BACKUP, BAAACCCKKKUUUPPP!!!
THANK YOU!
Who is using Cluster Control?
THANK YOU

Several Nines Cluster Control.pptx
