Improving Operations Efficiency with Puppet
April 17th, 2015
Nicolas Brousse | Sr. Director Of Operations Engineering | nicolas@tubemogul.com
Julien Fabre | Site Reliability Engineer | julien.fabre@tubemogul.com
Who are we?
TubeMogul
●  Enterprise software company for digital branding
●  Over 27 billion ads served in 2014
●  Over 30 billion ad auctions per day
●  Bids processed in less than 50 ms
●  Bids served in less than 80 ms (including network round trip)
●  5 PB of monthly video traffic served
●  1.1 EB of data stored
Operations Engineering
●  Ensure the smooth day-to-day operation of the platform infrastructure
●  Provide a cost-effective and cutting-edge infrastructure
●  Team composed of SREs, SEs and DBAs
●  Managing over 2,500 servers (virtual and physical)
Our Infrastructure
Public Cloud On Premises
Multiple locations with a mix of Public Cloud and On Premises
Technology Hoarders
●  Java (a lot!)
●  MySQL
●  Couchbase
●  Vertica
●  Kafka
●  Storm
●  Zookeeper, Exhibitor
●  Hadoop, HBase, Hive
●  Terracotta
●  ElasticSearch, Kibana
●  LogStash
●  PHP, Python, Ruby, Go...
●  Apache httpd
●  Nagios
●  Ganglia
●  Graphite
●  Memcached
●  Puppet
●  HAproxy
●  OpenStack
●  Git and Gerrit
●  Gor
●  ActiveMQ
●  OpenLDAP
●  Redis
●  Blackbox
●  Jenkins, Sonar
●  Tomcat
●  Jetty (embedded)
●  AWS DynamoDB, EC2, S3...
Five Years Of Puppet!
●  2008 - 2010: Use SVN, Bash scripts and custom templates.
●  2010: Managing about 250 instances. Start looking at Puppet.
●  2011: Started with Puppet 0.25 then upgraded to 2.7 by EOY on 400 servers with 2 contributors.
●  2012: 800 servers managed by Puppet. 4 contributors.
●  2013: 1,000 servers managed by Puppet. 6 contributors.
●  2014: 1,500 servers managed by Puppet. Workflow using Git, Gerrit and Jenkins. 9 contributors. Start migration to 3.7.
●  2015: 2,000 servers managed by Puppet. 13 contributors.
Puppet Stats
●  2,000 nodes
●  225 unique node definitions
●  1 puppetmaster
●  112 Puppet modules
Where and how do we use Puppet?
●  Virtual and Physical Servers Configuration: Master mode
●  Building AWS AMI with Packer: Master mode
●  Local development environment with Vagrant: Master mode
●  OpenStack deployment: Masterless mode
Code Review?
A Powerful Gerrit Integration
●  Gerrit, an industry standard: Eclipse, Google, Chromium, OpenStack, WikiMedia, LibreOffice, Spotify, GlusterFS, etc.
●  Fine-Grained Permission Rules
●  Plugged into LDAP
●  Code Review per commit
●  Stream Events
●  Use GitBlit
●  Integrated with Jenkins and Jira
●  Managing about 600 Git repositories
Gerrit in Action
Continuous Delivery with Jenkins
●  1 job per module
●  1 job for the manifests and hiera data
●  1 job for the Puppet fileserver
●  1 job to deploy
Global Jenkins stats for the past year
●  ~10,000 Puppet deployments
●  Over 8,500 production app deployments
Team Awareness: HipChat Integration with Hubot
Puppet Continuous Delivery Workflow: The Vision
Infrastructure As Code
●  Follow standard development lifecycle
●  Repeatable and consistent server provisioning
Continuous Delivery
●  Iterate quickly
●  Automated code review to improve code quality
Reliability
●  Improve Production Stability
●  Enforce Better Security Practices
The Workflow
The Workflow : Puppet code logic
Puppet environments
●  Dedicated node manifests (*.pp)
●  Modules deployed by branch with Git submodules
All the data in Hiera
●  Try to avoid params.pp class
●  Store everything : modules parameters, classes, keys, passwords, ...
Puppet Code Hierarchy
/etc/puppet
├── puppet.conf, hiera.yaml, *.conf
├── hiera
└── environments
    ├── dev
    │   ├── manifests
    │   │   ├── nodes/*.pp
    │   │   └── site.pp
    │   └── modules              (Git submodules, branch dev)
    │       ├── activemq
    │       ├── apache
    │       ├── apf
    │       ...
    │       └── zookeeper
    └── production
        ├── manifests
        │   ├── nodes/*.pp
        │   └── site.pp
        └── modules              (Git submodules, branch production)
            ├── activemq
            …
            └── zookeeper
Hiera Configuration
$ cat /etc/puppet/hiera.yaml
---
:backends:
  - eyaml
  - yaml
:yaml:
  :datadir: /etc/puppet/hiera
:eyaml:
  :datadir: /etc/puppet/hiera
  :extension: 'yaml'
  :pkcs7_private_key: /var/lib/puppet/hiera_keys/private_key.pkcs7.pem
  :pkcs7_public_key: /var/lib/puppet/hiera_keys/public_key.pkcs7.pem
:hierarchy:
  - fqdn/%{::fqdn}
  - "%{::zone}/%{::vpc}/%{::hostgroup}"
  - "%{::zone}/%{::vpc}/all"
  - "%{::zone}/%{::hostgroup}"
  - "%{::zone}/all"
  - hostname/%{::hostname}
  - hostgroup/%{::hostgroup}
  - environment/%{::environment}
  - common
:merge_behavior: deeper
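A rough Ruby sketch of how a hierarchy like this one resolves a key: each level is interpolated with the node's facts, and the most specific level that holds the key wins. This is a simplified model for illustration, not Hiera's own code. The helper names, facts and data values are invented, only four of the levels are shown, and `:merge_behavior` is ignored.

```ruby
# Simplified model of a Hiera priority lookup (illustration only, not Hiera's code).

# Replace each %{::fact} token with the matching fact value.
def interpolate(level, facts)
  level.gsub(/%\{::(\w+)\}/) { facts.fetch(Regexp.last_match(1), '') }
end

# Walk the hierarchy top-down and return the first value found for the key.
def lookup(key, hierarchy, facts, data)
  hierarchy.each do |level|
    values = data[interpolate(level, facts)]
    return values[key] if values && values.key?(key)
  end
  nil
end

hierarchy = [
  'fqdn/%{::fqdn}',
  '%{::zone}/%{::hostgroup}',
  '%{::zone}/all',
  'common'
]

# Hypothetical facts and data files, keyed by path relative to the datadir.
facts = { 'fqdn' => 'web01.example.com', 'zone' => 'us-east', 'hostgroup' => 'web' }
data = {
  'us-east/web' => { 'ntp::server' => 'ntp.us-east.example.com' },
  'common'      => { 'ntp::server' => 'ntp.example.com', 'motd' => 'hello' }
}

puts lookup('ntp::server', hierarchy, facts, data) # found at the zone/hostgroup level
puts lookup('motd', hierarchy, facts, data)        # falls through to common
```

Ordering the levels from most to least specific is what lets a single host or cluster override a value without touching `common`.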
Encrypt Your Secrets
Hiera eyaml: github.com/TomPoulton/hiera-eyaml
●  Hiera backend
●  Easy to use
●  Powerful CLI: eyaml edit /etc/puppet/hiera/secrets.yaml
$ cat secret.yaml
---
ec2::access_key_id: ENC[PKCS7,MIIBiQYJKoZIhvcNAQcDoIIBejCCAXYCAQAxggEhMII
IBHQIBADAFMAACAQEwDQYJKoZIhvcNAQEBBQAEggEAVIa28OwyaqI5N1TDCvVkBZz3YG+s
+Hfzr0lqgcvRCIuJGpq28sQmmuBaQjWY38i86ZSFu0gM6saOHfG64OzVlurO7k/
l0CKeL0JfXNaVM4TUqMaN9dSkL5e2vsmpLKrMASawmarqbLYwllTrTe32H4NWxU1e
+qWLeUMr9ciBnA3W1Azm4RIo+3bsvgvMfdks....=]
Encrypt Files
Blackbox : github.com/StackExchange/blackbox
●  Use GPG to encrypt secret files
●  Easy to add/delete team members
●  No need to change your Puppet code !
# Encrypted copy kept in Git: modules/${module_name}/files/credentials.yaml.gpg
file { '/etc/app/credentials.yaml':
  ensure => 'file',
  owner  => 'root',
  group  => 'root',
  mode   => '0644',
  # Double quotes are required for ${module_name} to be interpolated
  source => "puppet:///modules/${module_name}/credentials.yaml",
}
The Workflow
The Workflow : bottlenecks
●  Only Ops team members can commit (SRE, SE)
●  Review and validation is done only by a SRE
●  Jenkins will verify the code but will not validate the commit
●  Static Puppet environments
●  Rely a lot on server hostnames
Puppet Workflow Reloaded!
Flexibility: R10K (github.com/adrienthebo/r10k)
●  Dynamic environments
●  No Git submodules anymore! :-)
●  Easy to reproduce any environment
●  Can use private and forge Puppet modules
●  Can use branches and tags
●  Based on Puppetfile
R10K
$ cat Puppetfile
forge "https://forgeapi.puppetlabs.com"

# Forge modules
mod 'pdxcat/collectd'
mod 'puppetlabs/rabbitmq'
mod 'arioch/redis'
mod 'maestrodev/wget'
mod 'puppetlabs/apt'
mod 'puppetlabs/stdlib'

# Tubemogul modules
mod "hosts",
  :git    => 'ssh://<gerrit_host>/puppet/modules/hosts',
  :branch => 'dev'
mod "timezone",
  :git    => 'ssh://<gerrit_host>/puppet/modules/timezone',
  :branch => 'dev'
...
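Since a Puppetfile is plain Ruby, the `forge` and `mod` keywords can be read with a very small DSL evaluator. The sketch below only illustrates that idea and is not how r10k itself is implemented. `parse_puppetfile` and the shape of the returned hash are hypothetical, and the sample text reuses two entries from the Puppetfile above.

```ruby
# Minimal Puppetfile reader (illustration only; r10k's real parser is more complete).
def parse_puppetfile(text)
  result = { forge: nil, modules: [] }
  dsl = Object.new

  # DSL keyword: forge "<url>"
  dsl.define_singleton_method(:forge) { |url| result[:forge] = url }

  # DSL keyword: mod "<name>", <options>; a :git option marks a private module.
  dsl.define_singleton_method(:mod) do |name, options = {}|
    source = options.key?(:git) ? :git : :forge
    result[:modules] << { name: name, source: source, options: options }
  end

  # Evaluate the text so that forge/mod resolve to the methods defined above.
  dsl.instance_eval(text)
  result
end

puppetfile = <<~'PUPPETFILE'
  forge "https://forgeapi.puppetlabs.com"

  mod 'puppetlabs/stdlib'
  mod "hosts",
    :git    => 'ssh://<gerrit_host>/puppet/modules/hosts',
    :branch => 'dev'
PUPPETFILE

parsed = parse_puppetfile(puppetfile)
parsed[:modules].each do |m|
  branch = m[:options][:branch] || 'no branch pinned'
  puts "#{m[:name]} (#{m[:source]}, #{branch})"
end
```

Because the file is evaluated as code, a Puppetfile should only ever come from a trusted, reviewed repository.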
Puppet Workflow Reloaded!
Better code organization: Roles and Profiles
●  Represent the business logic: Roles
    o  Highest abstraction layer
    o  Use Profiles for implementation
●  Implement the applications: Profiles
    o  Remove potential code duplication
    o  Use modules and other Puppet resources
Roles/Profiles Pattern
# Role: business-level description of the node, built only from profiles
class role::logs {
  include profile::base
  include profile::logstash::server
  include profile::elasticsearch
}

# Profile: wires the logstash module together with site data from Hiera
class profile::logstash::server {
  $version    = hiera('profile::logstash::server::version', '1.4.2')
  $es_host    = hiera('profile::logstash::server::es_host', 'es01')
  $redis_host = hiera('profile::logstash::server::redis_host', 'redis01')

  class { 'logstash':
    package_url  => "https://download.elasticsearch.org/logstash/.../logstash_${version}.deb",
    java_install => true,
  }

  logstash::configfile { 'input_redis':
    content => template('logstash/configfile/logstash.input_redis.conf.erb'),
    order   => 10,
  }

  logstash::configfile { 'output_es':
    content => template('logstash/configfile/logstash.output_es.conf.erb'),
    order   => 30,
  }
}
Do not rely on hostname : nodeless approach
●  Facts to guide Puppet
●  No node myawesomeserver { } anymore
●  Enforce a cluster vision
●  site.pp gives the configuration logic
Puppet Workflow Reloaded!
# /etc/puppet/manifests/site.pp
node default {
  if $::ec2_tag_tm_role {
    notify { "Using role : ${::ec2_tag_tm_role}": }
    include "role::${::ec2_tag_tm_role}"
  } else {
    fail('No role found. Nothing to configure.')
  }
}
AWS EC2 tags
●  Specify tags during the provisioning
●  Retrieve tags with AWS Ruby SDK and create facts
●  New hierarchy
$ facter -p | grep ec2_tag
ec2_tag_cluster => rtb-bidder
ec2_tag_nagios_host => mgmt01
ec2_tag_name => bidder
ec2_tag_pupenv => production
ec2_tag_tm_role => rtb::bidder
:hierarchy:
  - "%{::zone}/%{::ec2_tag_vpc}/%{::ec2_tag_cluster}"
  - "%{::zone}/%{::ec2_tag_vpc}/all"
  - "%{::zone}/all"
  - vpc/%{::ec2_tag_vpc}/%{::ec2_tag_cluster}
  - vpc/%{::ec2_tag_vpc}/all
  - environment/%{::environment}
  - common
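The path from EC2 tags to a role class can be sketched in plain Ruby. The example is only an illustration under stated assumptions: the tag hash is passed in directly instead of being fetched with the AWS SDK, the `ec2_tag_` naming rule (lowercase, non-alphanumerics replaced by underscores) is a guess that happens to match the fact names listed above, and both helper names are hypothetical.

```ruby
# Turn EC2 tags into ec2_tag_* facts, then pick the role class as site.pp does.
# In a real custom fact the tags would come from the AWS SDK; here they are given.

# Build fact names from tag keys (the normalisation rule is an assumption).
def tags_to_facts(tags)
  tags.each_with_object({}) do |(key, value), facts|
    name = key.downcase.gsub(/[^a-z0-9]+/, '_')
    facts["ec2_tag_#{name}"] = value
  end
end

# Same decision as the default node in site.pp: role::<tm_role>, or fail.
def role_class(facts)
  role = facts['ec2_tag_tm_role']
  raise ArgumentError, 'No role found. Nothing to configure.' if role.nil? || role.empty?

  "role::#{role}"
end

# Hypothetical tags set at provisioning time.
tags = { 'Name' => 'bidder', 'cluster' => 'rtb-bidder', 'tm_role' => 'rtb::bidder' }
facts = tags_to_facts(tags)
puts facts
puts role_class(facts)
```

Failing loudly when the role tag is missing keeps an untagged instance from being configured with an empty catalog.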
New merging and reviewing rules
●  Everyone can commit Puppet code
●  Allow everyone to review a Puppet change (+1)
●  Allow SE and SRE to validate a Puppet change (+2)
●  Auto validation/merging in dev if at least 80% of test (+2)
Next improvements
●  Acceptance testing with Beaker and Docker
●  Full test provisioning with ServerSpec
●  PuppetDB to improve the reporting
●  Dedicated Puppet Masters
OpenSource Modules
●  tubemogul-aptly
●  tubemogul-blackbox
●  tubemogul-codedeploy
●  tubemogul-gor
●  tubemogul-packer
●  tubemogul-tmfile
●  tubemogul-storm
●  tubemogul-kafka
Nicolas Brousse: @orieg
Julien Fabre: @julien_fabre

Improving Operations Efficiency with Puppet

  • 1. Improving Operations Efficiency with Puppet April 17th, 2015 Nicolas Brousse | Sr. Director Of Operations Engineering | nicolas@tubemogul.com Julien Fabre | Site Reliability Engineer | julien.fabre@tubemogul.com
  • 2. Who are we? TubeMogul ●  Enterprise software company for digital branding ●  Over 27 Billions Ads served in 2014 ●  Over 30 Billions Ad Auctions per day ●  Bid processed in less than 50 ms ●  Bid served in less than 80 ms (include network round trip) ●  5 PB of monthly video traffic served ●  1.1 EB of data stored Operations Engineering ●  Ensure the smooth day to day operation of the platform infrastructure ●  Provide a cost effective and cutting edge infrastructure ●  Team composed of SREs, SEs and DBAs ●  Managing over 2,500 servers (virtual and physical)
  • 3. Our Infrastructure Public Cloud On Premises Multiple locations with a mix of Public Cloud and On Premises
  • 4. ●  Java (a lot!) ●  MySQL ●  Couchbase ●  Vertica ●  Kafka ●  Storm ●  Zookeeper, Exhibitor ●  Hadoop, HBase, Hive ●  Terracotta ●  ElasticSearch, Kibana ●  LogStash ●  PHP, Python, Ruby, Go... ●  Apache httpd ●  Nagios ●  Ganglia Technology Hoarders ●  Graphite ●  Memcached ●  Puppet ●  HAproxy ●  OpenStack ●  Git and Gerrit ●  Gor ●  ActiveMQ ●  OpenLDAP ●  Redis ●  Blackbox ●  Jenkins, Sonar ●  Tomcat ●  Jetty (embedded) ●  AWS DynamoDB, EC2, S3...
  • 5. ●  2008 - 2010: Use SVN, Bash scripts and custom templates. ●  2010: Managing about 250 instances. Start looking at Puppet. ●  2011: Started with Puppet 0.25 then upgraded to 2.7 by EOY on 400 servers with 2 contributors. ●  2012: 800 servers managed by Puppet. 4 contributors. ●  2013: 1,000 servers managed by Puppet. 6 contributors. ●  2014: 1,500 servers managed by Puppet. Workflow using Git, Gerrit and Jenkins. 9 contributors. Start migration to 3.7. ●  2015: 2,000 servers managed by Puppet. 13 contributors. Five Years Of Puppet!
  • 6. ●  2000 nodes ●  225 unique nodes definition ●  1 puppetmaster ●  112 Puppet modules Puppet Stats
  • 7. ●  Virtual and Physical Servers Configuration : Master mode ●  Building AWS AMI with Packer : Master mode ●  Local development environment with Vagrant : Master mode ●  OpenStack deployment : Masterless mode Where and how do we use Puppet ?
  • 9. ●  Gerrit, an industry standard : Eclipse, Google, Chromium, OpenStack, WikiMedia, LibreOffice, Spotify, GlusterFS, etc... ●  Fine Grained Permissions Rules ●  Plugged to LDAP ●  Code Review per commit ●  Stream Events ●  Use GitBlit ●  Integrated with Jenkins and Jira ●  Managing about 600 Git repositories A Powerful Gerrit Integration
  • 11. Continuous Delivery with Jenkins
    ●  1 job per module
    ●  1 job for the manifests and hiera data
    ●  1 job for the Puppet fileserver
    ●  1 job to deploy
    Global Jenkins stats for the past year
    ●  ~10,000 Puppet deployments
    ●  Over 8,500 production app deployments
  • 12. Team Awareness: HipChat Integration with Hubot
  • 13. Puppet Continuous Delivery Workflow: The Vision
    Infrastructure As Code
    ●  Follow standard development lifecycle
    ●  Repeatable and consistent server provisioning
    Continuous Delivery
    ●  Iterate quickly
    ●  Automated code review to improve code quality
    Reliability
    ●  Improve production stability
    ●  Enforce better security practices
  • 15. The Workflow: Puppet code logic
    Puppet environments
    ●  Dedicated node manifests (*.pp)
    ●  Modules deployed by branch with Git submodules
    All the data in Hiera
    ●  Try to avoid the params.pp class
    ●  Store everything: module parameters, classes, keys, passwords, ...
  • 16. Puppet Code Hierarchy

    /etc/puppet
    ├── puppet.conf, hiera.yaml, *.conf
    ├── hiera
    └── environments
        ├── dev
        │   ├── manifests
        │   │   ├── nodes/*.pp
        │   │   └── site.pp
        │   └── modules              (Git submodules, branch dev)
        │       ├── activemq
        │       ├── apache
        │       ├── apf
        │       ...
        │       └── zookeeper
        └── production
            ├── manifests
            │   ├── nodes/*.pp
            │   └── site.pp
            └── modules              (Git submodules, branch production)
                ├── activemq
                …
                └── zookeeper
  • 17. Hiera Configuration

    $ cat /etc/puppet/hiera.yaml
    ---
    :backends:
      - eyaml
      - yaml
    :yaml:
      :datadir: /etc/puppet/hiera
    :eyaml:
      :datadir: /etc/puppet/hiera
      :extension: 'yaml'
      :pkcs7_private_key: /var/lib/puppet/hiera_keys/private_key.pkcs7.pem
      :pkcs7_public_key: /var/lib/puppet/hiera_keys/public_key.pkcs7.pem
    :hierarchy:
      - fqdn/%{::fqdn}
      - "%{::zone}/%{::vpc}/%{::hostgroup}"
      - "%{::zone}/%{::vpc}/all"
      - "%{::zone}/%{::hostgroup}"
      - "%{::zone}/all"
      - hostname/%{::hostname}
      - hostgroup/%{::hostgroup}
      - environment/%{::environment}
      - common
    :merge_behavior: deeper
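    The hierarchy on this slide is searched from the most specific level (fqdn) down to common. The Ruby sketch below is only an illustration of that ordering, not Hiera's actual implementation: it expands the %{::fact} tokens against a set of facts and lists the data files that would be consulted, in order. The fact values (bidder01.example.com, us-east-1, and so on) are hypothetical, and skipping a level whose fact is missing is a simplification chosen for the sketch.

```ruby
# Illustrative sketch only: mimics the lookup order of the hierarchy above.
# It is NOT Hiera's real code. Fact values below are hypothetical.

HIERARCHY = [
  'fqdn/%{::fqdn}',
  '%{::zone}/%{::vpc}/%{::hostgroup}',
  '%{::zone}/%{::vpc}/all',
  '%{::zone}/%{::hostgroup}',
  '%{::zone}/all',
  'hostname/%{::hostname}',
  'hostgroup/%{::hostgroup}',
  'environment/%{::environment}',
  'common'
].freeze

# Hypothetical facts for one node; note that no 'vpc' fact is set.
EXAMPLE_FACTS = {
  'fqdn'        => 'bidder01.example.com',
  'hostname'    => 'bidder01',
  'zone'        => 'us-east-1',
  'hostgroup'   => 'bidder',
  'environment' => 'production'
}.freeze

# Returns the candidate data files, most specific first.
# Simplification: a level that references a missing fact is skipped.
def candidate_files(hierarchy, facts, datadir: '/etc/puppet/hiera', extension: 'yaml')
  hierarchy.each_with_object([]) do |level, files|
    missing = false
    path = level.gsub(/%\{::(\w+)\}/) do
      value = facts[Regexp.last_match(1)]
      missing = true if value.nil? || value.empty?
      value.to_s
    end
    files << File.join(datadir, "#{path}.#{extension}") unless missing
  end
end

puts candidate_files(HIERARCHY, EXAMPLE_FACTS)
```

    With :merge_behavior: deeper, hash values found at several of these levels are merged rather than the first match winning outright, which is why the order of the list matters.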
  • 18. Encrypt Your Secrets
    Hiera eyaml: github.com/TomPoulton/hiera-eyaml
    ●  Hiera backend
    ●  Easy to use
    ●  Powerful CLI: eyaml edit /etc/puppet/hiera/secrets.yaml

    $ cat secret.yaml
    ---
    ec2::access_key_id: ENC[PKCS7,MIIBiQYJKoZIhvcNAQcDoIIBejCCAXYCAQAxggEhMII
      IBHQIBADAFMAACAQEwDQYJKoZIhvcNAQEBBQAEggEAVIa28OwyaqI5N1TDCvVkBZz3YG+s
      +Hfzr0lqgcvRCIuJGpq28sQmmuBaQjWY38i86ZSFu0gM6saOHfG64OzVlurO7k/
      l0CKeL0JfXNaVM4TUqMaN9dSkL5e2vsmpLKrMASawmarqbLYwllTrTe32H4NWxU1e
      +qWLeUMr9ciBnA3W1Azm4RIo+3bsvgvMfdks....=]
  • 19. Encrypt Files
    Blackbox: github.com/StackExchange/blackbox
    ●  Use GPG to encrypt secret files
    ●  Easy to add/delete team members
    ●  No need to change your Puppet code!

    # modules/${module_name}/files/credentials.yaml.gpg
    # The source string is double-quoted so that ${module_name} is interpolated.
    file { '/etc/app/credentials.yaml':
      ensure => 'file',
      owner  => 'root',
      group  => 'root',
      mode   => '0644',
      source => "puppet:///modules/${module_name}/credentials.yaml",
    }
  • 21. The Workflow: bottlenecks
    ●  Only Ops team members can commit (SRE, SE)
    ●  Review and validation are done only by an SRE
    ●  Jenkins will verify the code but will not validate the commit
    ●  Static Puppet environments
    ●  Relies heavily on server hostnames
  • 22. Puppet Workflow Reloaded!
    Flexibility: R10K (github.com/adrienthebo/r10k)
    ●  Dynamic environments
    ●  No Git submodules anymore! :-)
    ●  Easy to reproduce any environment
    ●  Can use private and Forge Puppet modules
    ●  Can use branches and tags
    ●  Based on Puppetfile
  • 23. R10K

    $ cat Puppetfile
    forge "https://forgeapi.puppetlabs.com"

    # Forge modules
    mod 'pdxcat/collectd'
    mod 'puppetlabs/rabbitmq'
    mod 'arioch/redis'
    mod 'maestrodev/wget'
    mod 'puppetlabs/apt'
    mod 'puppetlabs/stdlib'

    # Tubemogul modules
    mod "hosts",
      :git    => 'ssh://<gerrit_host>/puppet/modules/hosts',
      :branch => 'dev'

    mod "timezone",
      :git    => 'ssh://<gerrit_host>/puppet/modules/timezone',
      :branch => 'dev'
    ...
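    A Puppetfile is plain Ruby: forge and mod are ordinary method calls. The sketch below shows, under that assumption only, how such a file could be evaluated to collect the declared modules. It is a toy evaluator, not r10k's real implementation; the class name PuppetfileSketch and the host gerrit.example.com are hypothetical stand-ins.

```ruby
# Toy evaluator for the Puppetfile DSL shown on this slide.
# NOT r10k's implementation; PuppetfileSketch and gerrit.example.com are hypothetical.
class PuppetfileSketch
  attr_reader :forge_url, :modules

  def initialize
    @forge_url = nil
    @modules = []
  end

  # Records the Forge endpoint declared at the top of the file.
  def forge(url)
    @forge_url = url
  end

  # Records one module; a :git option marks it as a private Git module.
  def mod(name, options = nil)
    source = options.is_a?(Hash) && options[:git] ? :git : :forge
    @modules << { name: name, source: source, options: options || {} }
  end

  # Evaluates Puppetfile text in the context of a fresh instance.
  def self.parse(text)
    new.tap { |dsl| dsl.instance_eval(text) }
  end
end

SAMPLE_PUPPETFILE = <<~'PUPPETFILE'
  forge "https://forgeapi.puppetlabs.com"

  mod 'puppetlabs/stdlib'

  mod "hosts",
    :git    => 'ssh://gerrit.example.com/puppet/modules/hosts',
    :branch => 'dev'
PUPPETFILE

parsed = PuppetfileSketch.parse(SAMPLE_PUPPETFILE)
parsed.modules.each { |m| puts "#{m[:name]} (#{m[:source]})" }
```

    Because each Git module carries its own :branch (or tag), pointing an environment at dev or production is a matter of editing the Puppetfile instead of updating Git submodules.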
  • 24. Puppet Workflow Reloaded!
    Better code organization: Roles and Profiles
    ●  Represent the business logic: Roles
       o  Highest abstraction layer
       o  Use Profiles for implementation
    ●  Implement the applications: Profiles
       o  Remove potential code duplication
       o  Use modules and other Puppet resources
  • 25. Roles/Profiles Pattern

    # A role only composes profiles.
    class role::logs {
      include profile::base
      include profile::logstash::server
      include profile::elasticsearch
    }

    # Named profile::logstash::server to match the include above and the Hiera keys below.
    class profile::logstash::server {
      $version    = hiera('profile::logstash::server::version', '1.4.2')
      $es_host    = hiera('profile::logstash::server::es_host', 'es01')
      $redis_host = hiera('profile::logstash::server::redis_host', 'redis01')

      class { 'logstash':
        package_url  => "https://download.elasticsearch.org/logstash/.../logstash_${version}.deb",
        java_install => true,
      }

      logstash::configfile { 'input_redis':
        content => template('logstash/configfile/logstash.input_redis.conf.erb'),
        order   => 10,
      }

      logstash::configfile { 'output_es':
        content => template('logstash/configfile/logstash.output_es.conf.erb'),
        order   => 30,
      }
    }
  • 26. Puppet Workflow Reloaded!
    Do not rely on hostname: nodeless approach
    ●  Facts to guide Puppet
    ●  No node myawesomeserver { } anymore
    ●  Enforce a cluster vision
    ●  site.pp gives the configuration logic

    # /etc/puppet/manifests/site.pp
    node default {
      if $::ec2_tag_tm_role {
        notify { "Using role : ${::ec2_tag_tm_role}": }
        include "role::${::ec2_tag_tm_role}"
      } else {
        fail('No role found. Nothing to configure.')
      }
    }
  • 27. AWS EC2 tags
    ●  Specify tags during the provisioning
    ●  Retrieve tags with AWS Ruby SDK and create facts
    ●  New hierarchy

    $ facter -p | grep ec2_tag
    ec2_tag_cluster => rtb-bidder
    ec2_tag_nagios_host => mgmt01
    ec2_tag_name => bidder
    ec2_tag_pupenv => production
    ec2_tag_tm_role => rtb::bidder

    :hierarchy:
      - "%{::zone}/%{::ec2_tag_vpc}/%{::ec2_tag_cluster}"
      - "%{::zone}/%{::ec2_tag_vpc}/all"
      - "%{::zone}/all"
      - vpc/%{::ec2_tag_vpc}/%{::ec2_tag_cluster}
      - vpc/%{::ec2_tag_vpc}/all
      - environment/%{::environment}
      - common
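    The slide does not show how the tags become facts, so the following Ruby sketch is only one plausible shape for it. It covers the naming step (tag key to ec2_tag_<key>) and the role choice made in site.pp. The tag keys and their casing in EXAMPLE_TAGS are assumptions; fetching the tags with the AWS SDK and registering each value with Facter.add are deliberately left out so the sketch runs without any gems.

```ruby
# Sketch under assumptions: tag keys/casing below are hypothetical, and the
# AWS SDK call that would fetch them is omitted on purpose.

# Hypothetical tags as they might be set at provisioning time.
EXAMPLE_TAGS = {
  'Name'    => 'bidder',
  'cluster' => 'rtb-bidder',
  'pupenv'  => 'production',
  'tm_role' => 'rtb::bidder'
}.freeze

# Maps each tag to a fact name of the form ec2_tag_<key>, lowercased,
# with any character outside [a-z0-9_] replaced by an underscore.
def tags_to_facts(tags)
  tags.each_with_object({}) do |(key, value), facts|
    fact_name = "ec2_tag_#{key.downcase.gsub(/[^a-z0-9_]/, '_')}"
    facts[fact_name] = value
  end
end

# Mirrors the site.pp logic: pick the role class from the tm_role fact,
# or fail when the node carries no role.
def role_class(facts)
  role = facts['ec2_tag_tm_role']
  raise ArgumentError, 'No role found. Nothing to configure.' if role.nil? || role.empty?

  "role::#{role}"
end

facts = tags_to_facts(EXAMPLE_TAGS)
puts role_class(facts)
```

    Since the role comes from a tag rather than a hostname, any instance launched with the same tm_role tag receives the same configuration, which is what makes the nodeless site.pp workable.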
  • 28. New merging and reviewing rules
    ●  Everyone can commit Puppet code
    ●  Allow everyone to review a Puppet change (+1)
    ●  Allow SE and SRE to validate a Puppet change (+2)
    ●  Auto validation/merging in dev if at least 80% of test (+2)
  • 29. Next improvements
    ●  Acceptance testing with Beaker and Docker
    ●  Full test provisioning with ServerSpec
    ●  PuppetDB to improve the reporting
    ●  Dedicated Puppet Masters
  • 30. Open Source Modules
    ●  tubemogul-aptly
    ●  tubemogul-blackbox
    ●  tubemogul-codedeploy
    ●  tubemogul-gor
    ●  tubemogul-packer
    ●  tubemogul-tmfile
    ●  tubemogul-storm
    ●  tubemogul-kafka