Automation at Brainly 
… or how to enter the world of automation in a “different way”.
About Brainly 
World’s largest homework help social network, connecting over 25 million users monthly 
OPS stack: 
● ~80 servers, heavy usage of LXC containers (~1000) 
● 99.9% Debian, 1 Ubuntu host :) 
● Nginx / Apache2, 2k reqs per sec 
● 147 million page views monthly 
● 700Mbps peak traffic 
● Python is dominant 
DEV stack: 
● PHP 
- Symfony 2 
- SOA projects 
- 200 reqs per sec on the Russian version 
● Erlang 
- 45k concurrent users 
- 22k events per sec 
● Native Apps 
- iOS 
- Android
Starting point 
● Puppet was not feasible for us 
- *lots* of dependencies, which make containers bigger/heavier 
- problems with Puppet's declarative language, order of parsing is important 
- seemed incoherent, lacking integration of orchestration 
- steep learning curve 
- YMMV 
- we're a Python shop ;) 
● "packaging as automation" as an intermediate solution 
- dependency hell, installing one package could result in uninstalling others 
- inflexible, lots of code duplication in debian/rules file 
- LOTS of custom bash and PHP scripts, usually very hard to reuse and not standardized 
- this was a dead end :( 
● Ansible 
- initially used only for orchestration 
- maintaining it required keeping an up-to-date inventory, which later simplified and helped with lots of things
First steps with Ansible 
● we decided to move forward with Ansible and use it for setting up machines as well 
● first project was Nagios monitoring plugins setup (check-growth, available on GitHub BTW) 
● turned out to be ideal for containers and our needs in general 
- very few dependencies to begin with (python2, python-apt), and a small footprint - "configured" Python modules are transferred directly to the machine, no need for local repositories 
- very light, no compilation on the destination host is needed 
- easy to understand. Tasks/playbooks map directly to actions an ops/devops would have taken when doing it by hand 
- compatible with "automation by packages". We were able to migrate from the old system in small steps.
Avoiding regressions 
● we decided to learn from our mistakes and do it right this time 
● all policies, rules, and good practices are written down in the main directory of the automation repo 
● helps with introducing new people into the team and with the devops approach 
- newbies are able to start committing to the repo quickly 
- what's in GUIDELINES.md is law, and changing it requires wider consensus 
- gives examples of how to deal with certain problems in a standardized way 
● a few examples: 
- limit the number of tags; each of them should be self-contained, with no cross-dependencies 
- each branch should be named <your-name>/<branch-name>, with master being in production 
- do not include roles/tasks inside other roles; this creates hard-to-follow dependencies 
- NEVER subset the list of hosts inside the role, do it in site.yml. Otherwise debugging roles/hosts will become difficult 
- think twice before adding a new role and especially new groups. As infrastructure grows, it becomes hard to manage and/or creates "dead code/roles" 
- create reusable roles by moving common code into *_base roles (e.g. apache2_base).
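A guideline such as the branch naming rule above lends itself to a mechanical check, for example in a Git hook. The following is a minimal sketch only: the allowed character set and the function name branch_name_ok are assumptions made for illustration, not something taken from GUIDELINES.md.

```python
import re

# Assumed shape: "<your-name>/<branch-name>", each part made of lowercase
# letters, digits, dots, dashes or underscores (the character set is a guess).
BRANCH_RE = re.compile(r"^[a-z0-9._-]+/[a-z0-9._-]+$")


def branch_name_ok(branch: str) -> bool:
    """Accept 'master' or any name shaped like <your-name>/<branch-name>."""
    return branch == "master" or BRANCH_RE.match(branch) is not None
```

With this pattern, "jdoe/fix-apache-vhosts" and "master" pass, while a bare "fix-apache" is rejected.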
Ugly-hacks reusability 
● one of the policies introduced was storing one-off scripts in a separate directory in our automation repo 
● these scripts are usually Ansible playbooks used just for one particular task (e.g. Squeeze->Wheezy migration). The rest are in Python, a few of them in bash 
● version-control everything! 
● turned out to be very useful; some of the scripts proved useful enough to be rewritten as a proper role or tool
Apache2 automation 
● available on GitHub and Ansible Galaxy: 
https://galaxy.ansible.com/list#/roles/940 
https://galaxy.ansible.com/list#/roles/941 
● “base” role: 
- is reused across 8 different production roles we have ATM 
- contains basic monitoring, log rotation, package installation, etc. 
- includes PHP setup in mod_php/prefork configuration 
- PHP disabled functions control 
- basic security setup 
- does not include any site-specific stuff 
● “site” role: 
- contains all site-specific stuff and dependencies (vhosts, additional packages, etc.) 
- usually very simple 
- more than one site role possible, only one base role though 
● It is an example of how we make our roles reusable
Icinga 
● automatically sets up monitoring based on inventory and host groups 
● implements the devops approach - if a dev has root on a machine, they also have access to all monitoring stuff related to this system 
● automatic host dependencies based on host groups 
● provisioning new hosts is no longer so painful ("auto-discovery") 
● code reuse! - uses apache2_base to serve the interface 
● all services configuration is stored as YAML files, and used in templates 
● role uses DNS data from a YAML hash in order to make monitoring independent of DNS failures 
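The idea of deriving monitoring from the inventory and taking addresses from a DNS data hash can be sketched as follows. This is an illustration under assumptions, not the actual role: the INVENTORY and DNS_DATA structures, the host names, and the generic-host template name are all hypothetical, and a real setup would render this through Ansible templates rather than plain Python.

```python
# Hypothetical inventory: group name -> list of host names.
INVENTORY = {
    "apache2": ["web1.example.com", "web2.example.com"],
    "dns": ["ns1.example.com"],
}

# Hypothetical DNS data, as it might look after loading a YAML hash:
# host name -> IP address.
DNS_DATA = {
    "web1.example.com": "192.0.2.11",
    "web2.example.com": "192.0.2.12",
    "ns1.example.com": "192.0.2.53",
}


def render_hosts(inventory, dns_data):
    """Render one Icinga/Nagios-style host object per inventory host."""
    groups_by_host = {}
    for group, hosts in sorted(inventory.items()):
        for host in hosts:
            groups_by_host.setdefault(host, []).append(group)

    blocks = []
    for host in sorted(groups_by_host):
        blocks.append(
            "define host {\n"
            "    use         generic-host\n"  # assumed template name
            f"    host_name   {host}\n"
            # Literal IP from the data hash: checks do not depend on DNS.
            f"    address     {dns_data[host]}\n"
            f"    hostgroups  {','.join(groups_by_host[host])}\n"
            "}\n"
        )
    return "\n".join(blocks)
```

Adding a host to an inventory group is then enough for it to show up in the generated configuration, which is the "auto-discovery" effect described above.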
DNS migration 
● at the beginning: 
- dozens of authoritative name servers, each of them having customized configuration, running ~100 zones, all created by hand 
- changing anything required lots of work and was error-prone 
- the main reason for that was using DNS for switching between primary/secondary servers/services 
● three phases: 
- slurping configuration into Ansible 
- normalizing the configuration 
- improving the setup 
● an example of how we can interface Python and Ansible to perform more complex stuff 
● Python script which uses Ansible API to fetch normalized zone configuration from each server 
- results available in a neat hash, with per-host, per-zone keys! 
- normalization using named-checkconf tool 
● results, after parsing, are stored in one big YAML file 
● parsed/included by all modules requiring DNS data 
* suboptimal, see WIP section in next slide 
● use slurped configuration to re-generate all configs, this time using only the data available to Ansible 
● "push-button" migration, after all recipes were ready :)
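The "slurping" phase above ends with a hash keyed per host and per zone. The sketch below shows only that last step and makes several assumptions: it does not call the Ansible API at all, it takes already-fetched configuration text as input, and the SAMPLE text is hand-written in the general style of normalized BIND configuration, so real named-checkconf output may need a more careful parser. All names here are hypothetical.

```python
import re

# Matches a top-level zone statement; nested blocks are assumed to be indented,
# so only a closing "};" at the start of a line ends the zone.
ZONE_RE = re.compile(r'zone\s+"([^"]+)"[^{]*\{(.*?)\n\};', re.S)
TYPE_RE = re.compile(r"\btype\s+([\w-]+);")

# Hand-written sample, not captured from a real server.
SAMPLE = (
    'zone "example.com" {\n'
    "\ttype master;\n"
    '\tfile "/etc/bind/db.example.com";\n'
    "};\n"
    'zone "example.org" {\n'
    "\ttype slave;\n"
    "\tmasters {\n"
    "\t\t192.0.2.1;\n"
    "\t};\n"
    "};\n"
)


def zones_from_config(text):
    """Return {zone_name: {"type": ...}} for every zone statement found."""
    zones = {}
    for name, body in ZONE_RE.findall(text):
        match = TYPE_RE.search(body)
        zones[name] = {"type": match.group(1) if match else None}
    return zones


def slurp(configs_by_host):
    """Build the per-host, per-zone hash from already-fetched config text."""
    return {host: zones_from_config(text) for host, text in configs_by_host.items()}
```

The resulting dictionary can be dumped to YAML with any YAML library and then included by the roles that need DNS data.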
DNS automation 
● all zone transfers are signed, all hosts have individual keys 
● checks/markets use DNS data directly in templates 
● ACLs are tight, and auto-generated 
● changing/migrating slaves/masters is easy, NS records are auto-generated 
● updates to zones automatically bump serial, while still preserving the 
YYYYMMDDxx format 
● CRM records are auto-generated as well 
* see next slide about CRM automation 
● still WIP 
- streamline the use of DNS data directly in playbooks by creating a lookup plugin, no more "chicken and egg problem" 
- new hosts should be automatically added to DNS 
- reduce number of DNS servers ;) 
- auto-generation of reverse zones 
- opensourcing!
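Bumping a serial while preserving the YYYYMMDDxx format, as mentioned above, comes down to a small piece of arithmetic. The function below is a sketch of one possible rule, not the logic used in the role: the name bump_serial and the choice to raise an error when the two-digit counter is exhausted are assumptions.

```python
from datetime import date


def bump_serial(current: int, today: date) -> int:
    """Return the next zone serial in YYYYMMDDxx form.

    If the current serial is older than today's date, the counter restarts
    at 00 for today; otherwise the two-digit counter is incremented.
    """
    base = int(today.strftime("%Y%m%d")) * 100
    if current < base:
        return base
    nxt = current + 1
    if nxt >= base + 100:
        # Either 100 changes were already made today or the serial is
        # ahead of today's date; both would break the YYYYMMDDxx form.
        raise ValueError("serial cannot be bumped within today's YYYYMMDDxx range")
    return nxt
```

For example, a zone last changed on an earlier day gets today's date with counter 00, and a second change on the same day moves the counter to 01.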
Corosync & Pacemaker 
● we have ~130 CRM clusters 
● setting them up by hand would be "difficult" at best, impossible at worst 
● available on Ansible Galaxy: 
- https://galaxy.ansible.com/list#/roles/956 
- https://galaxy.ansible.com/list#/roles/979 
● follows pattern from apache2_base 
- “base” role suitable for manually set up clusters 
- “cluster” role provides service upon base, with a few reusable snippets and a possibility for more complex configurations 
● automatic membership based on Ansible inventory (no multicasts!) 
● the most difficult part was providing synchronous handlers which, on one hand, will not screw up the cluster and, on the other, will update the configuration 
● a few simple configurations are provided, like single service/single VIP
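Membership derived from the inventory instead of multicast can be pictured as rendering an explicit member list for unicast transport. The sketch below uses the corosync 2.x nodelist layout as an assumption; the slides do not say which corosync version or configuration layout the published roles target, and the function name and addresses are hypothetical.

```python
def render_nodelist(addresses):
    """Render a corosync 2.x style nodelist from a list of node addresses."""
    lines = ["nodelist {"]
    # Sorting keeps node ids stable between runs for the same set of hosts.
    for nodeid, addr in enumerate(sorted(addresses), start=1):
        lines += [
            "    node {",
            f"        ring0_addr: {addr}",
            f"        nodeid: {nodeid}",
            "    }",
        ]
    lines.append("}")
    return "\n".join(lines)
```

Feeding it the addresses of all hosts in one inventory group yields the same member list on every node of that cluster.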
User management automation 
● initially we had neither the time nor the resources to set up full-fledged LDAP 
● we needed: 
- users should be able to log in even during a network outage 
- removing/adding users, ssh-keys, custom settings, etc. all had to be supported 
- it had to be reusable/accessible in other roles (e.g. Icinga/monitoring) 
- different privileges for dev, production and other environments 
- UID/GID unification 
● turned out to be simpler than we thought - users are managed using a few simple tasks and group_vars data. The rest is handled via variable precedence. 
● migration/standardization required some effort though
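The role of variable precedence in the point above can be illustrated with plain dictionaries. This is a simplified model under assumptions: the variable names and users are invented, and only the default Ansible behaviour of a more specific definition replacing a whole variable is mimicked, not the full precedence order.

```python
# Hypothetical group_vars content: defaults for all hosts, then per environment.
ALL = {"login_users": ["alice", "bob"], "sudo_users": ["alice"]}
DEV = {"sudo_users": ["alice", "bob"]}  # devs get sudo on dev machines
PRODUCTION = {}                         # production keeps the defaults


def resolve(*layers):
    """Later (more specific) layers replace whole variables from earlier ones."""
    resolved = {}
    for layer in layers:
        resolved.update(layer)
    return resolved
```

Here resolve(ALL, DEV) grants sudo to both users, while resolve(ALL, PRODUCTION) leaves it with the default list, so the user tasks themselves stay identical across environments.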
Networking 
● we are leasing our servers from Hetzner, no direct Layer 2 connectivity 
● all tunnel setups are done using Ansible, a new server is automatically added to our network 
● firewalls need to be tight, set up by Ansible as well 
- OPS contribute the base firewall, DEVs can open the ports of interest for their application 
- ferm at its base 
- WIP 
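The split between an OPS-owned base firewall and DEV-opened ports could look roughly like the sketch below, which renders a small ferm ruleset. The base rules, the function name and the port list are assumptions chosen for illustration; the actual firewall role is described as work in progress and is not shown in the slides.

```python
# Assumed OPS-owned base rules for the INPUT chain, in ferm syntax.
BASE_RULES = [
    "policy DROP;",
    "mod state state (ESTABLISHED RELATED) ACCEPT;",
    "interface lo ACCEPT;",
    "proto tcp dport ssh ACCEPT;",
]


def render_ferm(dev_tcp_ports):
    """Render a ferm filter table: base rules plus ports opened by developers."""
    for port in dev_tcp_ports:
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid TCP port: {port}")
    rules = list(BASE_RULES)
    if dev_tcp_ports:
        ports = " ".join(str(p) for p in sorted(set(dev_tcp_ports)))
        rules.append(f"proto tcp dport ({ports}) ACCEPT;")
    body = "\n".join(f"        {rule}" for rule in rules)
    return f"table filter {{\n    chain INPUT {{\n{body}\n    }}\n}}\n"
```

Developers would only ever touch the port list, while the default-deny policy and the base rules stay under OPS control.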
Backups 
● based on Bareos, an open-source Bacula fork 
● new hosts are automatically set up for backup, extending storage space is no longer a problem 
● authentication using certificates, a PITA without Ansible
Scaling markets 
● we are present on 20 markets (and counting!), each of them grows at its own pace 
● providing configuration for each and every one of them would be problematic, having one-size-fits-all would be inefficient 
● "Market classes" - each size has its own set of params 
● scaling markets up/down requires just one change and a playbook run 
● done using group variables and inheritance hierarchies
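The "market classes" idea can be modelled as a mapping from each market to a size class, with all tuning parameters attached to the class. Everything in this sketch is hypothetical: the class names, parameter names, values and market identifiers are invented, and in the real setup this is expressed through Ansible group variables rather than Python.

```python
# Hypothetical size classes with their parameter sets.
MARKET_CLASSES = {
    "small": {"php_workers": 8, "cache_mb": 256},
    "medium": {"php_workers": 32, "cache_mb": 1024},
    "large": {"php_workers": 128, "cache_mb": 4096},
}

# Hypothetical assignment of markets to classes.
MARKETS = {"market_a": "large", "market_b": "medium", "market_c": "small"}


def market_params(market, overrides=None):
    """Return the class parameters for a market, plus optional overrides."""
    params = dict(MARKET_CLASSES[MARKETS[market]])
    params.update(overrides or {})
    return params
```

Scaling a market up then means changing a single entry in MARKETS (for example moving market_c from "small" to "medium") and re-running the playbook.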
Not everything is perfect 
● Jinja2 template error messages are "difficult" to say the least. There is no information about where the problem is, nor, for example, a description of its nature. Often we have to bisect 
● templates sometimes grow to huge complexity 
● Jinja2 is designed for speed, but with tradeoffs. We *really* miss some 
Python operators, and creating custom plugins/filters poses some problems 
● the same goes with Ansible roles - with more complexity, sometimes it would be 
really handy to have Python around. Corosync handlers are a good example 
● multi-inheritance, problems with 2-headed trees 
- bit ugly, but we had no other choice 
● speed, improved with "pipelining=True" 
● some useful functionality requires paid subscription (Ansible Tower) 
- RESTful API, useful if you want to push a new application version to production via e.g. Jenkins 
- schedules - currently we need to push the changes ourselves
Dev, DevOps, Ops 
● developers by default have RO access to repo, RW on case-by-case basis 
● changes to systems owned by developers are done by developers, 
OPS only provide the platform and tools 
● all non-trivial changes require a Pull Request and a review from Ops 
● in order to limit access to the most sensitive data (e.g. passwords, certificate keys, etc.), we encrypt mission-critical data with Ansible Vault 
and push it directly to the repo 
- *strong* encryption 
- available to Ansible without the need for decryption 
(password still required though) 
- all security-sensitive stuff can be skipped by developers with the "--skip-tags" option to ansible-playbook
Opensource! Opensource! Opensource! 
● some of the things we mentioned can be found on our GitHub account 
● we are working on opensourcing more stuff 
https://github.com/brainly
Conclusions 
● time needed to deploy new markets dropped considerably 
● increased productivity 
● better cooperation with developers 
● more workpower, Devs are no longer blocked so much, we can push 
tasks to them 
● infrastructure as code 
● versioning 
● code-reuse, less copy-pasting
We are hiring! 
http://brainly.co/jobs/
Questions?
Thank you!

Automation@Brainly - Polish Linux Autumn 2014

AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
 
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
 
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
 
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Sukhdev Vihar Delhi 💯Call Us 🔝8264348440🔝
 
valsad Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call Girls...
valsad Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call Girls...valsad Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call Girls...
valsad Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call Girls...
 
Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...
Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...
Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...
 
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Top Rated Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
Top Rated  Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...Top Rated  Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
Top Rated Pune Call Girls Daund ⟟ 6297143586 ⟟ Call Me For Genuine Sex Servi...
 
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
 
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
 
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
 
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine ServiceHot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
 
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Ashram Chowk Delhi 💯Call Us 🔝8264348440🔝
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
 
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
 
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
 

Automation@Brainly - Polish Linux Autumn 2014

  • 1. Automation at Brainly … or how to enter the world of automation in a “different way”.
  • 2. World’s largest homework help social network, connecting over 25 million users monthly OPS stack: About Brainly ● ~80 servers, heavy usage of LXC containers (~1000) ● 99.9% Debian, 1 Ubuntu host :) ● Nginx / Apache2, 2k reqs per sec ● 147 million page views monthly ● 700Mbps peak traffic ● Python is dominant DEV stack: ● PHP - Symfony 2 - SOA projects - 200 reqs per sec on Russian version ● Erlang - 45k concurrent users - 22k events per sec ● Native Apps - iOS - Android
  • 3. Starting point ● Puppet was not feasible for us - *lots* of dependencies which make containers bigger/heavier - problems with Puppet's declarative language, order of parsing is important - seemed incoherent, lacking integration of orchestration - steep learning curve - YMMV - we're a Python shop ;) ● "packaging as automation" as an intermediate solution - dependency hell, installing one package could result in uninstalling others - inflexible, lots of code duplication in debian/rules file - LOTS of custom bash and PHP scripts, usually very hard to reuse and not standardized - this was a dead end :( ● Ansible - initially used only for orchestration - maintaining it required keeping an up-to-date inventory, which later simplified and helped with lots of things
  • 4. First steps with Ansible ● we decided to move forward with Ansible and use it for setting up machines as well ● first project was nagios monitoring plugins setup (check-growth, available on github BTW) ● turned out to be ideal for containers and our needs in general - very few dependencies to begin with (python2, python-apt), and small footprint - "configured" Python modules are transferred directly to the machine, no need for local repositories - very light, no compilation on the destination host is needed - easy to understand. Tasks/playbooks map directly to actions an ops/devops would have done by hand - compatible with "automation by packages". We were able to migrate from the old system in small steps.
  • 5. Avoiding regressions ● we decided to learn from our mistakes and do it right this time ● all policies, rules, and good practices are written down in the automation repo's main directory ● helps with introducing new people into the team or with the devops approach - newbies are able to start committing to the repo quickly - what's in GUIDELINES.md is law, and changing it requires wider consensus - gives examples of how to deal with certain problems in a standardized way ● a few examples: - limit the number of tags, each of them should be self-contained with no cross-dependencies. - each branch should be named <your-name>/<branch-name>, with master being in production - do not include roles/tasks inside other roles, this creates hard-to-follow dependencies - NEVER subset the list of hosts inside the role, do it in site.yml. Otherwise debugging roles/hosts will become difficult - think twice before adding a new role and esp. groups. As infrastructure grows, it becomes hard to manage and/or creates "dead code/roles" - create reusable roles by moving common code into *_base roles (e.g. apache2_base).
  • 6. Ugly-hacks reusability ● one of the policies introduced was storing one-off scripts in a separate directory in our automation repo. ● these scripts are usually Ansible playbooks used just for one particular task (e.g. Squeeze->Wheezy migration). The rest are in Python, a few of them in bash. ● version-control everything! ● turned out to be very useful, some of them were useful enough to be rewritten into a proper role or a tool
  • 7.
  • 8. Apache2 automation ● available on GitHub and Ansible Galaxy: https://galaxy.ansible.com/list#/roles/940 https://galaxy.ansible.com/list#/roles/941 ● "base" role: - is reused across 8 different production roles we have ATM - contains basic monitoring, log rotation, package installation, etc… - includes PHP setup in modphp/prefork configuration - PHP disabled functions control - basic security setup - does not include any site-specific stuff ● "site" role: - contains all site-specific stuff and dependencies (vhosts, additional packages, etc...) - usually very simple - more than one site role possible, only one base role though ● It is an example of how we make our roles reusable
  • 9. Icinga ● automatically sets up monitoring based on inventory and host groups ● implements the devops approach - if a dev has root on a machine, he also has access to all monitoring stuff related to this system ● automatic host dependencies based on host groups ● provisioning new hosts is no longer so painful ("auto-discovery") ● code reuse! - uses apache2_base to serve the interface ● all service configuration is stored as YAML files and used in templates ● the role uses DNS data from a YAML hash in order to make monitoring independent of DNS failures
  • 10. DNS migration ● at the beginning: - dozens of authoritative name servers, each of them having customized configuration, running ~100 zones, all created by hand - changing anything required a lot of work and was error prone - the main reason for that was using DNS for switching between primary/secondary servers/services ● three phases: - slurping configuration into Ansible - normalizing the configuration - improving the setup ● an example of how we can interface Python and Ansible to perform more complex stuff ● Python script which uses the Ansible API to fetch normalized zone configuration from each server - results available in a neat hash, with per-host, per-zone keys! - normalization using the named-checkconf tool ● results, after parsing, are stored in one big YAML file ● parsed/included by all modules requiring DNS data * suboptimal, see WIP section in next slide ● use the slurped configuration to re-generate all configs, this time using only the data available to Ansible ● "push-button" migration, after all recipes were ready :)
  • 11. DNS automation ● all zone transfers are signed, all hosts have individual keys ● checks/markets use DNS data directly in templates ● ACLs are tight and auto-generated ● changing/migrating slaves/masters is easy, NS records are auto-generated ● updates to zones automatically bump the serial, while still preserving the YYYYMMDDxx format ● CRM records are auto-generated as well * see next slide about CRM automation ● still WIP - streamline the use of DNS data directly in playbooks by creating a lookup plugin, no more "chicken and egg problem" - new hosts should be automatically added to DNS - reduce the number of DNS servers ;) - auto-generation of reverse zones - opensourcing!
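
The serial bump mentioned on the DNS automation slide can be sketched in a few lines of Python. This is only an illustration of the YYYYMMDDxx convention, not the code from the talk; the function name `bump_serial` and the choice to start the daily counter at 00 are assumptions.

```python
from datetime import date


def bump_serial(current: int, today: date) -> int:
    """Return the next zone serial in YYYYMMDDxx form (hypothetical helper)."""
    # Serial for the first change made today: date followed by counter 00.
    base = int(today.strftime("%Y%m%d")) * 100
    if current < base:
        # The zone was last changed on an earlier day: restart the counter.
        return base
    # The zone was already changed today (or the serial is ahead of the
    # calendar): increment so the serial stays strictly increasing.
    # Note: after counter 99 this rolls into the next day's number range.
    return current + 1


if __name__ == "__main__":
    print(bump_serial(2014101203, date(2014, 10, 13)))
```

The serial must never decrease, so the sketch increments whenever the stored value is already at or past today's base instead of resetting it.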
  • 12. Corosync & Pacemaker ● we have ~130 CRM clusters ● setting them up by hand would be "difficult" at best, impossible at worst ● available on Ansible Galaxy: - https://galaxy.ansible.com/list#/roles/956 - https://galaxy.ansible.com/list#/roles/979 ● follows the pattern from apache2_base - "base" role suitable for manually set up clusters - "cluster" role provides a service on top of base, with a few reusable snippets and a possibility for more complex configurations ● automatic membership based on Ansible inventory (no multicasts!) ● the most difficult part was providing synchronous handlers which will not screw up the cluster on the one hand and will update the configuration on the other ● a few simple configurations are provided, like single service-single VIP
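
Inventory-driven cluster membership without multicast can be pictured as rendering an explicit member list from the hosts in a group. The sketch below assumes the corosync 2.x `nodelist` syntax used with unicast transport; the talk does not say which corosync version or template the roles use, and `render_nodelist` and the addresses are hypothetical.

```python
def render_nodelist(addresses):
    """Render a corosync-style nodelist block from inventory addresses.

    Assumption: corosync 2.x syntax (nodelist / node / ring0_addr / nodeid).
    """
    lines = ["nodelist {"]
    # Sort so every cluster member renders an identical file.
    for nodeid, addr in enumerate(sorted(addresses), start=1):
        lines += [
            "    node {",
            f"        ring0_addr: {addr}",
            f"        nodeid: {nodeid}",
            "    }",
        ]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical addresses standing in for one inventory group.
    print(render_nodelist(["10.0.0.2", "10.0.0.1"]))
```

Deriving node IDs from the sorted position keeps the output deterministic, but adding a host can renumber existing nodes; a real role would more likely pin IDs in inventory variables.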
  • 13. User management automation ● initially we had neither the time nor the resources to set up a full-fledged LDAP ● we needed: - users should be able to log in even during a network outage - removing/adding users, ssh-keys, custom settings, etc. all had to be supported - it had to be reusable/accessible in other roles (e.g. Icinga/monitoring) - different privileges for dev, production and other environments - UID/GID unification ● turned out to be simpler than we thought - users are managed using a few simple tasks and group_vars data. The rest is handled via variable precedence. ● migration/standardization required some effort though
  • 14. Networking ● we are leasing our servers from Hetzner, no direct Layer 2 connectivity ● all tunnel setups are done using Ansible, a new server is automatically added to our network ● firewalls need to be tight, set up by Ansible as well - OPS contribute the base firewall, DEVs can open the ports of interest for their application - ferm at its base - WIP
  • 15. Backups ● based on Bareos, opensource Bacula fork ● new hosts are automatically set up for backup, extending storage space is no longer a problem ● authentication using certificates, PITA without ansible
  • 16. Scaling markets ● we are present on 20 markets (and counting!), each of them grows at its own pace ● providing configuration for each and every one of them would be problematic, having one-size-fits-all would be inefficient ● "Market classes" - each size has its own set of params ● scaling markets up/down requires just one change and a playbook run ● done using group variables and inheritance hierarchies
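
The "market classes" idea, where each size class carries its own parameter set and a market is scaled by changing its class, can be modelled as layered dictionaries. This is a plain-Python analogy for group variables and inheritance, not Ansible code; all parameter names, class names, market names and values below are made up for illustration.

```python
# Hypothetical defaults shared by every market (like an "all" group).
DEFAULTS = {"php_workers": 8, "cache_mb": 256, "db_replicas": 1}

# Hypothetical per-class overrides (like per-class group variables).
MARKET_CLASSES = {
    "small": {},
    "medium": {"php_workers": 32, "cache_mb": 1024},
    "large": {"php_workers": 128, "cache_mb": 4096, "db_replicas": 3},
}

# Hypothetical market -> class assignment; scaling a market up or down
# means changing exactly one value here.
MARKETS = {"market_a": "large", "market_b": "small"}


def market_params(market, overrides=None):
    """Resolve parameters: defaults < market class < per-market overrides."""
    params = dict(DEFAULTS)
    params.update(MARKET_CLASSES[MARKETS[market]])
    params.update(overrides or {})
    return params


if __name__ == "__main__":
    print(market_params("market_a"))
    print(market_params("market_b", {"cache_mb": 512}))
```

Later layers win over earlier ones, which mirrors how more specific groups override more general ones in an inheritance hierarchy.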
  • 17. Not everything is perfect ● Jinja2 template error messages are "difficult" to say the least. There is no information about where the problem is or what its nature is. Often we have to bisect ● templates sometimes grow to huge complexity ● Jinja2 is designed for speed, but with tradeoffs. We *really* miss some Python operators, and creating custom plugins/filters poses some problems ● the same goes for Ansible roles - with more complexity, sometimes it would be really handy to have Python around. Corosync handlers are a good example ● multi-inheritance, problems with 2-headed trees - a bit ugly, but we had no other choice ● speed, improved with "pipelining=True" ● some useful functionality requires a paid subscription (Ansible Tower) - RESTful API, useful if you want to push a new application version to production via e.g. Jenkins - schedules - currently we need to push the changes ourselves
  • 18. Dev, DevOps, Ops ● developers by default have RO access to the repo, RW on a case-by-case basis ● changes to systems owned by developers are done by developers, OPS only provide the platform and tools ● all non-trivial changes require a Pull Request and a review from Ops ● in order to limit access to the most sensitive data (e.g. passwords, certificate keys, etc.), we encrypt mission-critical data with Ansible Vault and push it directly to the repo - *strong* encryption - available to Ansible without the need for manual decryption (password still required though) - all security-sensitive stuff can be skipped by developers with the "--skip-tags" option to ansible-playbook
  • 19.
  • 20. Opensource! Opensource! Opensource! ● some of the things we mentioned can be found on our GitHub account ● we are working on opensourcing more stuff https://github.com/brainly
  • 21. Conclusions ● time needed to deploy new markets dropped considerably ● increased productivity ● better cooperation with developers ● more manpower: Devs are no longer blocked so much, we can push tasks to them ● infrastructure as code ● versioning ● code reuse, less copy-pasting
  • 22. We are hiring! http://brainly.co/jobs/