Test driven infrastructure development

Tomas (t0m) Doran
<tomas.doran@timgroup.com>
@bobtfish
https://github.com/bobtfish
https://github.com/youdevise
Thursday, 14 March 13

‘Real men’ develop in production!
• Edit / Commit / Push
• Update puppetmaster
• puppet agent -t
• Repeat

Repeat again and again. Development cycle SLOOOW.

This is insane!
• Try it on an 8 person team.
• ‘LOL - I broke puppet’

CHAOS and FAIL result when you break each other. Or, MORE likely (this happens twice a
day!)

OI!!!
OI t0m!!!!
You broke puppet!
AARRRGGH!!!

Let’s fix this!
• First, a glossary:
  • mco - mcollective
  • ENC - External node classifier

We can do better
• Branch == environment
• Branch / Commit / Push
• mco puppetupdate
• puppet agent -t --environment xxx

This at least lets you develop things independently. Everyone can do dev in their own branch
and merge once they have something that doesn’t break _everything_. You can also rebase -i
(squash) all the ARGH PUPPET SYNTAX commits.
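
A minimal sketch of the ‘branch == environment’ idea, assuming each git branch is checked out as a Puppet environment named after it. The helper names (branch_to_environment, agent_command) and the sanitising rule are assumptions for illustration, not how puppetupdate itself is implemented; only the puppet agent -t --environment invocation comes from the slide.

```python
import re
from typing import List


def branch_to_environment(branch: str) -> str:
    """Map a git branch name to an environment name (assumed rule)."""
    # Assumption: keep only lowercase letters, digits and underscores,
    # replacing everything else (slashes, dashes, dots) with "_".
    return re.sub(r"[^a-z0-9_]", "_", branch.lower())


def agent_command(branch: str) -> List[str]:
    """Build the one-off agent run from the slide for a given branch."""
    return ["puppet", "agent", "-t", "--environment", branch_to_environment(branch)]


# Example: " ".join(agent_command("t0m/fix-firewall"))
# gives "puppet agent -t --environment t0m_fix_firewall"
```

Sanitising the name in one place means a branch and its environment can never disagree about what they are called.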

Sounds good?
• Then you’ll be wanting:
  • https://github.com/youdevise/puppetupdate

It’s a bit basic, but then I ripped it out of work internal code at 8am ;)

So we fixed it?

Refactoring
• We change things to be consistent across the codebase:
  • Why did puppet just delete all the firewall rules on the production database?
• We don’t refactor:
  • Add bugs all the time due to inconsistency

Sorry Chris, but when you say ‘refactoring’ - it’s not refactoring unless you have tests.
The problem is that you can’t always remember to run the right branch on all the right nodes.
Or rather, how do you even know what all the right nodes are? And if you’re hacking on
custom functions, or anything using exported resources - WOE

Unfortunate reality:
• Hard coded IPs in 10 places
• role::oy_lb
• hiera data split by domain (colo)
• mco puppet
• 4 weeks per app per environment

So, despite our best efforts, our puppet code was SHIIIIT.
Exported resources ARE NOT a good fit for non-trivial things (like generating load balancer
configs). Ergo lots of hard coded IPs in multiple places. Ergo puppet code per site.

The state of the art
• It’s certainly in a state
• Automatic runs dangerous
• cron --noop runs
• puppet becomes an auditing system
• This isn’t what I signed up for!

Nobody does automatic runs
Puppet becomes an auditing tool (automatic noop runs + reports)
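
A rough sketch of what a ‘cron --noop’ setup could look like, assuming the agent is started from /etc/cron.d rather than as a daemon. The function name, the 30 minute default and the per-host offset are assumptions added for illustration; the agent flags (--onetime, --no-daemonize, --noop) are standard puppet agent options.

```python
import hashlib


def noop_cron_line(hostname: str, interval_minutes: int = 30) -> str:
    """Return a cron.d entry that runs the agent in --noop mode on a schedule."""
    if not 0 < interval_minutes <= 60 or 60 % interval_minutes != 0:
        raise ValueError("interval_minutes must divide 60")
    # Assumption: spread hosts across the interval with a stable per-host
    # offset, so they do not all contact the puppetmaster in the same minute.
    digest = hashlib.sha1(hostname.encode("utf-8")).hexdigest()
    offset = int(digest, 16) % interval_minutes
    minutes = ",".join(str(m) for m in range(offset, 60, interval_minutes))
    command = "puppet agent --onetime --no-daemonize --noop"
    return f"{minutes} * * * * root {command}"
```

With --noop the agent only reports what it would change, which is exactly why puppet ends up as an auditing tool in this setup.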

Business says no!
• Launching new products has a long lead time
  • This is unhelpful if your company is trying to branch out into new markets
• CI / stage environments unlike prod
  • Issues when new functionality goes live
  • Developers think you’re incompetent

What is wrong with this picture?
• Did you run it everywhere?
• Does it affect anything you’re not expecting?
• Can you rebuild cleanly?
• Does the code even make things reflect current state?

You just don’t know the answer to any of these questions in any reliable way...
But, generally, the answers are NO, YES, NO, NO

‘We use puppet’
• Means nothing
• State of your system is the sum of all changes
• How do you know your code can rebuild things?

Hint - you don’t!

It’s all mierda
• Development communities are 10 years ahead
• We don’t integration test
  • (repeatably)
• We can’t build / rebuild
  • (reliably)

We need to grow up, and raise the level of the conversation.

Infra is hard
• Infrastructure is inherently more complex
• Less control
• More moving parts
• ‘End to end’ testing
• Persistent data

Sure - it’s much much harder to get a standalone testable system in infra than it is in
development.

No excuses: Scientific method

I do not consider this an excuse to abandon sanity.

The solution?
• Re-provision everything in tests
  • N.B. Not perfect (but better!)
• Proper software engineering
  • Unit and integration tests
  • Build pipeline + promotion

Openstack
• Our tests spinning up 12 machines => VMs
• Openstack is going to be awesome, but right now:
  • Networking sucks
  • Load balancing is a shambles
  • lvs / vlans / metal / bonding - nope

So, we should use openstack, right? As of December, when we looked - 2 networks max,
inflexible. lvs not possible.

My desires:
• Reuse as much code as possible! (e.g. load balancers)
• No per colo/environment puppet code
• No IPs anywhere
• ‘DRY’
• CI pipeline to promote to production
• 1 puppet run from provisioned to working
• Repeatable and testable!

Orc
• Continuous (zero downtime) deployment
• Development / infrastructure application contract
• Model driven
• https://github.com/youdevise/orc/

Puppetroll
• Rolls out a consistent sha1 from the puppetmaster to an entire environment
• Fails if any puppet run fails
• https://github.com/youdevise/puppetroll
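
The two rules above (one sha1 everywhere, fail if any run fails) can be sketched as follows. This is not puppetroll’s actual code: roll, RolloutFailed and the run_puppet callable are hypothetical names, and how a run is really triggered on a node is left to the caller.

```python
from typing import Callable, Dict, Iterable, List


class RolloutFailed(Exception):
    """Raised when at least one node fails its puppet run."""


def roll(sha1: str, nodes: Iterable[str],
         run_puppet: Callable[[str, str], bool]) -> List[str]:
    """Run puppet at one fixed revision on every node; fail if any run fails.

    run_puppet(node, sha1) is a hypothetical callable returning True on success.
    """
    # Every node is run against the same sha1, so the environment is consistent.
    results: Dict[str, bool] = {node: run_puppet(node, sha1) for node in nodes}
    failed = sorted(node for node, ok in results.items() if not ok)
    if failed:
        raise RolloutFailed(f"{sha1[:8]} failed on: {', '.join(failed)}")
    return sorted(results)
```

Collecting every result before raising means one failing node is reported alongside the others instead of hiding them.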

Provisioning tools
• debootstrap custom gold images
• mcollective ‘computenode’ agent for kvm
  • ‘provision me a machine called X, on networks Y and Z’
  • Dynamic IP allocation (dnsmasq locally, DDNS for real)

stacks
• Model driven deployment
• DSL for describing groups of systems + dependencies
• rake tasks to provision / test / clean up stack + deps
• Can provision a full environment, run E2E tests, tear it down - in CI.
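
To illustrate ‘groups of systems + dependencies’, here is a small sketch of a dependency model and the provisioning order it implies. The real stacks tool is a Ruby DSL driven by rake; the model below, its stack names and the provision_order helper are all hypothetical, and the sketch relies on Python 3.9+ for graphlib.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List, Set


def provision_order(stacks: Dict[str, Set[str]]) -> List[str]:
    """Return stack names ordered so each stack comes after its dependencies."""
    # TopologicalSorter takes {node: predecessors}, which matches
    # {stack: stacks it depends on}; it raises CycleError on circular deps.
    return list(TopologicalSorter(stacks).static_order())


# Hypothetical model: the names are illustrative, not taken from the talk.
MODEL: Dict[str, Set[str]] = {
    "db": set(),
    "app": {"db"},
    "loadbalancer": {"app"},
}
```

Provisioning walks this order forwards; cleaning up a stack and its deps can walk the same list in reverse.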

I want to hack on load balancers
= 4 new, independent machines

How it works?
• DSL creates model of systems
• rake task ‘launch’:
  • mco provisions boxes on compute nodes
  • each box runs puppet --waitforcert
  • mco signs cert
  • puppet runs for each box
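
The launch steps above can be sketched as a small driver. It assumes each step is finished for all machines before the next one starts, which the slide does not state, and the step names and callables are hypothetical stand-ins for the real mco and puppet invocations.

```python
from typing import Callable, Dict, Iterable, List

# Step order taken from the slide; the key names themselves are made up.
LAUNCH_STEPS = ["provision", "wait_for_cert", "sign_cert", "run_puppet"]


def launch(machines: Iterable[str],
           steps: Dict[str, Callable[[str], None]]) -> List[str]:
    """Drive every machine through the launch steps, one step at a time."""
    names = list(machines)
    log: List[str] = []
    for step in LAUNCH_STEPS:
        for name in names:
            steps[step](name)  # hypothetical callable; expected to raise on failure
            log.append(f"{step}:{name}")
    return log
```

Passing the steps in as callables keeps the ordering testable without touching real compute nodes.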

mco computenode

Puppetmaster
• Uses the same model
• Generates an ENC for each node
• Puppet code:
  • Just installs things / starts services
  • i.e. what it’s good at!

External node classifier
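
For readers who have not written one: an ENC is an executable that the puppetmaster calls with a node’s name and that prints YAML describing that node’s classes, parameters and environment. The sketch below renders such output from a tiny in-memory model. The model contents, class name and parameter are hypothetical, not the talk’s real data, and the hand-rolled YAML assumes names and values contain no quotes or backslashes.

```python
from typing import Dict

# Hypothetical model: node name -> {class name -> {parameter -> value}}.
MODEL: Dict[str, Dict[str, Dict[str, str]]] = {
    "lb1.example.com": {"role::loadbalancer": {"backend_service": "app"}},
}


def classify(node: str, environment: str = "production") -> str:
    """Render ENC output (YAML with environment and classes) for one node."""
    classes = MODEL.get(node, {})
    lines = ["---", f"environment: {environment}"]
    if not classes:
        # Unknown node: emit an explicit empty hash rather than a null value.
        lines.append("classes: {}")
    else:
        lines.append("classes:")
        for cls, params in sorted(classes.items()):
            lines.append(f'  "{cls}":')
            for key, value in sorted(params.items()):
                lines.append(f'    {key}: "{value}"')
    return "\n".join(lines) + "\n"
```

Because the classification comes from the model, the Puppet code needs no per-node or per-site data such as IP addresses.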

Putting it together
• Still ongoing - live production apps ETA two weeks.
• Still haven’t solved re-provisioning problem for live environments!
• Do have repeatable and testable / tested infrastructure building in CI!

So, what do we have? Well - everything I showed you already...
Building proxy server layer (by refactoring puppet code) right now. Databases to follow!

The top table is our test overview - we have two types of tests, those which are for a specific
machine (i.e. a VM) and those which are for a virtual service (backed by multiple machines).
‘behaves like’ is an rspec thing we haven’t overridden.
For each machine, we test that it’s pingable, then run every nrpe (nagios) agent and check
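
A sketch of a per-machine test of this shape, written in Python rather than the rspec the team uses. The notes are cut off before saying what is checked for each nrpe result, so this assumes every check must return OK under the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function and its two callables are hypothetical.

```python
from typing import Callable, Dict, List

NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def machine_failures(host: str,
                     is_pingable: Callable[[str], bool],
                     run_checks: Callable[[str], Dict[str, int]]) -> List[str]:
    """Return human-readable failures for one machine; an empty list means healthy."""
    if not is_pingable(host):
        # No point running service checks against a box we cannot reach.
        return [f"{host}: not pingable"]
    failures: List[str] = []
    # run_checks returns {check name: plugin exit code} (assumed shape).
    for check, code in sorted(run_checks(host).items()):
        if code != 0:
            state = NAGIOS_STATES.get(code, "UNKNOWN")
            failures.append(f"{host}: {check} is {state}")
    return failures
```

Reusing the monitoring checks as test assertions means a freshly provisioned machine is judged by the same criteria as a production one.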

In the (near) future?
• Live application stack in production
• Automated ‘promotion’ of good changes to production
• Integrated environment support for dev stacks on dev branches/environments
• Open source all the things!

Thanks!
• puppet is an awesome tool.
  • It doesn’t solve higher level system modeling problems
  • It shouldn’t try to!
• sysadmins need to level up
  • It’s not done till you can test it still works

Photo Credits
• Escher's "Relativity" in LEGO - Andrew Lipson (http://www.andrewlipson.com)
• Manure - Flickr - chesbayprogram
• Provisions - Flickr - quinn.anya
• Stacked - Flickr - andrewrennie
• Dilbert - Flickr - osde-info
• Stacking wood - Flickr - arthuserea
• Square wheels - Flickr - vrogy
• Puppets - Flickr - SkipSteuart
• Light bulb - Flickr - bazik
• This-is-not-art - Wikimedia commons - Loran Davis
• Danger of death - Flickr - zigazou76
• Bob the Builder - Flickr - jamesclay
• Swiss roll - Flickr - add1sun
• Orc - Flickr - photo_munki
• Danger! Danger - Flickr - donsolo
• Cow of the future - Flickr - thewamphyri
• SCIENCE - Flickr - chasblackman

Links!
• http://github.com/youdevise
• http://github.com/bobtfish
• https://devblog.timgroup.com/
• (Yes, we are hiring)

iDRUG - intelligent drug discovery
 
Presentation1
Presentation1Presentation1
Presentation1
 
Impacto de las tics en las practicas educativas pedi00190 tabla1 3
Impacto de las tics en las practicas educativas pedi00190 tabla1 3Impacto de las tics en las practicas educativas pedi00190 tabla1 3
Impacto de las tics en las practicas educativas pedi00190 tabla1 3
 
CISOs are from Mars, CIOs are from Venus
CISOs are from Mars, CIOs are from VenusCISOs are from Mars, CIOs are from Venus
CISOs are from Mars, CIOs are from Venus
 
orchid island 蘭嶼
orchid island 蘭嶼orchid island 蘭嶼
orchid island 蘭嶼
 
8 ways to cheer up during a hard day at work
8 ways to cheer up during a hard day at work8 ways to cheer up during a hard day at work
8 ways to cheer up during a hard day at work
 
6th lesson
6th lesson6th lesson
6th lesson
 
MySQL
MySQLMySQL
MySQL
 
La capa de ozono
La capa de ozonoLa capa de ozono
La capa de ozono
 
Creating Value in Health through Big Data
Creating Value in Health through Big DataCreating Value in Health through Big Data
Creating Value in Health through Big Data
 
Helsinki book launch jenn lim delivering happiness_45_16.9
Helsinki book launch jenn lim delivering happiness_45_16.9Helsinki book launch jenn lim delivering happiness_45_16.9
Helsinki book launch jenn lim delivering happiness_45_16.9
 
Resume milind patil
Resume milind patilResume milind patil
Resume milind patil
 
Atlassian User Group, AUG Wiesbaden, 25 October 2012
Atlassian User Group, AUG Wiesbaden, 25 October 2012Atlassian User Group, AUG Wiesbaden, 25 October 2012
Atlassian User Group, AUG Wiesbaden, 25 October 2012
 
Parallelizing a Real-Time Steering Simulation for Computer Games with OpenMP ...
Parallelizing a Real-Time Steering Simulation for Computer Games with OpenMP ...Parallelizing a Real-Time Steering Simulation for Computer Games with OpenMP ...
Parallelizing a Real-Time Steering Simulation for Computer Games with OpenMP ...
 
BGP Loop Prevention
BGP Loop Prevention BGP Loop Prevention
BGP Loop Prevention
 

Similaire à Test driven infrastructure development

Structuring apps in Scala
Structuring apps in ScalaStructuring apps in Scala
Structuring apps in ScalaPhil Calçado
 
Writing testable code
Writing testable codeWriting testable code
Writing testable codeAlvaro Videla
 
Node.js Patterns and Opinions
Node.js Patterns and OpinionsNode.js Patterns and Opinions
Node.js Patterns and OpinionsIsaacSchlueter
 
Native Javascript apps with PhoneGap
Native Javascript apps with PhoneGapNative Javascript apps with PhoneGap
Native Javascript apps with PhoneGapMartin de Keijzer
 
Puppet @ Nedap
Puppet @ NedapPuppet @ Nedap
Puppet @ NedapPuppet
 
Its all about the Feedback loop
Its all about the Feedback loopIts all about the Feedback loop
Its all about the Feedback loopElad Sofer
 
Reggefiber glasvezel presentatie
Reggefiber glasvezel presentatieReggefiber glasvezel presentatie
Reggefiber glasvezel presentatieVincent Everts
 
Speed geek presentation
Speed geek presentationSpeed geek presentation
Speed geek presentationAaron Maurer
 
BlogWest: Blog Content Strategies
BlogWest: Blog Content StrategiesBlogWest: Blog Content Strategies
BlogWest: Blog Content StrategiesIdris Fashan
 
Lessons I Learned While Scaling to 5000 Puppet Agents
Lessons I Learned While Scaling to 5000 Puppet AgentsLessons I Learned While Scaling to 5000 Puppet Agents
Lessons I Learned While Scaling to 5000 Puppet AgentsPuppet
 
TDD with LEGO at SDEC13
TDD with LEGO at SDEC13TDD with LEGO at SDEC13
TDD with LEGO at SDEC13BillyGarnet
 
The Changing Nature of Brand Experience in Higher Education
The Changing Nature of Brand Experience in Higher EducationThe Changing Nature of Brand Experience in Higher Education
The Changing Nature of Brand Experience in Higher EducationJoel Pattison
 
Five gems
Five gemsFive gems
Five gemsjeremyw
 
When your code is nearly old enough to vote
When your code is nearly old enough to voteWhen your code is nearly old enough to vote
When your code is nearly old enough to votedreamwidth
 

Similaire à Test driven infrastructure development (20)

Structuring apps in Scala
Structuring apps in ScalaStructuring apps in Scala
Structuring apps in Scala
 
Writing testable code
Writing testable codeWriting testable code
Writing testable code
 
Writing the docs
Writing the docsWriting the docs
Writing the docs
 
Future layouts
Future layoutsFuture layouts
Future layouts
 
Node.js Patterns and Opinions
Node.js Patterns and OpinionsNode.js Patterns and Opinions
Node.js Patterns and Opinions
 
Native Javascript apps with PhoneGap
Native Javascript apps with PhoneGapNative Javascript apps with PhoneGap
Native Javascript apps with PhoneGap
 
Puppet @ Nedap
Puppet @ NedapPuppet @ Nedap
Puppet @ Nedap
 
Its all about the Feedback loop
Its all about the Feedback loopIts all about the Feedback loop
Its all about the Feedback loop
 
Reggefiber glasvezel presentatie
Reggefiber glasvezel presentatieReggefiber glasvezel presentatie
Reggefiber glasvezel presentatie
 
Speed geek presentation
Speed geek presentationSpeed geek presentation
Speed geek presentation
 
BlogWest: Blog Content Strategies
BlogWest: Blog Content StrategiesBlogWest: Blog Content Strategies
BlogWest: Blog Content Strategies
 
Lessons I Learned While Scaling to 5000 Puppet Agents
Lessons I Learned While Scaling to 5000 Puppet AgentsLessons I Learned While Scaling to 5000 Puppet Agents
Lessons I Learned While Scaling to 5000 Puppet Agents
 
Fed2013_Managing Workplace Productivity
Fed2013_Managing Workplace ProductivityFed2013_Managing Workplace Productivity
Fed2013_Managing Workplace Productivity
 
TDD with LEGO at SDEC13
TDD with LEGO at SDEC13TDD with LEGO at SDEC13
TDD with LEGO at SDEC13
 
The Changing Nature of Brand Experience in Higher Education
The Changing Nature of Brand Experience in Higher EducationThe Changing Nature of Brand Experience in Higher Education
The Changing Nature of Brand Experience in Higher Education
 
Drupal without coding
Drupal without codingDrupal without coding
Drupal without coding
 
Five gems
Five gemsFive gems
Five gems
 
When Tdd Goes Awry
When Tdd Goes AwryWhen Tdd Goes Awry
When Tdd Goes Awry
 
The Testable Web
The Testable WebThe Testable Web
The Testable Web
 
When your code is nearly old enough to vote
When your code is nearly old enough to voteWhen your code is nearly old enough to vote
When your code is nearly old enough to vote
 

Plus de Tomas Doran

Empowering developers to deploy their own data stores
Empowering developers to deploy their own data storesEmpowering developers to deploy their own data stores
Empowering developers to deploy their own data storesTomas Doran
 
Dockersh and a brief intro to the docker internals
Dockersh and a brief intro to the docker internalsDockersh and a brief intro to the docker internals
Dockersh and a brief intro to the docker internalsTomas Doran
 
Sensu and Sensibility - Puppetconf 2014
Sensu and Sensibility - Puppetconf 2014Sensu and Sensibility - Puppetconf 2014
Sensu and Sensibility - Puppetconf 2014Tomas Doran
 
Steamlining your puppet development workflow
Steamlining your puppet development workflowSteamlining your puppet development workflow
Steamlining your puppet development workflowTomas Doran
 
Building a smarter application stack - service discovery and wiring for Docker
Building a smarter application stack - service discovery and wiring for DockerBuilding a smarter application stack - service discovery and wiring for Docker
Building a smarter application stack - service discovery and wiring for DockerTomas Doran
 
Chasing AMI - Building Amazon machine images with Puppet, Packer and Jenkins
Chasing AMI - Building Amazon machine images with Puppet, Packer and JenkinsChasing AMI - Building Amazon machine images with Puppet, Packer and Jenkins
Chasing AMI - Building Amazon machine images with Puppet, Packer and JenkinsTomas Doran
 
Deploying puppet code at light speed
Deploying puppet code at light speedDeploying puppet code at light speed
Deploying puppet code at light speedTomas Doran
 
Thinking through puppet code layout
Thinking through puppet code layoutThinking through puppet code layout
Thinking through puppet code layoutTomas Doran
 
Docker puppetcamp london 2013
Docker puppetcamp london 2013Docker puppetcamp london 2013
Docker puppetcamp london 2013Tomas Doran
 
"The worst code I ever wrote"
"The worst code I ever wrote""The worst code I ever wrote"
"The worst code I ever wrote"Tomas Doran
 
Test driven infrastructure development (2 - puppetconf 2013 edition)
Test driven infrastructure development (2 - puppetconf 2013 edition)Test driven infrastructure development (2 - puppetconf 2013 edition)
Test driven infrastructure development (2 - puppetconf 2013 edition)Tomas Doran
 
London devops - orc
London devops - orcLondon devops - orc
London devops - orcTomas Doran
 
London devops logging
London devops loggingLondon devops logging
London devops loggingTomas Doran
 
Message:Passing - lpw 2012
Message:Passing - lpw 2012Message:Passing - lpw 2012
Message:Passing - lpw 2012Tomas Doran
 
Webapp security testing
Webapp security testingWebapp security testing
Webapp security testingTomas Doran
 
Webapp security testing
Webapp security testingWebapp security testing
Webapp security testingTomas Doran
 
Dates aghhhh!!?!?!?!
Dates aghhhh!!?!?!?!Dates aghhhh!!?!?!?!
Dates aghhhh!!?!?!?!Tomas Doran
 
Messaging, interoperability and log aggregation - a new framework
Messaging, interoperability and log aggregation - a new frameworkMessaging, interoperability and log aggregation - a new framework
Messaging, interoperability and log aggregation - a new frameworkTomas Doran
 
Cooking a rabbit pie
Cooking a rabbit pieCooking a rabbit pie
Cooking a rabbit pieTomas Doran
 

Plus de Tomas Doran (20)

Empowering developers to deploy their own data stores
Empowering developers to deploy their own data storesEmpowering developers to deploy their own data stores
Empowering developers to deploy their own data stores
 
Dockersh and a brief intro to the docker internals
Dockersh and a brief intro to the docker internalsDockersh and a brief intro to the docker internals
Dockersh and a brief intro to the docker internals
 
Sensu and Sensibility - Puppetconf 2014
Sensu and Sensibility - Puppetconf 2014Sensu and Sensibility - Puppetconf 2014
Sensu and Sensibility - Puppetconf 2014
 
Steamlining your puppet development workflow
Steamlining your puppet development workflowSteamlining your puppet development workflow
Steamlining your puppet development workflow
 
Building a smarter application stack - service discovery and wiring for Docker
Building a smarter application stack - service discovery and wiring for DockerBuilding a smarter application stack - service discovery and wiring for Docker
Building a smarter application stack - service discovery and wiring for Docker
 
Chasing AMI - Building Amazon machine images with Puppet, Packer and Jenkins
Chasing AMI - Building Amazon machine images with Puppet, Packer and JenkinsChasing AMI - Building Amazon machine images with Puppet, Packer and Jenkins
Chasing AMI - Building Amazon machine images with Puppet, Packer and Jenkins
 
Deploying puppet code at light speed
Deploying puppet code at light speedDeploying puppet code at light speed
Deploying puppet code at light speed
 
Thinking through puppet code layout
Thinking through puppet code layoutThinking through puppet code layout
Thinking through puppet code layout
 
Docker puppetcamp london 2013
Docker puppetcamp london 2013Docker puppetcamp london 2013
Docker puppetcamp london 2013
 
"The worst code I ever wrote"
"The worst code I ever wrote""The worst code I ever wrote"
"The worst code I ever wrote"
 
Test driven infrastructure development (2 - puppetconf 2013 edition)
Test driven infrastructure development (2 - puppetconf 2013 edition)Test driven infrastructure development (2 - puppetconf 2013 edition)
Test driven infrastructure development (2 - puppetconf 2013 edition)
 
London devops - orc
London devops - orcLondon devops - orc
London devops - orc
 
London devops logging
London devops loggingLondon devops logging
London devops logging
 
Message:Passing - lpw 2012
Message:Passing - lpw 2012Message:Passing - lpw 2012
Message:Passing - lpw 2012
 
Webapp security testing
Webapp security testingWebapp security testing
Webapp security testing
 
Webapp security testing
Webapp security testingWebapp security testing
Webapp security testing
 
Dates aghhhh!!?!?!?!
Dates aghhhh!!?!?!?!Dates aghhhh!!?!?!?!
Dates aghhhh!!?!?!?!
 
Messaging, interoperability and log aggregation - a new framework
Messaging, interoperability and log aggregation - a new frameworkMessaging, interoperability and log aggregation - a new framework
Messaging, interoperability and log aggregation - a new framework
 
Zero mq logs
Zero mq logsZero mq logs
Zero mq logs
 
Cooking a rabbit pie
Cooking a rabbit pieCooking a rabbit pie
Cooking a rabbit pie
 

Test driven infrastructure development

  • 1. Test driven Infrastructure development Tomas (t0m) Doran <tomas.doran@timgroup.com> @bobtfish https://github.com/bobtfish https://github.com/youdevise Thursday, 14 March 13
  • 2–6. ‘Real men’ develop in production!
      • Edit / Commit / Push
      • Update puppetmaster
      • puppet agent -t
      • Repeat
    Notes: Repeat again and again. The development cycle is SLOOOW.
  • 7–10. This is insane!
      • Try it on an 8-person team.
      • ‘LOL - I broke puppet’
    Notes: CHAOS and FAIL result when you break each other. Or, MORE likely (this happens twice a day!)
  • 12–14. OI!!! OI t0m!!!! You broke puppet! AARRRGGH!!!
  • 15–17. Let’s fix this!
      • First, a glossary:
          • mco - mcollective
          • ENC - External node classifier
  • 18–22. We can do better
      • Branch == environment
      • Branch / Commit / Push
      • mco puppetupdate
      • puppet agent -t --environment xxx
    Notes: This at least lets you develop things independently. Everyone can do dev in their own branch and merge once they have something that doesn’t break _everything_. You can also rebase -i (squash) all the ARGH PUPPET SYNTAX commits.
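The ‘branch == environment’ idea can be sketched roughly as follows. This is a minimal illustration, not the puppetupdate code: the function names are hypothetical, and mapping `master` to `production` is an assumption. The one hard constraint is that Puppet environment names may only contain lowercase letters, digits and underscores, so branch names need sanitising first.

```python
import re


def branch_to_environment(branch):
    """Turn a git branch name into a Puppet-safe environment name."""
    # Puppet environment names must match [a-z0-9_]+, so replace anything else.
    name = re.sub(r"[^a-z0-9_]", "_", branch.lower())
    # Assumption: the main branch feeds the default 'production' environment.
    return "production" if name == "master" else name


def agent_command(branch):
    """Build the agent invocation shown on the slide for a given branch."""
    return ["puppet", "agent", "-t", "--environment", branch_to_environment(branch)]
```

With this, a branch called `feature/LB-refactor` would be tested on a node using the environment `feature_lb_refactor`.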
  • 23. Sounds good?
      • Then you’ll be wanting:
      • https://github.com/youdevise/puppetupdate
    Notes: It’s a bit basic, but then I ripped it out of work internal code at 8am ;)
  • 24–25. So we fixed it?
  • 26–28. Refactoring
      • We change things to be consistent across the codebase:
          • Why did puppet just delete all the firewall rules on the production database?
      • We don’t refactor:
          • Add bugs all the time due to inconsistency
    Notes: Sorry Chris, but when you say ‘refactoring’ - it’s not refactoring unless you have tests. The problem is that you can’t always remember to run the right branch on all the right nodes. Or rather, how do you even know what all the right nodes are? And if you’re hacking on custom functions, or anything using exported resources - WOE.
  • 29–33. Unfortunate reality:
      • Hard coded IPs in 10 places
      • role::oy_lb
      • hiera data split by domain (colo)
      • mco puppet
      • 4 weeks per app per environment
    Notes: So, despite our best efforts, our puppet code was SHIIIIT. Exported resources ARE NOT a good fit for non-trivial things (like generating load balancer configs). Ergo lots of hard coded IPs in multiple places. Ergo puppet code per site.
  • 34–39. The state of the art
      • It’s certainly in a state
      • Automatic runs dangerous
      • cron --noop runs
      • puppet becomes an auditing system
      • This isn’t what I signed up for!
    Notes: Nobody does automatic runs. Puppet becomes an auditing tool (automatic noop runs + reports).
  • 41–42. Business says no!
      • Launching new products has a long lead time
      • This is unhelpful if your company is trying to branch out into new markets
      • CI / stage environments unlike prod
      • Issues when new functionality goes live
      • Developers think you’re incompetent
  • 43–47. What is wrong with this picture?
      • Did you run it everywhere?
      • Does it affect anything you’re not expecting?
      • Can you rebuild cleanly?
      • Does the code even make things reflect current state?
    Notes: You just don’t know the answer to any of these questions in any reliable way... But, generally, the answers are NO, YES, NO, NO.
  • 48–51. ‘We use puppet’
      • Means nothing
      • State of your system is the sum of all changes
      • How do you know your code can rebuild things?
    Notes: Hint - you don’t!
  • 52–55. It’s all mierda
      • Development communities are 10 years ahead
      • We don’t integration test (repeatably)
      • We can’t build / rebuild (reliably)
    Notes: We need to grow up, and raise the level of the conversation.
  • 56–61. Infra is hard
      • Infrastructure is inherently more complex
      • Less control
      • More moving parts
      • ‘End to end’ testing
      • Persistent data
    Notes: Sure - it’s much, much harder to get a standalone testable system in infra than it is in development.
  • 62. No excuses: Scientific method
    Notes: I do not consider this an excuse to abandon sanity.
  • 64–66. The solution?
      • Re-provision everything in tests
      • N.B. Not perfect (but better!)
      • Proper software engineering
      • Unit and integration tests
      • Build pipeline + promotion
  • 67–71. OpenStack
      • Our tests spinning up 12 machines => VMs
      • OpenStack is going to be awesome, but right now:
          • Networking sucks
          • Load balancing is a shambles
          • lvs / vlans / metal / bonding - nope
    Notes: So, we should use OpenStack, right? As of December, when we looked - 2 networks max, inflexible. lvs not possible.
  • 73–79. My desires:
      • Reuse as much code as possible! (e.g. load balancers)
      • No per colo/environment puppet code
      • No IPs anywhere
      • ‘DRY’
      • CI pipeline to promote to production
      • 1 puppet run from provisioned to working
      • Repeatable and testable!
  • 81–84. Orc
      • Continuous (zero downtime) deployment
      • Development / infrastructure application contract
      • Model driven
      • https://github.com/youdevise/orc/
  • 86–88. Puppetroll
      • Rolls out a consistent sha1 from the puppetmaster to an entire environment
      • Fails if any puppet run fails
      • https://github.com/youdevise/puppetroll
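The ‘one sha1 everywhere, fail if anything fails’ behaviour might look something like this. It is a sketch under stated assumptions, not puppetroll itself: `run_puppet` is a hypothetical callback that applies the given sha1 on one node and returns True on success.

```python
class RolloutFailed(Exception):
    """Raised when at least one node did not converge on the requested sha1."""


def roll_out(nodes, sha1, run_puppet):
    """Apply the same sha1 on every node; fail the whole roll if any run fails."""
    # Run every node first so the error reports all failures, not just the first.
    results = {node: run_puppet(node, sha1) for node in nodes}
    failed = sorted(node for node, ok in results.items() if not ok)
    if failed:
        raise RolloutFailed(f"{sha1}: puppet failed on {', '.join(failed)}")
    return results
```

Raising on any failure is what makes the roll usable as a CI gate: a partially applied sha1 never counts as a pass.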
  • 90–93. Provisioning tools
      • debootstrap custom gold images
      • mcollective ‘computenode’ agent for kvm
      • ‘provision me a machine called X, on networks Y and Z’
      • Dynamic IP allocation (dnsmasq locally, DDNS for real)
  • 98. stacks • Model driven deployment • DSL for describing groups of systems + dependencies • rake tasks to provision / test / clean up stack + deps • Can provision a full environment, run E2E tests, tear it down - in CI.
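
To make "groups of systems + dependencies" concrete, here is a minimal sketch of such a model and of deriving a provisioning order from it. It is written in Python for illustration only; the real stacks DSL is Ruby driven by rake, and the `Stack` class, its fields, and the machine naming scheme below are all assumptions, not the project's API.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Iterable, List


@dataclass
class Stack:
    """A named group of identical machines plus the stacks it depends on."""
    name: str
    instances: int = 1
    depends_on: List[str] = field(default_factory=list)

    def machine_names(self, env: str) -> List[str]:
        # Hypothetical naming scheme: <environment>-<stack>-<index>.
        return [f"{env}-{self.name}-{i:03d}"
                for i in range(1, self.instances + 1)]


def provision_order(stacks: Iterable[Stack]) -> List[str]:
    """Return stack names ordered so dependencies are provisioned first."""
    graph = {stack.name: set(stack.depends_on) for stack in stacks}
    return list(TopologicalSorter(graph).static_order())


stacks = [
    Stack("lb", instances=2, depends_on=["app"]),
    Stack("app", instances=2, depends_on=["db"]),
    Stack("db"),
]
```

With this model, `provision_order(stacks)` yields `db`, then `app`, then `lb`, and tearing the environment down would walk the same list in reverse.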
  • 101. I want to hack on load balancers = 4 new, independent machines
  • 108. How does it work? • DSL creates model of systems • rake task ‘launch’: mco provisions boxes on compute nodes; each box runs puppet agent --waitforcert; mco signs cert; puppet runs for each box
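
The launch sequence on this slide can be sketched as three ordered phases. This is a Python illustration under stated assumptions: the real task is a rake task that talks to mcollective, the three callables are hypothetical stand-ins for those mco calls, and whether the real tool works in batches or box by box is not stated in the slides.

```python
from typing import Callable, Iterable, List

Step = Callable[[str], None]


def launch(machines: Iterable[str], provision: Step,
           sign_cert: Step, run_puppet: Step) -> List[str]:
    """Drive the launch phases in order and return a log of the steps taken."""
    machines = list(machines)
    log: List[str] = []
    # Phase 1: ask the compute nodes to create every box. Each new box is
    # assumed to start its puppet agent and wait for a signed certificate.
    for name in machines:
        provision(name)
        log.append(f"provision {name}")
    # Phase 2: sign each pending certificate on the puppetmaster.
    for name in machines:
        sign_cert(name)
        log.append(f"sign {name}")
    # Phase 3: with certificates signed, puppet runs on each box.
    for name in machines:
        run_puppet(name)
        log.append(f"puppet {name}")
    return log
```

Returning the log makes the ordering easy to assert on in a test without provisioning anything.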
  • 115. Puppetmaster • Uses the same model • Generates ENC (external node classifier) data for each node • Puppet code just installs things / starts services, i.e. what it’s good at!
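
As a rough illustration of generating ENC data from the model: a Puppet external node classifier is handed a node name and answers with a document containing keys such as `classes` and `environment`. The Python sketch below builds that structure as a dictionary; a real ENC would print it as YAML. The model layout, the `role::` class names, and the `backends` parameter are assumptions made for the example, not the speaker's actual schema.

```python
from typing import Any, Dict

# Hypothetical model: hostnames only, in line with "No IPs anywhere".
MODEL: Dict[str, Dict[str, Any]] = {
    "dev-lb-001": {"role": "loadbalancer",
                   "backends": ["dev-app-001", "dev-app-002"]},
    "dev-app-001": {"role": "appserver", "backends": []},
    "dev-app-002": {"role": "appserver", "backends": []},
}


def enc_for(node: str, model: Dict[str, Dict[str, Any]],
            environment: str = "production") -> Dict[str, Any]:
    """Build the classification an ENC would return for one node."""
    entry = model[node]  # raises KeyError for nodes missing from the model
    return {
        # Class name mapped to its parameters, as the ENC format allows.
        "classes": {f"role::{entry['role']}": {"backends": entry["backends"]}},
        "environment": environment,
    }
```

Because the classification is derived entirely from the model, the Puppet code itself is left to install packages and start services.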
  • 120. Putting it together • Still ongoing - live production apps ETA two weeks. • Still haven’t solved the re-provisioning problem for live environments! • Do have repeatable and testable / tested infrastructure building in CI!
    Speaker notes: So, what do we have? Well, everything I showed you already. We are building the proxy server layer (by refactoring puppet code) right now. Databases to follow!
  • 122. Speaker notes: The top table is our test overview. We have two types of tests: those for a specific machine (i.e. a VM) and those for a virtual service (backed by multiple machines). ‘behaves like’ is an RSpec construct we haven’t overridden. For each machine, we test that it’s pingable, then run every NRPE (Nagios) agent and check
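
The per-machine test described in these notes might look roughly like the sketch below. The notes are cut off before saying what is checked, so the pass condition here is an assumption: each check is treated as passing when it returns the standard Nagios plugin exit status 0 (OK). The talk's own tests are written with RSpec; this Python version and its `ping` and `run_check` callables are hypothetical stand-ins.

```python
from typing import Callable, Iterable, List


def check_machine(host: str, checks: Iterable[str],
                  ping: Callable[[str], bool],
                  run_check: Callable[[str, str], int]) -> List[str]:
    """Return a list of failure messages for one machine (empty means pass)."""
    # An unreachable box fails immediately; there is no point running checks.
    if not ping(host):
        return [f"{host}: not pingable"]
    failures: List[str] = []
    for check in checks:
        status = run_check(host, check)
        # Assumed pass condition: Nagios plugin exit status 0 means OK.
        if status != 0:
            failures.append(f"{host}: {check} returned {status}")
    return failures
```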
  • 127. In the (near) future? • Live application stack in production • Automated ‘promotion’ of good changes to production • Integrated environment support for dev stacks on dev branches/environments • Open source all the things!
  • 130. Thanks! • puppet is an awesome tool. • It doesn’t solve higher level system modeling problems • It shouldn’t try to! • sysadmins need to level up • It’s not done till you can test it still works
  • 131. Photo Credits • Escher's "Relativity" in LEGO - Andrew Lipson (http://www.andrewlipson.com) • Manure - Flickr - chesbayprogram • Provisions - Flickr - quinn.anya • Stacked - Flickr - andrewrennie • Dilbert - Flickr - osde-info • Stacking wood - Flickr - arthuserea • Square wheels - Flickr - vrogy • Puppets - Flickr - SkipSteuart • Light bulb - Flickr - bazik • This-is-not-art - Wikimedia commons - Loran Davis • Danger of death - Flickr - zigazou76 • Bob the Builder - Flickr - jamesclay • Swiss roll - Flickr - add1sun • Orc - Flickr - photo_munki • Danger! Danger - Flickr - donsolo • Cow of the future - Flickr - thewamphyri • SCIENCE - Flickr - chasblackman
  • 132. Links! • http://github.com/youdevise • http://github.com/bobtfish • https://devblog.timgroup.com/ • (Yes, we are hiring)