SlideShare a Scribd company logo
1 of 25
How Jenkins Builds the Netflix Global Streaming Service

   @bmoyles @garethbowles #netflixcloud
The Engineering Tools Team
Make build, test and deploy simple and pain-free
Various skills - developers, build/release engineers,
devops guys
CI infrastructure
Common Build Framework
Bakery
Asgard
Jenkins History
 Jenkins (and H***** before it) in use at Netflix since
 2009
 Rapid acceptance and increase in usage, thanks to:
   Jenkins’ flexibility and ease of use
   Our Common Build Framework
 Quantity, size and variety of jobs means we need
 automation and extensions
What We Build
Large number of loosely-coupled Java Web Services
Common code in libraries that can be shared across
apps
Each service is “baked” - installed onto a base Amazon
Machine Image and then created as a new AMI ...
... and then deployed into a Service Cluster (a set of
Auto Scaling Groups running a particular service)
So our unit of deployment is the AMI
Getting Built
Build Pipeline                                                       ASGs


                    Artifactory      yum            Bakery        Asgard

  Jenkins               Jars/WARs        RPMs             AMIs




    sync        resolve    compile   publish       bake      deploy
                          Common Build Framework

           source


Perforce
       GitHub
Jenkins Setup for a New Build Job
Ant Setup for a New Build Job
build.xml
<project name="helloworld">
    <import file="../../../Tools/build/webapplication.xml"/>
</project>

ivy.xml
<info organisation="netflix" module="helloworld">
  <publications>
    <artifact name="helloworld" type="package"
              e:classifier="package" ext="tgz"/>
    <artifact name="helloworld" type="javadoc"
              e:classifier="javadoc" ext="jar"/>
  </publications>
  <dependencies>
    <dependency org="netflix" name="resourceregistry"
                 rev="latest.${input.status}" conf="compile"/>
    <dependency org="netflix" name="platform"
                 rev="latest.${input.status}" conf="compile" />
    ...
Jenkins at Netflix
Not Just a Build Server

Monitoring of our Cassandra clusters
Automated integration tests, including bake
and deploy
Housekeeping and monitoring of
 the build / deploy infrastructure
 our AWS environment
Jenkins Statistics
1600 job definitions, 50% SCM triggered
2000 builds per day
Common Build Framework changes trigger 800 builds
2TB of build data
Over 100 plugins
Current Jenkins Architecture
           (One Master to Build Them All)
x86_64 slave 11
 x86_64 slave 1
  x86_64 slave
 buildnode01 1
   x86_64 slave
       Standard
  buildnode01                                   custom slaves
   buildnode01
    buildnode01                                 custom slaves
                                                custom slaves
      slave group                             misc. architecture
                                                custom slaves
                                              misc. architecture
                                              misc. architecture
                                                 custom slaves
     Amazon Linux        Single Master        misc. architecture
       m1.xlarge                              misc. architecture
                                                 Ad-hoc slaves
                        Red Hat Linux
                     2x quad core x86_64   misc. O/S & architectures
                           26G RAM


x86_64 slave 11
 x86_64Custom
  x86_64slave 1
           slave
 buildnode01
                                             ~40 custom slaves
  buildnode01
       slave group
   buildnode01                              maintained by product
     Amazon Linux                                   teams
          various




AWS us-west-1 VPC     Netflix data center      Netflix DC and office
Scaling Challenges -
and How We Met Them
Jenkins Scaling Challenges
Thundering herds clogging Jenkins, downstream
systems
Making global changes across jobs
Making sure plugins can handle 1600 jobs
Testing new Jenkins and plugin versions
Backups
New Architecture (Cloudy
                 with a Chance of Builds)
     x86_64
  x86_64 slave
     slave 1
       Build                                      custom
   slave1
  buildnode01
         group              Build Master
                                                  custom
                                                   slaves
  Amazon Linux                                 Ad-hoc slaves
                                                   slaves
                         CentOS or Ubuntu          misc.
    m1.xlarge                                   misc. O/S &
                                                   misc.
                            m2.2xlarge
                                               architectures
                          (4 core / 32GB)

    x86_64
x86_64 slave
     Custom
     slave 1
       1
  slave groups                                    custom
 buildnode01
 Amazon Linux                                     custom
                            Test Master            slaves
     various                                   Ad-hoc slaves
                                                   slaves
                         CentOS or Ubuntu          misc.
                                                misc. O/S &
                                                   misc.
                            m2.2xlarge
                                               architectures
                          (4 core / 32GB)
   x86_64
x86_64 slave
   slave 1
  Test slave
      1
buildnode01
    group
Amazon Linux
  m1.xlarge
                        AWS us-west-1 VPC   AWS and Netflix office
Deploying a New Master
 EBS
           Live                    Test
                                           EBS
          master                  master


                              ENI
         Standby 1     builds.netflix.com
 EBS   Standby 2
EBS



       Active ASG              Test ASG
            Jenkins Master cluster
Building a New Master
Jenkins build watches http://mirrors.jenkins-ci.org/war/
latest/jenkins.war and downloads the new Jenkins
WAR file
Download latest versions of our installed plugins (from a
list stored externally)
Create an RPM containing the Jenkins webapp
Bake an AMI with the RPM installed under Netflix
Tomcat
Job Configuration Management
Configuration Slicing plugin
  Great for the subset of config options that it handles
  Don’t want to extend it for each new option
Job DSL plugin
  Set up new jobs with minimal definition, using
  templates and a Groovy-based DSL
  Change the template once to update all the jobs that
  use it
Old and busted...




                     The new
                    hotness...

      etc.
More Maintenance
System Groovy scripts to clean up old job configs and
associated artifacts:
  Disable jobs that haven’t built successfully for 60
  days
  Set # of builds to keep if unset
  Mark builds permanent if installed on active instances
  Delete unreferenced JARs from Artifactory
Keeping Slaves Busy

System Groovy scripts to monitor slave status and
report on problems
The Dynaslave plugin: slave self-registration
Further Reading
http://techblog.netflix.com
http://www.slideshare.net/netflix
https://github.com/netflix
http://jobs.netflix.com
   @netflixoss
Thank You To Our Sponsors
Thank you


 @bmoyles @garethbowles
Thank you
Questions?



  @bmoyles @garethbowles

More Related Content

What's hot

e2e testing with cypress
e2e testing with cypresse2e testing with cypress
e2e testing with cypressTomasz Bak
 
Using Azure DevOps to continuously build, test, and deploy containerized appl...
Using Azure DevOps to continuously build, test, and deploy containerized appl...Using Azure DevOps to continuously build, test, and deploy containerized appl...
Using Azure DevOps to continuously build, test, and deploy containerized appl...Adrian Todorov
 
Building a CICD pipeline for deploying to containers
Building a CICD pipeline for deploying to containersBuilding a CICD pipeline for deploying to containers
Building a CICD pipeline for deploying to containersAmazon Web Services
 
What Is DevOps? | Introduction To DevOps | DevOps Tools | DevOps Tutorial | D...
What Is DevOps? | Introduction To DevOps | DevOps Tools | DevOps Tutorial | D...What Is DevOps? | Introduction To DevOps | DevOps Tools | DevOps Tutorial | D...
What Is DevOps? | Introduction To DevOps | DevOps Tools | DevOps Tutorial | D...Edureka!
 
CI/CD Best Practices for Your DevOps Journey
CI/CD Best  Practices for Your DevOps JourneyCI/CD Best  Practices for Your DevOps Journey
CI/CD Best Practices for Your DevOps JourneyDevOps.com
 
DevOps Approach (Point of View by Ravi Tadwalkar)
DevOps Approach (Point of View by Ravi Tadwalkar)DevOps Approach (Point of View by Ravi Tadwalkar)
DevOps Approach (Point of View by Ravi Tadwalkar)Ravi Tadwalkar
 
What is Jenkins | Jenkins Tutorial for Beginners | Edureka
What is Jenkins | Jenkins Tutorial for Beginners | EdurekaWhat is Jenkins | Jenkins Tutorial for Beginners | Edureka
What is Jenkins | Jenkins Tutorial for Beginners | EdurekaEdureka!
 
Infrastructure as Code
Infrastructure as CodeInfrastructure as Code
Infrastructure as CodeRobert Greiner
 
Integrating Automated Testing into DevOps
Integrating Automated Testing into DevOpsIntegrating Automated Testing into DevOps
Integrating Automated Testing into DevOpsTechWell
 
Designing a complete ci cd pipeline using argo events, workflow and cd products
Designing a complete ci cd pipeline using argo events, workflow and cd productsDesigning a complete ci cd pipeline using argo events, workflow and cd products
Designing a complete ci cd pipeline using argo events, workflow and cd productsJulian Mazzitelli
 
DevOps Overview
DevOps OverviewDevOps Overview
DevOps OverviewSagar Mody
 
An Introduction To Jenkins
An Introduction To JenkinsAn Introduction To Jenkins
An Introduction To JenkinsKnoldus Inc.
 

What's hot (20)

CICD with Jenkins
CICD with JenkinsCICD with Jenkins
CICD with Jenkins
 
e2e testing with cypress
e2e testing with cypresse2e testing with cypress
e2e testing with cypress
 
Using Azure DevOps to continuously build, test, and deploy containerized appl...
Using Azure DevOps to continuously build, test, and deploy containerized appl...Using Azure DevOps to continuously build, test, and deploy containerized appl...
Using Azure DevOps to continuously build, test, and deploy containerized appl...
 
Building a CICD pipeline for deploying to containers
Building a CICD pipeline for deploying to containersBuilding a CICD pipeline for deploying to containers
Building a CICD pipeline for deploying to containers
 
What Is DevOps? | Introduction To DevOps | DevOps Tools | DevOps Tutorial | D...
What Is DevOps? | Introduction To DevOps | DevOps Tools | DevOps Tutorial | D...What Is DevOps? | Introduction To DevOps | DevOps Tools | DevOps Tutorial | D...
What Is DevOps? | Introduction To DevOps | DevOps Tools | DevOps Tutorial | D...
 
CI/CD Best Practices for Your DevOps Journey
CI/CD Best  Practices for Your DevOps JourneyCI/CD Best  Practices for Your DevOps Journey
CI/CD Best Practices for Your DevOps Journey
 
DevOps and Tools
DevOps and ToolsDevOps and Tools
DevOps and Tools
 
DevOps Best Practices
DevOps Best PracticesDevOps Best Practices
DevOps Best Practices
 
DevOps Approach (Point of View by Ravi Tadwalkar)
DevOps Approach (Point of View by Ravi Tadwalkar)DevOps Approach (Point of View by Ravi Tadwalkar)
DevOps Approach (Point of View by Ravi Tadwalkar)
 
What is Jenkins | Jenkins Tutorial for Beginners | Edureka
What is Jenkins | Jenkins Tutorial for Beginners | EdurekaWhat is Jenkins | Jenkins Tutorial for Beginners | Edureka
What is Jenkins | Jenkins Tutorial for Beginners | Edureka
 
Cypress Automation
Cypress  AutomationCypress  Automation
Cypress Automation
 
Infrastructure as Code
Infrastructure as CodeInfrastructure as Code
Infrastructure as Code
 
Azure DevOps
Azure DevOpsAzure DevOps
Azure DevOps
 
DevOps on AWS
DevOps on AWSDevOps on AWS
DevOps on AWS
 
Integrating Automated Testing into DevOps
Integrating Automated Testing into DevOpsIntegrating Automated Testing into DevOps
Integrating Automated Testing into DevOps
 
Jenkins Overview
Jenkins OverviewJenkins Overview
Jenkins Overview
 
Designing a complete ci cd pipeline using argo events, workflow and cd products
Designing a complete ci cd pipeline using argo events, workflow and cd productsDesigning a complete ci cd pipeline using argo events, workflow and cd products
Designing a complete ci cd pipeline using argo events, workflow and cd products
 
DevOps Overview
DevOps OverviewDevOps Overview
DevOps Overview
 
Jenkins
JenkinsJenkins
Jenkins
 
An Introduction To Jenkins
An Introduction To JenkinsAn Introduction To Jenkins
An Introduction To Jenkins
 

Viewers also liked

Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016DataStax
 
Continuous Delivery at Netflix
Continuous Delivery at NetflixContinuous Delivery at Netflix
Continuous Delivery at NetflixRob Spieldenner
 
Outside The Box With Apache Cassnadra
Outside The Box With Apache CassnadraOutside The Box With Apache Cassnadra
Outside The Box With Apache CassnadraEric Evans
 
Cassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQLCassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQLEric Evans
 
C* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick BransonC* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick BransonDataStax Academy
 
Building Cloud Tools for Netflix with Jenkins
Building Cloud Tools for Netflix with JenkinsBuilding Cloud Tools for Netflix with Jenkins
Building Cloud Tools for Netflix with JenkinsGareth Bowles
 
Netflix API - Separation of Concerns
Netflix API - Separation of ConcernsNetflix API - Separation of Concerns
Netflix API - Separation of ConcernsDaniel Jacobson
 
NetflixOSS meetup lightning talks and roadmap
NetflixOSS meetup lightning talks and roadmapNetflixOSS meetup lightning talks and roadmap
NetflixOSS meetup lightning talks and roadmapRuslan Meshenberg
 
Cassandra at Instagram (August 2013)
Cassandra at Instagram (August 2013)Cassandra at Instagram (August 2013)
Cassandra at Instagram (August 2013)Rick Branson
 
Continuous delivery applied
Continuous delivery appliedContinuous delivery applied
Continuous delivery appliedMike McGarr
 
Cassandra Summit 2014: Cassandra @ Netflix: Building a House of Cards on a So...
Cassandra Summit 2014: Cassandra @ Netflix: Building a House of Cards on a So...Cassandra Summit 2014: Cassandra @ Netflix: Building a House of Cards on a So...
Cassandra Summit 2014: Cassandra @ Netflix: Building a House of Cards on a So...DataStax Academy
 
Continuous Delivery at Netflix, and beyond
Continuous Delivery at Netflix, and beyondContinuous Delivery at Netflix, and beyond
Continuous Delivery at Netflix, and beyondMike McGarr
 
Cassandra Virtual Node talk
Cassandra Virtual Node talkCassandra Virtual Node talk
Cassandra Virtual Node talkPatrick McFadin
 
Devops at Netflix (re:Invent)
Devops at Netflix (re:Invent)Devops at Netflix (re:Invent)
Devops at Netflix (re:Invent)Jeremy Edberg
 
The DynaSlave Plugin
The DynaSlave PluginThe DynaSlave Plugin
The DynaSlave PluginBrian Moyles
 
Etsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureEtsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureDan McKinley
 

Viewers also liked (17)

Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
Everyday I'm Scaling... Cassandra (Ben Bromhead, Instaclustr) | C* Summit 2016
 
Continuous Delivery at Netflix
Continuous Delivery at NetflixContinuous Delivery at Netflix
Continuous Delivery at Netflix
 
Outside The Box With Apache Cassnadra
Outside The Box With Apache CassnadraOutside The Box With Apache Cassnadra
Outside The Box With Apache Cassnadra
 
Cassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQLCassandra: Not Just NoSQL, It's MoSQL
Cassandra: Not Just NoSQL, It's MoSQL
 
C* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick BransonC* Summit 2013: Cassandra at Instagram by Rick Branson
C* Summit 2013: Cassandra at Instagram by Rick Branson
 
Building Cloud Tools for Netflix with Jenkins
Building Cloud Tools for Netflix with JenkinsBuilding Cloud Tools for Netflix with Jenkins
Building Cloud Tools for Netflix with Jenkins
 
Netflix API - Separation of Concerns
Netflix API - Separation of ConcernsNetflix API - Separation of Concerns
Netflix API - Separation of Concerns
 
NetflixOSS meetup lightning talks and roadmap
NetflixOSS meetup lightning talks and roadmapNetflixOSS meetup lightning talks and roadmap
NetflixOSS meetup lightning talks and roadmap
 
Cassandra at Instagram (August 2013)
Cassandra at Instagram (August 2013)Cassandra at Instagram (August 2013)
Cassandra at Instagram (August 2013)
 
Continuous delivery applied
Continuous delivery appliedContinuous delivery applied
Continuous delivery applied
 
Cassandra Summit 2014: Cassandra @ Netflix: Building a House of Cards on a So...
Cassandra Summit 2014: Cassandra @ Netflix: Building a House of Cards on a So...Cassandra Summit 2014: Cassandra @ Netflix: Building a House of Cards on a So...
Cassandra Summit 2014: Cassandra @ Netflix: Building a House of Cards on a So...
 
Continuous Delivery at Netflix, and beyond
Continuous Delivery at Netflix, and beyondContinuous Delivery at Netflix, and beyond
Continuous Delivery at Netflix, and beyond
 
Cassandra Virtual Node talk
Cassandra Virtual Node talkCassandra Virtual Node talk
Cassandra Virtual Node talk
 
Devops at Netflix (re:Invent)
Devops at Netflix (re:Invent)Devops at Netflix (re:Invent)
Devops at Netflix (re:Invent)
 
The DynaSlave Plugin
The DynaSlave PluginThe DynaSlave Plugin
The DynaSlave Plugin
 
Etsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureEtsy Activity Feeds Architecture
Etsy Activity Feeds Architecture
 
Culture
CultureCulture
Culture
 

Similar to How Jenkins Builds the Netflix Global Streaming Service

Lightweight Virtualization in Linux
Lightweight Virtualization in LinuxLightweight Virtualization in Linux
Lightweight Virtualization in LinuxSadegh Dorri N.
 
AWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul Maddox
AWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul MaddoxAWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul Maddox
AWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul MaddoxAWS Riyadh User Group
 
Linux container & docker
Linux container & dockerLinux container & docker
Linux container & dockerejlp12
 
Windsor: Domain 0 Disaggregation for XenServer and XCP
	Windsor: Domain 0 Disaggregation for XenServer and XCP	Windsor: Domain 0 Disaggregation for XenServer and XCP
Windsor: Domain 0 Disaggregation for XenServer and XCPThe Linux Foundation
 
LOAD BALANCING OF APPLICATIONS USING XEN HYPERVISOR
LOAD BALANCING OF APPLICATIONS  USING XEN HYPERVISORLOAD BALANCING OF APPLICATIONS  USING XEN HYPERVISOR
LOAD BALANCING OF APPLICATIONS USING XEN HYPERVISORVanika Kapoor
 
Docker: under the hood
Docker: under the hoodDocker: under the hood
Docker: under the hoodJake Wise
 
Creating an effective developer experience on Kubernetes
Creating an effective developer experience on KubernetesCreating an effective developer experience on Kubernetes
Creating an effective developer experience on KubernetesLenses.io
 
Docker - Portable Deployment
Docker - Portable DeploymentDocker - Portable Deployment
Docker - Portable Deploymentjavaonfly
 
KVM and docker LXC Benchmarking with OpenStack
KVM and docker LXC Benchmarking with OpenStackKVM and docker LXC Benchmarking with OpenStack
KVM and docker LXC Benchmarking with OpenStackBoden Russell
 
Weave User Group Talk - DockerCon 2017 Recap
Weave User Group Talk - DockerCon 2017 RecapWeave User Group Talk - DockerCon 2017 Recap
Weave User Group Talk - DockerCon 2017 RecapPatrick Chanezon
 
CI and CD at Scale: Scaling Jenkins with Docker and Apache Mesos
CI and CD at Scale: Scaling Jenkins with Docker and Apache MesosCI and CD at Scale: Scaling Jenkins with Docker and Apache Mesos
CI and CD at Scale: Scaling Jenkins with Docker and Apache MesosCarlos Sanchez
 
Automating Your CloudStack Cloud with Puppet
Automating Your CloudStack Cloud with PuppetAutomating Your CloudStack Cloud with Puppet
Automating Your CloudStack Cloud with Puppetbuildacloud
 
Unikernels: the rise of the library hypervisor in MirageOS
Unikernels: the rise of the library hypervisor in MirageOSUnikernels: the rise of the library hypervisor in MirageOS
Unikernels: the rise of the library hypervisor in MirageOSDocker, Inc.
 
(ATS3-DEV02) Scripting with .NET Assemblies in Symyx Notebook
(ATS3-DEV02) Scripting with .NET Assemblies in Symyx Notebook(ATS3-DEV02) Scripting with .NET Assemblies in Symyx Notebook
(ATS3-DEV02) Scripting with .NET Assemblies in Symyx NotebookBIOVIA
 
Collaborate vdb performance
Collaborate vdb performanceCollaborate vdb performance
Collaborate vdb performanceKyle Hailey
 
ContainerDays Boston 2015: "CoreOS: Building the Layers of the Scalable Clust...
ContainerDays Boston 2015: "CoreOS: Building the Layers of the Scalable Clust...ContainerDays Boston 2015: "CoreOS: Building the Layers of the Scalable Clust...
ContainerDays Boston 2015: "CoreOS: Building the Layers of the Scalable Clust...DynamicInfraDays
 
Dev opsec dockerimage_patch_n_lifecyclemanagement_2019
Dev opsec dockerimage_patch_n_lifecyclemanagement_2019Dev opsec dockerimage_patch_n_lifecyclemanagement_2019
Dev opsec dockerimage_patch_n_lifecyclemanagement_2019kanedafromparis
 
Unikernels: Rise of the Library Hypervisor
Unikernels: Rise of the Library HypervisorUnikernels: Rise of the Library Hypervisor
Unikernels: Rise of the Library HypervisorAnil Madhavapeddy
 
Kernel maintainance in Linux distributions: Debian
Kernel maintainance in Linux distributions: DebianKernel maintainance in Linux distributions: Debian
Kernel maintainance in Linux distributions: DebianAnne Nicolas
 

Similar to How Jenkins Builds the Netflix Global Streaming Service (20)

Lightweight Virtualization in Linux
Lightweight Virtualization in LinuxLightweight Virtualization in Linux
Lightweight Virtualization in Linux
 
AWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul Maddox
AWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul MaddoxAWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul Maddox
AWS reinvent 2019 recap - Riyadh - Containers and Serverless - Paul Maddox
 
Linux container & docker
Linux container & dockerLinux container & docker
Linux container & docker
 
Windsor: Domain 0 Disaggregation for XenServer and XCP
	Windsor: Domain 0 Disaggregation for XenServer and XCP	Windsor: Domain 0 Disaggregation for XenServer and XCP
Windsor: Domain 0 Disaggregation for XenServer and XCP
 
LOAD BALANCING OF APPLICATIONS USING XEN HYPERVISOR
LOAD BALANCING OF APPLICATIONS  USING XEN HYPERVISORLOAD BALANCING OF APPLICATIONS  USING XEN HYPERVISOR
LOAD BALANCING OF APPLICATIONS USING XEN HYPERVISOR
 
Docker: under the hood
Docker: under the hoodDocker: under the hood
Docker: under the hood
 
Creating an effective developer experience on Kubernetes
Creating an effective developer experience on KubernetesCreating an effective developer experience on Kubernetes
Creating an effective developer experience on Kubernetes
 
Docker - Portable Deployment
Docker - Portable DeploymentDocker - Portable Deployment
Docker - Portable Deployment
 
KVM and docker LXC Benchmarking with OpenStack
KVM and docker LXC Benchmarking with OpenStackKVM and docker LXC Benchmarking with OpenStack
KVM and docker LXC Benchmarking with OpenStack
 
Weave User Group Talk - DockerCon 2017 Recap
Weave User Group Talk - DockerCon 2017 RecapWeave User Group Talk - DockerCon 2017 Recap
Weave User Group Talk - DockerCon 2017 Recap
 
CI and CD at Scale: Scaling Jenkins with Docker and Apache Mesos
CI and CD at Scale: Scaling Jenkins with Docker and Apache MesosCI and CD at Scale: Scaling Jenkins with Docker and Apache Mesos
CI and CD at Scale: Scaling Jenkins with Docker and Apache Mesos
 
Automating Your CloudStack Cloud with Puppet
Automating Your CloudStack Cloud with PuppetAutomating Your CloudStack Cloud with Puppet
Automating Your CloudStack Cloud with Puppet
 
Unikernels: the rise of the library hypervisor in MirageOS
Unikernels: the rise of the library hypervisor in MirageOSUnikernels: the rise of the library hypervisor in MirageOS
Unikernels: the rise of the library hypervisor in MirageOS
 
(ATS3-DEV02) Scripting with .NET Assemblies in Symyx Notebook
(ATS3-DEV02) Scripting with .NET Assemblies in Symyx Notebook(ATS3-DEV02) Scripting with .NET Assemblies in Symyx Notebook
(ATS3-DEV02) Scripting with .NET Assemblies in Symyx Notebook
 
Collaborate vdb performance
Collaborate vdb performanceCollaborate vdb performance
Collaborate vdb performance
 
The State of Linux Containers
The State of Linux ContainersThe State of Linux Containers
The State of Linux Containers
 
ContainerDays Boston 2015: "CoreOS: Building the Layers of the Scalable Clust...
ContainerDays Boston 2015: "CoreOS: Building the Layers of the Scalable Clust...ContainerDays Boston 2015: "CoreOS: Building the Layers of the Scalable Clust...
ContainerDays Boston 2015: "CoreOS: Building the Layers of the Scalable Clust...
 
Dev opsec dockerimage_patch_n_lifecyclemanagement_2019
Dev opsec dockerimage_patch_n_lifecyclemanagement_2019Dev opsec dockerimage_patch_n_lifecyclemanagement_2019
Dev opsec dockerimage_patch_n_lifecyclemanagement_2019
 
Unikernels: Rise of the Library Hypervisor
Unikernels: Rise of the Library HypervisorUnikernels: Rise of the Library Hypervisor
Unikernels: Rise of the Library Hypervisor
 
Kernel maintainance in Linux distributions: Debian
Kernel maintainance in Linux distributions: DebianKernel maintainance in Linux distributions: Debian
Kernel maintainance in Linux distributions: Debian
 

Recently uploaded

Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 

Recently uploaded (20)

Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 

How Jenkins Builds the Netflix Global Streaming Service

  • 1. How Jenkins Builds the Netflix Global Streaming Service @bmoyles @garethbowles #netflixcloud
  • 2. The Engineering Tools Team Make build, test and deploy simple and pain-free Various skills - developers, build/release engineers, devops guys CI infrastructure Common Build Framework Bakery Asgard
  • 3. Jenkins History Jenkins (and H***** before it) in use at Netflix since 2009 Rapid acceptance and increase in usage, thanks to: Jenkins’ flexibility and ease of use Our Common Build Framework Quantity, size and variety of jobs means we need automation and extensions
  • 4. What We Build Large number of loosely-coupled Java Web Services Common code in libraries that can be shared across apps Each service is “baked” - installed onto a base Amazon Machine Image and then created as a new AMI ... ... and then deployed into a Service Cluster (a set of Auto Scaling Groups running a particular service) So our unit of deployment is the AMI
  • 6. Build Pipeline ASGs Artifactory yum Bakery Asgard Jenkins Jars/WARs RPMs AMIs sync resolve compile publish bake deploy Common Build Framework source Perforce GitHub
  • 7. Jenkins Setup for a New Build Job
  • 8. Ant Setup for a New Build Job build.xml <project name="helloworld"> <import file="../../../Tools/build/webapplication.xml"/> </project> ivy.xml <info organisation="netflix" module="helloworld"> <publications> <artifact name="helloworld" type="package" e:classifier="package" ext="tgz"/> <artifact name="helloworld" type="javadoc" e:classifier="javadoc" ext="jar"/> </publications> <dependencies> <dependency org="netflix" name="resourceregistry" rev="latest.${input.status}" conf="compile"/> <dependency org="netflix" name="platform" rev="latest.${input.status}" conf="compile" /> ...
  • 10. Not Just a Build Server Monitoring of our Cassandra clusters Automated integration tests, including bake and deploy Housekeeping and monitoring of the build / deploy infrastructure our AWS environment
  • 11. Jenkins Statistics 1600 job definitions, 50% SCM triggered 2000 builds per day Common Build Framework changes trigger 800 builds 2TB of build data Over 100 plugins
  • 12. Current Jenkins Architecture (One Master to Build Them All) x86_64 slave 11 x86_64 slave 1 x86_64 slave buildnode01 1 x86_64 slave Standard buildnode01 custom slaves buildnode01 buildnode01 custom slaves custom slaves slave group misc. architecture custom slaves misc. architecture misc. architecture custom slaves Amazon Linux Single Master misc. architecture m1.xlarge misc. architecture Ad-hoc slaves Red Hat Linux 2x quad core x86_64 misc. O/S & architectures 26G RAM x86_64 slave 11 x86_64Custom x86_64slave 1 slave buildnode01 ~40 custom slaves buildnode01 slave group buildnode01 maintained by product Amazon Linux teams various AWS us-west-1 VPC Netflix data center Netflix DC and office
  • 13. Scaling Challenges - and How We Met Them
  • 14. Jenkins Scaling Challenges Thundering herds clogging Jenkins, downstream systems Making global changes across jobs Making sure plugins can handle 1600 jobs Testing new Jenkins and plugin versions Backups
  • 15. New Architecture (Cloudy with a Chance of Builds) x86_64 x86_64 slave slave 1 Build custom slave1 buildnode01 group Build Master custom slaves Amazon Linux Ad-hoc slaves slaves CentOS or Ubuntu misc. m1.xlarge misc. O/S & misc. m2.2xlarge architectures (4 core / 32GB) x86_64 x86_64 slave Custom slave 1 1 slave groups custom buildnode01 Amazon Linux custom Test Master slaves various Ad-hoc slaves slaves CentOS or Ubuntu misc. misc. O/S & misc. m2.2xlarge architectures (4 core / 32GB) x86_64 x86_64 slave slave 1 Test slave 1 buildnode01 group Amazon Linux m1.xlarge AWS us-west-1 VPC AWS and Netflix office
  • 16. Deploying a New Master EBS Live Test EBS master master ENI Standby 1 builds.netflix.com EBS Standby 2 EBS Active ASG Test ASG Jenkins Master cluster
  • 17. Building a New Master Jenkins build watches http://mirrors.jenkins-ci.org/war/ latest/jenkins.war and downloads the new Jenkins WAR file Download latest versions of our installed plugins (from a list stored externally) Create an RPM containing the Jenkins webapp Bake an AMI with the RPM installed under Netflix Tomcat
  • 18. Job Configuration Management Configuration Slicing plugin Great for the subset of config options that it handles Don’t want to extend it for each new option Job DSL plugin Set up new jobs with minimal definition, using templates and a Groovy-based DSL Change the template once to update all the jobs that use it
  • 19. Old and busted... The new hotness... etc.
  • 20. More Maintenance System Groovy scripts to clean up old job configs and associated artifacts: Disable jobs that haven’t built successfully for 60 days Set # of builds to keep if unset Mark builds permanent if installed on active instances Delete unreferenced JARs from Artifactory
  • 21. Keeping Slaves Busy System Groovy scripts to monitor slave status and report on problems The Dynaslave plugin: slave self-registration
  • 23. Thank You To Our Sponsors
  • 24. Thank you @bmoyles @garethbowles
  • 25. Thank you Questions? @bmoyles @garethbowles

Editor's Notes

  1. Abstract: Hi, I&amp;#x2019;m Gareth and this is Brian, we&amp;#x2019;re from Netflix. Welcome to part 2 of the Netflix invasion of the Jenkins User Conference !\n\nOver the last couple of years, Netflix&amp;#x2019; streaming movie and TV service has moved almost completely to the cloud, using Amazon Web Services. You guys may well have seen some of the great presentations and blog posts by our engineers talking about how our production service works - but what does it take to get that service into production ? This talk will delve into our build and deployment architecture, concentrating on how we use Jenkins as the centrepiece of our production line for the cloud. \n
  2. Brian and I work on the Engineering Tools team at Netflix. There are 6 of us here today (about half our team, including Justin who gave the talk on getting to your 3rd plugin this morning), so find any of us later if you want to know more about Netflix (especially if you&amp;#x2019;d like to work with us, we&amp;#x2019;re hiring !)\n\nOur team is responsible for making standard tools for all our engineers to make building, testing and deploying their products simple and pain-free. \n\nWe&amp;#x2019;re now a fairly big team with a diverse range of skills, everyone can go pretty far up and down the stack (although don&amp;#x2019;t ask me to do any UI work if you don&amp;#x2019;t want to make your eyes bleed !)\n\nOur team runs the Netflix CI infrastructure, including a Common Build Framework that we wrote to make it trivial for developers to create builds for their products.\n\nAmong other tools that we wrote are the Bakery, which creates our custom Amazon Machine Images for deploying to the cloud, and Asgard, the open source console that you can use to manage your AWS deployments. More on these coming up.\n
  3. We&amp;#x2019;ve been using Jenkins (and its predecessor who Shall Not Be Named) for a few years now and it&amp;#x2019;s become a vital part of our CI infrastructure. \n\nUse of Jenkins caught on really fast, partly because it&amp;#x2019;s easy for people who aren&amp;#x2019;t professional build engineers to use it, and also due to our Common Build Framework that makes it a trivial operation to set up a new build job (more on that later). In fact, Netflix&amp;#x2019;s product teams don&amp;#x2019;t have full time release engineers; our Engineering Tools team aims to make the build / test / deploy process trivial enough not to get in the way of product development. Jenkins has been a big help with that. \n\nWe&amp;#x2019;ve become one of the bigger Jenkins users (although not the biggest, for those of you who saw the &amp;#x201C;Planned Parenthood&amp;#x201D; talk by the Intel guys earlier, they are an order of magnitude bigger than us) quite fast, and that&amp;#x2019;s given our team quite a few challenges in meeting the demand. We&amp;#x2019;ve come up with custom plugins, scripts and use patterns for Jenkins that we think are pretty cool and certainly make our lives easier; we&amp;#x2019;ll share many of them in this talk. \n
  4. Let&amp;#x2019;s start with some background on what we build using Jenkins.\n\nTo get to the cloud, we rearchitected the Netflix streaming service from a monolithic Java webapp into many individual modules implemented as loosely-coupled web services, usually web applications or shared libraries (JARs). Most services implement a server and a client library that other services can use to talk to it.\n\nWe decided that our unit of deployment would be the Amazon Machine Image - a complete description of a server including the operating system, not just an install package. This frees us from having to maintain a configuration server such as Puppet or Chef, and makes it very easy to roll back and forward between versions of a service - you just kill instances of a version you don&amp;#x2019;t want and spin up instances with a different version. Netflix is all about continually trying out new features and rolling forward (preferably) or back fast if needed.\n\nWe maintain a base AMI that encapsulates our selected o/s - currently CentOS, soon to be Ubuntu - plus a host of other basic services that provide monitoring, logging, web and application servers (Apache HTTPD and Tomcat) and much more. We have an application called the Bakery (soon to be open sourced) which handles installation of Netflix services onto a base AMI to create a custom AMI for each service.\n\nOnce the AMIs are ready, we deploy them into AWS Auto Scaling Groups using Asgard, our open source management console. Asgard is a fantastic piece of work that goes way beyond what the standard AWS Management Console can do, and I encourage any of you who use AWS to check it out.\n\nNote that a key aspect of using so many shared services is that each service team has to rebuild often in order to pick up changes from the other services that they depend on. This is the CONTINUOUS part of continuous integration and is where Jenkins comes in.\n
  5. Here are a few details on how we build all those cloud services.\n
  6. Here are the tools that form our continuous build and deployment pipeline. Jenkins jobs can provide the entire workflow.\n\nWe wrote a Common Build Framework, based on Ant with some custom Groovy scripting, that&amp;#x2019;s used by all our development teams to build different kinds of libraries and apps. (If you think Ant is a bit long in the tooth, so do we; we&amp;#x2019;re working on a brand new build framework using Gradle, but that will take a while to finish up and get everyone converted over).\n\nFor the continuous integration to run all those builds, we picked Jenkins because it&amp;#x2019;s very feature rich, easy to extend, and has a very active and helpful community (thanks, guys !). \n\nWe use Perforce for our version control system as it&amp;#x2019;s arguably the best centralized VCS available. But we&amp;#x2019;re making increasing use of Git; for example, our many open sourced projects are all hosted on GitHub, and we use Jenkins to build them. \n\nWe publish library JARs and application WAR files to the Artifactory binary repository tool. This lets us access and add to the metadata for each published artifact, and allows us to add Apache Ivy to Ant to handle dependency resolution. So each project only has to know about its immediate dependencies.\n\nJARs and WARs are then bundled into RPM packages for installation.\n\nI mentioned the bake and deploy process for the finished artifacts just now. The Common Build Framework includes Ant tasks that access the Bakery and Asgard APIs to run these operations. Being able to do that workflow from a Jenkins job can save a lot of time; for example, our API team was able to move from one deployment every two weeks to multiple deployments per week.\n\nIt&amp;#x2019;s worth mentioning that the full automated deployment cycle in mainly used in our test environment; deploying to production needs extra checks and balances that tend to be specific to the service being deployed, many of those aren&amp;#x2019;t yet fully automated. \n\n\n
  7. Here is all you need to do in Jenkins to set up a typical project&amp;#x2019;s build job. You just tell Jenkins where to find the source code and add in the Common Build Framework, then specify where your Ant build file is and what targets to call from it.\n\nEven this minimal setup can be tedious when repeated for a large number of jobs, so we&amp;#x2019;re working on a plugin to make it even more trivial; more on that later.\n\n
  8. And here is a simple project&amp;#x2019;s Ant build file and Ivy publish &amp; dependency definition file. \n\nYou can see the Ant code simply pulls in one of the standard framework entry points like, library, webapplication, etc. \n\nThe Ivy publications section just defines what type of artifacts you want to publish; the Common Build Framework has the knowledge of how to create those artifacts.\n\n\n
  9. Let&amp;#x2019;s take a closer look at how we use Jenkins as the core of our build infrastructure, plus a few other interesting uses we&amp;#x2019;ve come up with.\n
  10. At its heart Jenkins is just a really nice job scheduler, so we&amp;#x2019;ve found lots of other uses for it. Here are some of the main ones.\n\nOur Cassandra team has their own Jenkins instance for monitoring the health of our Cassandra clusters and performing automated repairs on them. This one is a great testimonial for Jenkins&amp;#x2019; ease of use; Engineering Tools just gave the Cassandra guys a basic Jenkins master and some slaves, then they set up their own jobs, a custom dashboard, and even turned the blue balls for build success to green ones. \n\nHousekeeping jobs usually use system Groovy scripts for access to the Jenkins runtime. We effectively use Jenkins to manage Jenkins and its associated apps, in a similar way to the Infrastructure jobs on https://ci.jenkins-ci.org.\n\nWe&amp;#x2019;ve posted a few of these scripts to the public Scriptler repository and will be adding more. Please check them out and feel free to make changes.\n\n\n
  11. The CBF build floods run in 2 stages: the first main one is triggered by the SCM change, then downstream test and deploy jobs are triggered by Jenkins. \n\n\n
  12. Our Jenkins master runs on a physical server in our data center. \n\nWe use the master strictly as a build orchestrator and a runner for housekeeping scripts; we don&amp;#x2019;t run regular builds on the master to avoid overloading it.\n\nSlave servers are used to execute the actual builds. Our standard slaves can each run 4 simultaneous builds. Custom slave groups are set up for requirements such as C/C++ builds or jobs with high CPU or memory needs.\n\nWe make extensive use of slave labelling to control the type of slave that each job executes on, and we have housekeeping jobs that make sure that build jobsare using the correct labels. \n\nWe vary the number of standard slaves from 10 to 30 depending on demand. This is currently a manual operation (but all you have to do is run a Jenkins job !); we&amp;#x2019;re working on autoscaling.\n\nOur cloud slaves are set up in an AWS Virtual Private Cloud (VPC), which provides common network access between our data centre and AWS. Amazon&amp;#x2019;s us-west-1 region is physically located close to our data centre, so latency is not an issue.\n\nAd-hoc slaves in our DC or office are used by individual teams if they need an O/S variant other than those on our standard slaves, or a specific tool or licensed app. Typical uses are for our Partner Product Development team, which makes the Netflix clients that run on TVs, DVD players, smartphones, tablets, and soon - I&amp;#x2019;m reliably informed - toasters. \n\nWe keep our standard slaves updated by maintaining a common set of tools (JDKs, Ant, Groovy, etc.) on the master and syncing the tools to the slaves when they are restarted. Custom slaves can also use this mechanism if they choose.\n\nThe single master has definitely reduce maintenance overhead and, thanks to continuing performance improvements from the Jenkins community, it&amp;#x2019;s handled our expanding workload pretty well up to now. Still, it&amp;#x2019;s a single point of failure, and we&amp;#x2019;re starting to run up against a number of scaling issues with this architecture. \n\nThat&amp;#x2019;s the present - now I&amp;#x2019;ll hand it over to Brian to take us into the future.\n\n\n \n
  13. Thanks Gareth. I&amp;#x2019;m going to talk about some of the scaling challenges we&amp;#x2019;ve faced in building and growing our Jenkins infrastructure, and what we&amp;#x2019;ve done to address them.\n
  14. We&amp;#x2019;ve run into a number of scaling challenges as we&amp;#x2019;ve evolved our build pipeline: Thundering herd problems, modifying and managing 1600 jobs, making sure those 1600 jobs work from Jenkins version to Jenkins version, plugin version to plugin version, and so on.\n\nA good example problem: when we update our build framework, this often triggers all SCM triggered jobs that use the CBF. We can scale our slave pool up, but the slaves push to artifactory, resolve deps from artifactory, and so on, so more slaves means more artifactory load. It&amp;#x2019;s a balancing act.\n\nOur goal, of course, is to have one button build/test/deploy with as little human intervention as possible, and make the developer&amp;#x2019;s life as pain-free as we can. All of these scaling issues get in the way of that. Here are some of the ways we&amp;#x2019;re working on eliminating them.\n
  15. To start, we&amp;#x2019;re working on a sharded model for our builds.\nOur current thought is we&amp;#x2019;ll split the current master into multiple AWS instances dedicated to different types of work: standard CI builds, integration tests, non-Java builds for our device clients, etc.\n\nThe set of plugins for each master can be streamlined to fit the type of job. Isolate long-running jobs so we don&amp;#x2019;t bounce Jenkins for maintenance in the middle of someone&amp;#x2019;s 10 hour integration test, schedule regular planned maintenance for the long-running cluster, etc.\n\nWe&amp;#x2019;ll use the same bake &amp; deploy architecture that powers our streaming service to run the Jenkins masters and slaves.\n\n\n\n \n
  16. To deploy the Jenkins master, we&amp;#x2019;ll use the same &amp;#x201C;Red / Black deploy&amp;#x201D; principle that our streaming service uses in production. We launch and test the master in a manner that prevents it from taking on traffic. When we&amp;#x2019;re satisfied with where things are, we flip a switch, and the test becomes the real master.\n\nTo go into more detail:\nThe live ASG will have a master that&amp;#x2019;s running builds, plus one or more standby instances with builds disabled. We store job data on EBS volumes--the Amazon term for persistent storage that doesn&amp;#x2019;t disappear with cloud nodes. The job data is synced regularly from the live master to the standbys.\n\nThe test ASG has a similar job data setup, but uses a representative subset of jobs to give us confidence in the new master and plugin versions.\n\nIf we want to switch to a new master version, or fail over to one of the hot standbys, we bring down the live master, attach the live job data volume to the new master, and finally do a network switch to make the new master live. We&amp;#x2019;re planning to use Elastic Network Interfaces (ENIs) to do the network switchover; these are virtual network interfaces which can be moved from one EC2 instance to another. \n\nWe haven&amp;#x2019;t fully worked out how we will do job data replication and network switchover yet, so we&amp;#x2019;d be interested to hear your experiences if you have a similar setup. CloudBees&amp;#x2019; HA plugin might be an option here; we&amp;#x2019;ll definitely be looking at that.\n\n\n\n\n\n\n
  17. We&amp;#x2019;ve set up a Jenkins build to make us a new master image as soon as a new Jenkins release is available.\nWe could have used the standard Jenkins RPMs, but this way lets us deploy to the same Netflix Tomcat used by production services and get things like monitoring, logging, Apache proxy etc. for free from our base AMI.\n\n\n
  18. How do we change everyone to a new version of a tool, add a new plugin to everyone&amp;#x2019;s job or make other common configuration changes to a large number of jobs ?\n\nThe Configuration Slicing plugin is really nice as long as you want to change one of the subset of options that it handles.\n\nJustin from our team, who presented on &amp;#x201C;Getting to Your Third Plugin&amp;#x201D; earlier, has written a job DSL plugin that will allow us to create job templates and simplify the process of configuring new jobs, using a Groovy DSL to define the configuration. Here&amp;#x2019;s a quick peek at how easy that is:\n\n\n
  19. On the left, a typical job configuration page.\nOn the right, a peek at the groovy DSL you&amp;#x2019;d use to set up a job. This can be made even smaller using templates--pull out the job name and repo location, and you can have a very tiny, simple, reusable representation of a much more complex object, allowing you to do things in a very repeatable, consistent fashion.\n\n
  20. We also employ a number of system groovy scripts for housekeeping both inside Jenkins itself and outside. Jobs will disable abandoned jobs that have been failing for 60 days, ensure everyone&amp;#x2019;s retaining some build history, flagging builds as permanent if they get used in production, and so on...\n\n
  21. To help us keep builds moving through the system, we need to keep our slave fleet in shape.\n\nWe have system groovy scripts keeping watch over the slaves that complain when something comes up that requires attention. For example, If a specialized slave goes offline for an extended period of time, the team that owns the slave will be notified so they can take action.\n\nWe also wrote the dynaslave plugin to help manage slave registration, isolation, and cleanup after disconnect.\nA quick note for those who cannot make it to the lightning talk--the dynaslave plugin, very simply, allows for slave self-registration without having to tell Jenkins anything about their actual infrastructure. We&amp;#x2019;ll be open sourcing this plugin, so check out the lightning talk or look for our slides on the subject after the conference.\n\nWe currently manage some aspects of the dynaslave system using groovy scripts as well. For those that saw our colleague Justin Ryan&amp;#x2019;s presentation earlier, always reach for Groovy first when you feel the temptation to play with plugin internals :)\n
  22. If you found that interesting, you can read more at these links, especially the jobs link :) \nCheck out the Netflix OSS twitter account for announcements around our open source efforts.\n\n
  23. A big thanks to Cloudbees and the sponsors of the conference for allowing us the opportunity to speak to everyone\n
  24. And thanks to you all for listening. Questions?\n