The old staging methodology is broken for modern development. In fact, the staging server is a leftover from when we built monolithic applications. Find out why microservice architectures are driving ephemeral testing environments, and why dev shops of every size should deliver true continuous deployment.
Staging servers slow down development with merge conflicts, slow iteration loops, and labor-intensive processes. To build better software faster in 2017, containers and infrastructure as code are key. DevOps professionals miss this talk at their own peril.
2. About Chloe
‣ Developer, Mentor/Advocate
‣ Developer Evangelist at Codefresh
‣ Blogger of all things container, Docker, and diversity related
@ChloeCondon
3. Codefresh is a Docker-Native CI/CD platform for Dev
teams. It allows teams to automate the process of
building, testing, and deploying containerized
applications.
ABOUT CODEFRESH
4. • Common CI/CD implementations
• Challenges
• Containers and their impact on traditional CI/CD
AGENDA
5. pull-request
✓unit tests
✓code review
develop/staging
✓unit tests
✓integration test
✓performance test
✓manual testing
✓security testing
✓build & push to registry
master
✓unit tests
✓push to registry
✓deploy to production
COMMON CI/CD PROCESS IMPLEMENTATION
21. feature-branch
✓unit tests
✓integration test
✓performance test
✓UX
pull-request
✓unit tests
✓integration test
✓performance test
✓ui & manual test
✓UX
develop
✓unit tests
✓integration test
✓SLA based performance test
✓ui & manual test
master
DOCKER NATIVE PIPELINE
22. DOCKER NATIVE
✓ Test every build
✓ Unit, Integration, Security, etc
✓ Staging before pull request
24x FASTER SOFTWARE DEVELOPMENT
My name is Chloe and I’m a developer evangelist at Codefresh.
Today I’ll be talking about the somewhat controversial topic of “Why You Need to Stop Using the Staging Server”. I know this is a bold statement, but I really mean it!
What I want to show you in this talk is a new way of looking at the staging environment. For quite some time we’ve been implementing CI/CD with a staging server, and the staging server has served as a very meaningful part of our pipelines. But what I’d like to show you today is that, given today’s latest technologies, we can rethink our pipeline and the way we implement continuous integration and continuous delivery, and in fact make it much faster and more streamlined.
Before we dive in, a little about me: my name is Chloe, I live in the Bay Area and help organize the Containers 101 group on meetup. I’m a developer as well as an advocate and mentor for women in tech. I’m the developer evangelist at Codefresh, and I’m a blogger of all things container, Docker, and diversity related.
So, a little bit about Codefresh. We are a Docker-native CI/CD platform that automates the process of building, testing, and deploying containers. It’s built with Docker users in mind, since we’ve seen how containers have disrupted the way we build and deploy applications.
Codefresh is free to use (repos, builds, users).
Here’s a mini-agenda for what I’ll be talking about today. First, we’ll be talking about Common CI/CD implementations, the challenges we run into with those common implementations, and finally we’ll chat about containers and their impact on traditional CI/CD.
How many of you have seen git flow before, or know git flow? Great, so I don’t need to go into too much detail here. The main thing I want to highlight in these slides is that many of you have probably implemented CI slightly differently, so I won’t preach about what this “should” look like. But I want to show the main model of git flow. A real quick rundown: we have the master branch (what we have in production) and the staging/develop branch (what we have in staging). Whenever developers get a task assigned, they branch off the develop branch, work on a feature branch, and at some point make a pull request. If approved, the pull request is merged back into staging, which is eventually merged back into master/production.
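To make that flow concrete, here is a minimal sketch of the gitflow branching model in a throwaway repository. The branch names, commit messages, and placeholder identity are illustrative, not anything from the talk:

```shell
set -e
# Throwaway repo under /tmp to demonstrate the gitflow branching model.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"   # placeholder identity for the demo
git config user.name "Dev"
git commit -q --allow-empty -m "initial production state"
git branch -M master                      # master = what runs in production
git checkout -q -b develop                # develop = the staging branch
git checkout -q -b feature/login          # branch off develop for a task
git commit -q --allow-empty -m "add login feature"
git checkout -q develop                   # pull request approved:
git merge -q --no-ff feature/login -m "merge feature into staging"
git checkout -q master                    # release:
git merge -q develop -m "merge staging into production"
```

After the last merge, the feature commit is reachable from master, mirroring the feature → staging → production path described above.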
Now, if we look at the set of activities at each point in time: while we’re writing our code on the feature branch, it’s likely (if we’re covering our bases) that we’re also writing unit tests. As good practice, we write our unit tests and check that they pass and that we have good coverage. So eventually, when we make a pull request, we see that the code changes are covered, and we also have peers testing and reviewing our code.
The point being that when it hits staging (in the traditional way of doing continuous integration), it’s hitting a much broader set of tests. That’s where we start running not only unit tests, but integration tests, UI tests using Selenium, performance tests, and in some cases manual tests on our code.
Now let’s look at this on a slightly different timeline. You’ll see we’re working on our feature branch and making a pull request, but staging is where most of the heavy-lifting tests actually take place. On the bottom part of this timeline, you’ll also see that this is the most costly point at which to go and fix issues. If we find issues while we’re working on our feature branch, it’s much easier to incorporate the changes and make the fixes; the later we find the issues, the more expensive and costly they are.
Another aspect of the traditional way we develop and implement continuous integration is the “handshake” between development and deployment. A typical handshake will include a label/stamp in the code base, the test report and results, sometimes some proposed changes to the DB, and a list of known issues (communicated as part of the release notes).
So, what are the challenges? We’ve touched on a few, but I’ll review a couple that we commonly see.
Firstly, we’re all familiar with “someone broke staging”. Usually it holds us all back until it can be fixed. This is frustrating for the whole team, but even more so for the developer who made the change and broke the staging environment for everyone else. That’s likely to happen when we implement a process in which our code changes are being tested for UI and integration in the staging environment for the first time.
In this case, as a developer, the first time people are seeing my feature (up against the main branch) is at staging. This scenario is incredibly frustrating: if I’m given any feedback, I’m unable to act on it, since the typical attitude is “well, let’s deploy it now and fix it later”. And I don’t know when those changes will be added, or whether I’ll have to re-prioritize them. I’d much rather be able to implement that feedback now rather than later.
And last but not least are the frictions we have (which is a great segue into containers). There are always frictions between development environments, staging environments, and production environments, and issues usually stem from these differences rather than from the code base itself. These are flaws in the process that make it very time-consuming to find out why issues are happening, and eventually we discover that the issue isn’t about the code, but about some environment variable that differs between environments.
So, to summarize before I talk a little more about containers: the main issue here is that code changes and pull requests are being tested much more extensively (for the first time) in staging, and not earlier in the life cycle. Secondly, there is no room for feedback in most cases. When product owners and customers communicate, there’s a common joke about “what the customer had in mind” versus what the developer built versus how the salesperson described it. But if the first time everyone can see the feature is in staging, that’s always too late. The typical response is “it’s already in staging, let’s push what we have”, and the correction or fix is deferred to the next iteration, whenever that may be. And finally, there’s the friction between the environments.
So, here’s the fun part. How many of you are already working with containers/docker? How many in production?
Containers, as I mentioned earlier, have fundamentally changed a lot of what we can do as developers. They have reshaped the way we build software, they make the operations part much easier for us, and they help us rethink the way we did things before.
I’ll start with the very basic impact of containers on CI/CD and our pipelines, and then get a little more complex. The first and most basic thing is that we can run every step of our CI inside a container. As you know, when we work with containers, each time we run a new container it’s a fresh instance of that image. So we almost completely eliminate the chance of new builds being impacted by leftovers from previous builds: we’re isolating the builds and getting a fresh container each time we re-run the build and compile the code.
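That isolation idea can be sketched as a small shell helper, where each CI step runs in a disposable container. This is a hypothetical sketch: the helper name, image tags, and test commands are my own illustrations, and it assumes Docker is installed wherever it actually runs:

```shell
set -e
# Run one CI step inside a throwaway container: --rm discards the container
# afterwards, so no state leaks from one build into the next.
ci_step() {
  image=$1; shift
  docker run --rm -v "$PWD":/src -w /src "$image" "$@"
}

# Hypothetical usage (image names and commands are examples only):
# ci_step python:3.6 python -m pytest tests/
# ci_step node:8 npm test
```

Because every invocation starts from the pristine image, re-running a failed step reproduces it exactly, with no residue from earlier runs.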
The second aspect is that the unit itself that we move from one stage to the other, that we hand off to operations or deploy into production, is a much more reliable and self-contained unit. When we know our Docker image has passed our unit tests, integration tests, UI tests, security tests, and so on, the likelihood of that image suddenly behaving completely differently in staging or production is much lower than when we just move code from one stage to another. That’s why Docker images are a much more complete way to describe our application and our code: it’s not only our code, but everything else we need to run it.
So, zooming out a bit from the Docker image itself: one of the strong drivers of images and containers is the adoption of microservices. Containers have been built from day one to support microservices, with all the ways we can define the linkages between them, whether we’re using Docker Compose, Kubernetes, Mesos, etc. All of these have slightly different ways of defining how the application works. But, in fact, containers allow us to define an application made of more than one container or microservice much more easily, which ultimately allows us to clone our staging environment much more easily. We can create a staging-like environment much more simply.
Now, the reason I’m saying “staging-like environment” and not “staging environment” is that we still have a rule that our staging environment be as identical as possible to what we have in production, because there are certain things we still want to use the staging environment for. For example, let’s say I’m running a retail website and I want to make sure that on Black Friday I can support 100 thousand concurrent users. So I’m going to build a staging environment that is as scalable as our production environment, and I can test that there. But for many other testing types (UI tests, integration, etc.) we don’t necessarily need the scalability of production; we just need all the pieces of the application so we can work on and test our features. So microservices allow us to clone much earlier in the lifecycle (with a staging-like environment).
And with that staging-like environment, to reiterate the frustration I mentioned before about not being able to act on feedback: if I receive that feedback much earlier, I can actually implement it. That way, when my feature goes to staging, it works the way I intended, and there’s nothing holding me back from implementing the changes. By having this environment earlier in the process, I can share it with customers, product owners, and other stakeholders, and I can also start running a much broader set of tests.
So, you may be asking, “we can’t really have an environment exactly like the staging environment, so how can I test performance much earlier?”. The typical way we do this (and the way we do it with our customers at Codefresh) is that you test for your application’s SLA in the staging environment, since that’s the environment that is as identical as possible to production. But you can also run performance tests much earlier in the life cycle on a less scalable environment, and this allows you to track trends. If you test performance earlier and see over time that it is degrading rather than improving, that can be a trigger to go back and revisit what you did, and tackle it much earlier in the life cycle.
If we go back and look at our staging environment and the sets of tests we ran there for the first time, we see that all of these can be shifted to the left. There is no reason to stick to the way we’ve implemented CI and let the staging environment be the first place we test our code for these things. The more we shift to the left, the faster we identify issues, the fewer bottlenecks we have before production, the more streamlined our pipeline is, and the faster we can push our code changes to production.
Last but not least, when we look at our Docker image, it’s a great revised way of looking at what our deliverables are. We’re handing off a Docker image that is much more self-contained. If an issue is found in a Docker image (vs. a branch), re-running the same image alongside all the other microservices makes it much easier to reproduce and tackle.
This is the same git flow from before: we have the feature branch, master, and staging. But if we’re working on the feature branch, there is nothing holding us back from provisioning a node and doing this in an automated fashion with docker-compose up (or other great open source projects out there). If you’re unfamiliar with Docker Compose, it’s one way of zooming out from a single microservice: it lets you define more than one microservice, plus the volumes and networks between them. It’s an abstract way of defining an application, and it allows you to get instances of your application earlier and on demand. Docker Compose and Docker Swarm are great ways to scale your application, and Kubernetes is also a great way to run containers, with its own ways of describing your application. All of these technologies let you describe how your application runs, so that much earlier in the lifecycle I can get a running environment of my feature branch with everything else around it. I can send a link to my team, get feedback, and incorporate it much earlier.
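As a sketch of what such an application definition can look like, here is a minimal, hypothetical docker-compose.yml for a two-service app. The service names, image tag, port, and volume are all illustrative assumptions, not anything specific to Codefresh:

```yaml
# Hypothetical two-service application definition for a staging-like env.
version: "3"
services:
  web:                 # the feature branch's service, built from local source
    build: .
    ports:
      - "8080:8080"
    depends_on:
      - db
  db:                  # a dependency the feature needs in order to be tested
    image: postgres:9.6
    volumes:
      - dbdata:/var/lib/postgresql/data
volumes:
  dbdata:
```

Running docker-compose up -d against a file like this brings up an on-demand, staging-like instance of the whole application that can be shared per feature branch and torn down when you’re done.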
And we can do that in an automated way! So not only can a developer spin it up and share it with the team, we can embed it into our continuous integration script: spin up an environment, run our integration tests or unit tests (together with the other services), then shut the environment down. We can even parallelize these test suites. We’ll still have a last check at staging, but the likelihood of finding issues there is much, much smaller, since we’ve done all these checks before.
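Sketching that automation: a small wrapper that brings the environment up, runs whatever test command it is given, and always tears the environment down again. This is an assumed pattern, not Codefresh’s implementation; it uses the modern docker compose CLI and presumes a docker-compose.yml describing the application’s services, and all names are illustrative:

```shell
set -e
# Spin up a staging-like environment, run the given test command against it,
# and tear the environment down whether the tests pass or fail.
run_in_ephemeral_env() {
  docker compose up -d
  status=0
  "$@" || status=$?        # capture the test result without aborting early
  docker compose down -v   # always clean up the ephemeral environment
  return $status
}

# Hypothetical usage inside a CI script:
# run_in_ephemeral_env docker compose exec -T api python -m pytest tests/integration
```

Because teardown happens unconditionally, a failing test run cannot leave a half-alive environment behind, so many such environments can be created and destroyed in parallel, one per build.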
So next, since I mentioned some Docker container technologies and orchestrators: you can go and implement all of this yourself (maybe some of you have, or are currently working on it). You’ll need scripts to provision the nodes for the Docker daemon, or clusters on Kubernetes, and to monitor them. But I’m going to show you how this can be implemented when you build a product; you can think of it as a layer on top of these technologies. I’ll show how we can set up a pipeline, run the steps in that pipeline, and from those steps spin up the app, run the unit tests, and shut it down. This gives us the ability to run complex tests simultaneously, on every build, and not just as part of the staging environment.