Version Uncontrolled - How to Manage Your Version Control (whitepaper)

COLLABORATE 15 – IOUG Forum
Middleware
White Paper
Version Uncontrolled! : How to Manage Your Version Control
Harold Dost III, Raastech
ABSTRACT
Are you constantly wondering what is in your production
environment? Do you have any doubts about what code is
running? Chances are that your version control software
isn't the problem. No matter what system you're using
there is the ideal and there is the practice. This white paper
helps you get on the right track by covering topics that
include the types of version control systems, merging and
branching techniques, and methods to get a solid source
control workflow into place. The latter half discusses the
organization of version control and how that can help with
enhancing confidence in your code base.
TARGET AUDIENCE
The intended readers of this paper are those involved with
development, at all experience levels, and especially those
who are looking to drive up the quality of their production
environments.
EXECUTIVE SUMMARY
Readers will learn:
• Differences between kinds of source control
• Basic usage of SVN and Git
• Version control release procedures
BACKGROUND
Whether working in IT, supporting corporate systems, or
developing a new product for a customer, the most
common tool of any department is the version control
system. It is sometimes referred to as source control or
even the source repository, but whatever name a
department uses for it, there are many who simply misuse
and underuse the available features. This failure to use
source control systems to their fullest extent has a number
of implications. Some have never heard of version control
until years into development, and when it is introduced
to them they view it as a necessary evil. Some simply do
not know how to use it, because they never had the
time or interest to learn beyond the few commands they
were taught. Others may just feel that the use of some of
the features is futile. Regardless of the reasons
departments are underusing their version control systems,
utilizing them more completely can drastically improve
the visibility and efficiency of the
development process.
Some of the key features can save employees both anguish
and time. Combining these features with
a well-thought-out release process can provide confidence that
a team knows where changes are, when changes were
made, when they got to production, and who made them.
This kind of visibility goes very far, from
global multi-team efforts on down to two-person teams or
even one-person pet projects.
TECHNICAL DISCUSSIONS AND EXAMPLES
RELEASE
Before diving into the technical details, I would like to
disclose that I tend to be biased towards Git
SCM (Source Code Management), as I find it to be an
incredibly powerful tool. This view is based on my
personal experience, which consists mainly of SVN and Git.
At many companies the typical release process looks
something like this:
When a developer creates code, it is for a
new feature, a bug fix, a performance
enhancement, etc. While creating this code the developer
is hopefully writing tests, not just in a Word document but
in the form of code verifying that the inputs and outputs of the
code are what is expected. Once the developer is satisfied
with the code, it should get reviewed and tested by a
third party. This helps with reducing inefficiencies, protecting
against backdoors to an extent, and maintaining code styling.
After that, the code may be tested in a secondary
environment. For the purposes of this paper we will later
look at what I will call "binary-based" and
"environment-based" projects. A binary-based project would be
something like the Linux Kernel, where there is no one
specific place that it will run in the end. An
environment-based project would be an IT
infrastructure where code often needs to be migrated
up through environments, for example moving code
from development into test and finally production. Even
though in many cases there are binaries involved, the
procedures may be a bit different. Now, even though the
end products may end up being released as a binary or
pushed into a production environment, the biggest
question we are asking here is around the source code.
Where is it? Unfortunately, in many cases the answer is
still "always in trunk". Having code in trunk is not bad, but
having it only there is. The advantage of having the code
only in trunk is that it's easy for developers. Make a change and
commit, make another change and commit. They will do
this repeatedly until there is something ready to test and
eventually move on to production. The problem is that
it's not simple to determine the last version that went to production.
Maybe there is an extra tool that keeps track, but then how
does that information get there? Even if you have such a
tool, it places tracking outside of developer responsibility.
Using the version control system itself can solve this.
CENTRALIZED VS. DISTRIBUTED
Most if not all people in IT by now know what a version
control system is, but they may not realize how many
different ones there are out there, or some of the
inherent differences between them. The major divide in
version control is its organization as a centralized system
or a distributed one.
A centralized version control system relies on a single
repository. Developers check files out and check them
in by pushing changes to the repository. Certain files and
folders may be locked, to help keep them from being
changed while others are working on them. However, those
same locks can be overridden in many cases, giving certain
developers a false sense of security. On local systems
different branches simply appear as folders on the file
system, and this requires having multiple copies of the same file
if the root level is checked out from the central repository.
Also, viewing the commit history or committing changes requires
contacting the central repository. Some of the systems that are
classified as centralized include: Subversion (SVN), Team
Foundation Server (TFS), ClearCase, and Perforce (P4).
For the purposes of this paper Subversion will be used in
examples for centralized repositories.
In a distributed version control system, repositories are
local, meaning that each developer has everything they
need. Instead of needing to contact a server to commit
changes or check the commit history, it's all performed on
the local repository. To propagate changes out to anyone
else's repository, including a "blessed" repository where
everyone would get their latest copy, patches are
exchanged. Also, since a developer has a repository on their
machine, branching locally is cheap on disk space. While
the initial cloning of a repository may take a little time, all
subsequent pushes/pulls only need to exchange what has
changed. Some of the distributed version
control systems include: Fossil, Mercurial, Git, and Bazaar.
For the purposes of this paper, Git will be used to
demonstrate a distributed version control system.
USEFUL VERSION CONTROL COMMANDS
Now that we understand the differences between
centralized and distributed version control we can begin to
look at how they are similar. Over the next few paragraphs
we will be going through the commands of SVN and Git
to show the analogs between the two. Some commands
may not translate well between them, and as a result might
be only mentioned for a single tool.
CREATE A REPOSITORY
To begin we need to have a repository; for SVN this is no
simple task, as from the beginning it requires some central
place to start committing. For information on how to do
this look for the Subversion How To reference at the end of
this paper.
For Git, simply navigate to the directory where the project
should be initialized, and issue the following command:
git init
Now to share it with people it will require some initial
setup. For basic projects a shared network directory or
even a web-based shared directory could be used to host a
common source. If such a directory is used then it must
have a bare repository, which can be created using:
git init --bare test-repo.git
There are also more advanced instructions to create a
scalable, shared repository in How To Create a Remote Shared
Git Repository.
Once a repository is available to share amongst people it
must then be either checked out or cloned. For SVN, a
checkout is performed with the following command:
svn co http://host.com/path/repo
For Git, a clone is performed using:
git clone https://host.com/path/repo.git
Or:
git clone user@server:path/repo.git
The SVN and Git commands may take some time initially,
but for Git all further commits and log requests will be near
instantaneous since the repository is on the local machine.
However, to push/pull changes to and from the remote
repository there are additional commands.
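As a minimal sketch of the shared-repository setup described above, the following creates a bare repository and clones it twice on the same machine. The temporary directory stands in for a shared network directory, and the names alice and bob are hypothetical; a real team would clone over a URL or SSH path as shown earlier.

```shell
set -eu

# Hypothetical scratch location standing in for a shared network directory.
work=$(mktemp -d)
cd "$work"

# The shared repository is bare: it stores history but has no working files.
git init -q --bare test-repo.git

# Each developer clones it. A local path behaves like a URL here.
# Git may warn that the repository is empty, which is expected at this point.
git clone -q test-repo.git alice
git clone -q test-repo.git bob

# Every clone records where it came from as the remote named "origin".
git -C alice config --get remote.origin.url
```

Because both clones point at the same origin, anything one developer pushes there becomes available to the other on the next pull.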
COMMIT CHANGES
To make changes in a repository they must be committed.
Since in SVN there is only a single repository, one only
needs to perform the following commands. Add a file:
svn add test-file.txt
Push changes to the repository:
svn commit -m "Some Message"
This will push whatever changes have been made out to
the central repository. For Git, however, you have the
repository on your machine, so to make changes locally
first add the files that should be included in the commit:
git add file-name.txt
Commit the changes:
git commit -m "Some Message"
Push them to the repository:
git push
With either of these processes there may be merge
conflicts when pushing changes out to the remote repository,
but that is a bit beyond the scope of this section.
While the Git method may require a few more steps, it
doesn't always. Once a file is being tracked, modified files
can be committed easily by just adding -a to the commit
command. The advantage of committing locally is that a developer can
commit changes to their own repository multiple times and then
wait to push the changes out until they are final.
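The Git steps above can be tried end to end with a sketch like the following. It assumes a throwaway bare repository in a temporary directory plays the role of the remote; the identity, file contents, and commit messages are placeholders.

```shell
set -eu

work=$(mktemp -d)
cd "$work"
git init -q --bare shared.git          # stand-in for the remote repository
git clone -q shared.git dev
cd dev
git config user.name "Example Dev"     # placeholder identity for this sketch
git config user.email "dev@example.com"
git symbolic-ref HEAD refs/heads/master  # name the first branch master

# First commit: a new file must be added before it is tracked.
echo "first draft" > file-name.txt
git add file-name.txt
git commit -q -m "Add file-name.txt"

# Second commit: the file is already tracked, so -a picks up the change.
echo "second draft" >> file-name.txt
git commit -q -a -m "Revise file-name.txt"

# Both commits existed only locally until this point.
git push -q origin master
git log --oneline
```

Nothing reaches the shared repository until the final push, which is what lets a developer commit as often as they like while work is in progress.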
BRANCHING AND MERGING
The last feature we will cover is branching and merging, as
this will be important in the next section. To branch in
SVN:
svn copy trunk branches/my_branch
Run in a working copy, this command copies all of the files
locally, essentially doubling the disk space used by what was
originally in trunk. To branch in Git, however:
git branch my-branch
OR
git checkout -b my-branch
The first Git command will make a branch based on the
current commit, whereas the second will also switch the user to that
branch. Neither command copies any files; a new branch is just a
lightweight reference to an existing commit, and it only diverges
once a user commits a change. It is important to note that from a user's
perspective, local branches in SVN are seen as different
directories in the file system. With Git, however, the user stays in
a single directory and the tool tracks which branch is active.
To know which branch a user is on:
git status
This will output:
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean
To change branches use:
git checkout branch-name
Once a developer has made sufficient changes they should
now merge their code. To merge code in SVN, for this
example into trunk:
cd /path/to/repo/trunk
svn merge ../branches/my_branch/
svn commit -m "Merge my_branch"
For Git (assuming master is the current branch):
cd /path/to/repo
git merge my-branch
git push
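The Git branch and merge commands above can be exercised in a scratch repository. This is only a sketch: the file name and commit messages are made up, and there is no remote, so the final push is left out.

```shell
set -eu

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.name "Example Dev"     # placeholder identity for this sketch
git config user.email "dev@example.com"
git symbolic-ref HEAD refs/heads/master

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Create the branch and switch to it in one step.
git checkout -q -b my-branch
echo "feature" >> app.txt
git commit -q -a -m "Work on my-branch"

# Return to master and bring the branch's work in.
git checkout -q master
git merge -q my-branch

# List the branches whose work is now contained in master.
git branch --merged master
```

Since master had not moved while my-branch was being worked on, Git simply advances master to the branch's commit (a fast-forward) rather than creating a merge commit.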
WORKFLOWS
Now that we know how to use our different version
control systems, we need to discuss good methods of using
version control.
In order to combat potentially hazardous behaviors, it is
recommended to set up a workflow and follow it. There are
many out there, but they can be distilled down to a few.
Figure 1: Centralized Workflow
The first workflow is not recommended, yet it is unfortunately
very common amongst IT organizations despite better
workflows being known for many years now. It's the
centralized workflow. There is a single branch that
everyone commits their changes to, and as a result it is a
mishmash of the various states of different sections of
code. This can slow down productivity since consecutive
commits may have nothing to do with each other. Tagging
can't really help here since nothing in the single
branch is guaranteed to be "production-ready".
Figure 2: Feature Branch Workflow
In shops using centralized repositories I have observed a few
trends that tend to be common: merges are something to be
feared, commits are made only when absolutely necessary, and
everything goes into trunk.
The next workflow improves things quite a bit: the Feature
Branch Workflow. Its defining feature is branches. With
them, a developer can create a branch from the
primary branch and work in their personal branch for
however long it takes to get the feature finished. A
developer can have more than one personal branch, and they
can be used for features, bugs, etc. Once the work is complete
a merge can take place. When the merge happens depends
largely on the project and technology set. To use the Linux
Kernel again, the merge would need to take place first and
only then can a build occur, since whatever build tool is used is
going to be pulling from this "primary" branch. With a
self-contained project, where code can be checked out from
any branch to be deployed or otherwise used,
the merge could happen after deployment; but by merging before,
production is always reflected by the version control. Tagging can
be very useful in this case, since only "production-ready"
code is placed into the primary branch.
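A sketch of the feature branch workflow with tagging might look like this. The branch name feature/login and the version 1.0 are hypothetical, and the --no-ff flag is my own choice so that the merge stays visible in history; the text above does not prescribe it.

```shell
set -eu

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.name "Example Dev"     # placeholder identity for this sketch
git config user.email "dev@example.com"
git symbolic-ref HEAD refs/heads/master

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# All work for the feature happens on its own branch.
git checkout -q -b feature/login
echo "login form" > login.txt
git add login.txt
git commit -q -m "Add login form"

# Only finished work is merged into the primary branch.
git checkout -q master
git merge -q --no-ff -m "Merge feature/login" feature/login

# Tag what goes to production so it can be found again later.
git tag -a 1.0 -m "Version 1.0"
git describe --tags
```

With a tag on every release, `git describe --tags` on the primary branch gives a direct answer to the question of which version was last sent to production.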
The last approach presented here builds on the previous one; I
first discovered it when reading an article by Vincent
Driessen. It involves, lo and behold, more branches! Just
as in the previous workflow there will be a "primary" branch, which for
the example will be called master. This should hold your
production code, and nothing else. The other constant
branch is "develop". The develop branch is used as the
intermediary where developers merge their features as they
are done, but before they are ready to be released. Other than the
personal branches there are other transient branches,
namely hotfix and release branches. Finally, tagging plays a
pivotal role in how this model works.
Just as with the previous workflow, developers will
check out personal branches, but will instead branch from
develop. Once their feature is complete they will merge
back into develop. The only exception to this is if there is
need for a hotfix that can't wait for the next major release.
Assume the team is working on 1.5 and currently 1.4.1 has
been released. The developer responsible will create a
branch based on master and make the change. Once it has
been successfully tested and is ready to go to production,
the change will be merged into two places. The first is
master, where the new version tag will be 1.4.2. The second
merge will be to develop so that code still in the develop
branch accounts for the change.
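The 1.4.1 to 1.4.2 hotfix scenario above could be played out as follows. This is a sketch under assumptions: the branch name hotfix/1.4.2, the file names, and the use of --no-ff merges are illustrative choices, not requirements of the model as described here.

```shell
set -eu

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.name "Example Dev"     # placeholder identity for this sketch
git config user.email "dev@example.com"
git symbolic-ref HEAD refs/heads/master

# master holds what is in production: version 1.4.1.
echo "stable code" > app.txt
git add app.txt
git commit -q -m "Release 1.4.1"
git tag -a 1.4.1 -m "Version 1.4.1"

# develop has already moved on to 1.5 work.
git checkout -q -b develop
echo "1.5 work" > feature.txt
git add feature.txt
git commit -q -m "Start a 1.5 feature"

# The hotfix branches from master, not from develop.
git checkout -q -b hotfix/1.4.2 master
echo "patched" >> app.txt
git commit -q -a -m "Fix urgent production bug"

# First merge: into master, tagged as the new production version.
git checkout -q master
git merge -q --no-ff -m "Merge hotfix/1.4.2" hotfix/1.4.2
git tag -a 1.4.2 -m "Version 1.4.2"

# Second merge: into develop, so 1.5 work includes the fix.
git checkout -q develop
git merge -q --no-ff -m "Merge hotfix/1.4.2 into develop" hotfix/1.4.2

# The hotfix branch is transient and can go once both merges are done.
git branch -q -d hotfix/1.4.2
```

After this, master carries the fix without any unreleased 1.5 work, while develop carries both.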
Finally, "release" branches are created from develop, and
are a staging area between develop and master. When the
release branch is created no new features are added to it,
but features can still be merged into develop. Bug fixes are
made directly to the release branch and can be merged
back into develop as often as desired. Once a release is
ready, it is merged into master and tagged. It is also
merged back into develop to reflect any bug fixes that had
not yet been merged.
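A release branch cycle as described above might be sketched like this. The version number 1.5, the file names, and the --no-ff merges are assumptions made for the example.

```shell
set -eu

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.name "Example Dev"     # placeholder identity for this sketch
git config user.email "dev@example.com"
git symbolic-ref HEAD refs/heads/master

echo "stable code" > app.txt
git add app.txt
git commit -q -m "Initial production code"

# A finished feature lands on develop.
git checkout -q -b develop
echo "feature A" > feature-a.txt
git add feature-a.txt
git commit -q -m "Add feature A"

# The release branch is cut from develop; no new features go into it.
git checkout -q -b release/1.5 develop
echo "bug fix" >> feature-a.txt
git commit -q -a -m "Fix bug found while testing 1.5"

# Meanwhile develop keeps accepting features for a later release.
git checkout -q develop
echo "feature B" > feature-b.txt
git add feature-b.txt
git commit -q -m "Add feature B"

# Release day: merge into master and tag.
git checkout -q master
git merge -q --no-ff -m "Release 1.5" release/1.5
git tag -a 1.5 -m "Version 1.5"

# Merge back so develop picks up the release's bug fixes.
git checkout -q develop
git merge -q --no-ff -m "Merge release/1.5 back into develop" release/1.5
```

Feature B never reaches master in this run because it was merged into develop after the release branch was cut.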
Figure 3: Vincent Driessen Workflow
There may be variations of this depending on if a company
is supporting multiple versions of something at the same
time, but that would simply require checking out from a
specific tag and making necessary changes.
Things that none of these models dictate are:
1. How often are things pushed to production?
2. Who pushes to production?
3. What gets pushed to production?
4. Who controls what should go into a release?
The company implementing any of these workflows will
determine many of these things.
BRANCHING
So far this paper has discussed tools, commands, and
workflows. Of the three workflows, two of them
implement branching. The last one especially requires a
number of different branches. With all of these branches
flying around being created and destroyed one of the
things that will become quickly necessary is a naming
scheme. Some examples of naming schemes are:
• Feature – “feature/HAD/00001-some-new-feature”
• Bug – “bug/HAD/010000-blue-screen-of-death-is-red”
• Spike / Experimental
  o “spike/HAD/radical-new-things”
  o “exp/HAD/something-really-awesome”
• Release – “release/1.5” or “release/20150507.1”
Having the purpose of a branch in its name is very useful so that
we can see outstanding bugs and features. It may be
desirable to have developers working on a change include
their initials as part of the branch name. This helps with
quick visual inspections of existing branches. Hopefully
companies are using some sort of system to track changes,
and the number from that system should be used so that there is
some sort of correlation. Additionally, it's a good idea to
have a description of the branch so that people don't
necessarily need to memorize issue numbers. Releases
should only require a version number, maybe including rc,
alpha, beta, etc.
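A naming scheme is easier to keep if it can be checked mechanically. The function below is a hypothetical helper, not part of Git, and its patterns are only my reading of the example names above: upper-case initials, a numeric issue number for features and bugs, and an optional rc/alpha/beta suffix on releases whose exact form is an assumption.

```shell
set -eu

# Hypothetical checker: succeeds when a branch name follows the scheme.
is_valid_branch() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq \
        -e '^(feature|bug)/[A-Z]+/[0-9]+-[a-z0-9]+(-[a-z0-9]+)*$' \
        -e '^(spike|exp)/[A-Z]+/[a-z0-9]+(-[a-z0-9]+)*$' \
        -e '^release/[0-9]+(\.[0-9]+)*(-(rc|alpha|beta)[0-9]*)?$'
}

# Report on a few sample names.
for name in "feature/HAD/00001-some-new-feature" "release/1.5" "my-stuff"; do
    if is_valid_branch "$name"; then
        echo "ok:  $name"
    else
        echo "bad: $name"
    fi
done
```

A check like this could be run by hand before pushing or wired into whatever review step a team already has; where it runs is a team decision.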
GETTING THERE
Assuming that your team doesn't use a well-formed
workflow, you probably want to move towards one of
these workflows. The steps towards using one at the team
level are relatively easy. First you'll need to set aside a little
time to think about how you can improve your personal
workflow, and then move out into the team. Start by
practicing on a dummy project. Create a branch and go
through your typical process, but add in
branching and merging. Next, if your team is using the
centralized workflow, there is nothing stopping you
individually from branching for all of your assigned
features. Once you feel comfortable with how the process
works, make a case for it at your next team meeting. If
there aren't regular meetings, ask for one, and
if that doesn't work, then appeal to some of your
co-workers and show them how it helps you. It can grow
organically from you out to your co-workers. However, to
take it to the next level will require team and sometimes
unit level cooperation. With your team on board and the
benefits of these workflows demonstrated, making the business
case should be simple.
APPENDICES
APPENDIX A: MIGRATION FROM SVN TO GIT
http://www.subgit.com/remote-book.html
APPENDIX B: ENTERPRISE GIT MANAGEMENT TOOLS
https://enterprise.github.com/
https://www.atlassian.com/software/stash
http://www.gitenterprise.com/pricing.html
https://about.gitlab.com/pricing/
REFERENCES
Atlassian. (2015). Comparing Workflows. Retrieved 2015, from Atlassian:
https://www.atlassian.com/git/tutorials/comparing-workflows/
Bansal, N. (2011). HOWTO: Hosting a Subversion Repository. Retrieved 2015, from University Of Toronto:
http://queens.db.toronto.edu/~nilesh/linux/subversion-howto/
Driessen, V. (2010). A successful Git branching model. Retrieved 2015, from Nvie:
http://nvie.com/posts/a-successful-git-branching-model/
Git SCM. (2014). Git Book. Retrieved 2015, from Git SCM:
http://git-scm.com/book/en/v2/
Kovshenin, K. (2011). How To Create a Remote Shared Git Repository. Retrieved 2015, from Kovshenin:
https://kovshenin.com/2011/howto-remote-shared-git-repository/
Tutorials Point. (2014). SVN Tutorial. Retrieved 2015, from Tutorials Point:
http://www.tutorialspoint.com/svn/
