Using Git as your VCS with Bioconductor
1. Using Git as your VCS with
Bioconductor
DVCS for fun, profit,* and free tools
Tim Yates
Software Architect
Applied Computational Biology and Bioinformatics
Paterson Institute
(* not a guarantee)
2. What I hope to show
• git
• git-svn
• an example project and workflow
3. What I hope to avoid
• Preaching too much
• An all-encompassing git tutorial
– http://book.git-scm.com/
– http://www-cs-students.stanford.edu/~blynn/gitmagic/
4. What is Git?
• Distributed
• Fast
• Cheap branches
• Local commits
• Push/Pull to multiple remote locations (ssh://
git:// etc://)
• Simple collaboration
• Github (killer app?) -- issue management, forking,
pull requests, wiki pages, inline comments
5. Why do I like it?
(or: what do I mostly use when working with git?)
• Don’t need to be connected (or even finished with a
feature) to commit my code into version control
• Can branch locally to try new things or new features
• All branches/trunk in one folder
• Inter-branch cherry-picking
• git stash (for switching when you’re half way through
something, and get an emergency)
• Can just commit small bits of files if you need to
• Amazing 3rd party tools
• Github (just beginning to use it, but can see this
increasing)
Basically, I find git just gets out of the way and lets you work on the code...
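The git stash point above can be sketched with plain git in a throwaway repository. This is only an illustration: the file names, commit messages and identity settings are made up for the example and are not from the talk.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the branch name for the demo
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Half way through something when an emergency arrives
echo "half-finished idea" >> notes.txt
git stash > /dev/null                     # shelve the uncommitted edit

# Deal with the emergency on a clean working tree
echo "urgent fix" > fix.txt
git add fix.txt
git commit -q -m "Emergency fix"

git stash pop > /dev/null                 # bring the half-finished edit back
stash_result=$(tail -n 1 notes.txt)
echo "$stash_result"
```

After the pop, the unfinished line is back in notes.txt and the emergency fix sits in history as its own commit.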
6. Introducing git-svn
• Acts as a bridge between the two
• Clone svn repository
• Work locally with Git
• Commit changes back to SVN
7. Clone a remote svn repository
git svn clone -s https://url reponame
• Would clone the whole repository, branches and all
• The -s parameter tells git we have a ‘standard’ /trunk
/branches svn repo structure (not the case with
bioconductor)
• In general the whole repository is too much
8. How far back do you want to go?
$ svn log --limit 4 https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/xmapcore
------------------------------------------------------------------------
r59924 | d.tenenbaum | 2011-10-31 23:12:28 +0000 (Mon, 31 Oct 2011) | 1 line
version bump for start of 2.10 development cycle
------------------------------------------------------------------------
r59920 | d.tenenbaum | 2011-10-31 22:59:03 +0000 (Mon, 31 Oct 2011) | 1 line
bumped package version numbers prior to creating 2.9 branch
------------------------------------------------------------------------
r59218 | t.yates | 2011-10-14 12:51:06 +0100 (Fri, 14 Oct 2011) | 1 line
Up to v1.7.10 XMAPCORE-47 Check for annmap_ as db prefix
------------------------------------------------------------------------
| t.yates | 2011-08-10 09:09:48 +0100 (Wed, 10 Aug 2011) | 1 line
Try exporting all.chromosomes by name to see if we avoid the S3 warning
------------------------------------------------------------------------
9. Clone the svn repository locally with
git-svn
$ git svn clone -r 59218:HEAD \
    --trunk=trunk/madman/Rpacks/xmapcore \
    --branches="branches/*/madman/Rpacks/xmapcore" \
    https://hedgehog.fhcrc.org/bioconductor xmapcore
Initialized empty Git repository in /Users/tyates/Code/R/xmapcore/.git/
A .BBSoptions
… snip …
r59218 = f9edd025ca1f9afbaa1b91e2e608ba599464be0a (refs/remotes/trunk)
M DESCRIPTION
r59920 = 3fa20b8d69eb6763104e910db3cb10f83df91902 (refs/remotes/trunk)
Found possible branch point: https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/xmapcore =>
https://hedgehog.fhcrc.org/bioconductor/branches/RELEASE_2_9/madman/Rpacks/xmapcore, 59921
Found branch parent: (refs/remotes/RELEASE_2_9) 3fa20b8d69eb6763104e910db3cb10f83df91902
Following parent with do_switch
Successfully followed parent
r59922 = f201f2dee6f7c565c612d149836f6f6e0252c8fd (refs/remotes/RELEASE_2_9)
M DESCRIPTION
r59924 = fa0b466921e154e8fb60e9ff3093e652adae095f (refs/remotes/trunk)
Checked out HEAD:
https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/xmapcore r59924
From Sean Davis’ blog: http://watson.nci.nih.gov/~sdavis/blog/git-svn_for_bioconductor_repository/
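Because the Bioconductor layout is not the standard one, the --trunk and --branches paths have to be spelled out per package. A small sketch of a helper that only prints the command for a given package and starting revision (it does not contact the server) might look like this; the function name bioc_clone_cmd is hypothetical, and the URL and path pattern are copied from the clone command on this slide.

```shell
# Hypothetical helper: print (not run) the git svn clone command for one
# Bioconductor package, starting from a given SVN revision.
bioc_clone_cmd() {
  pkg=$1
  rev=$2
  printf 'git svn clone -r %s:HEAD --trunk=trunk/madman/Rpacks/%s --branches="branches/*/madman/Rpacks/%s" https://hedgehog.fhcrc.org/bioconductor %s\n' \
    "$rev" "$pkg" "$pkg" "$pkg"
}

bioc_clone_cmd xmapcore 59218
```

Printing the command first makes it easy to check the revision and paths before starting a clone that may take a long time.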
10. This took 12 minutes to clone three
revisions to xmapcore!
11. The two core git-svn commands
Check for remote svn changes
git svn rebase
• Rewinds your local git commits that have not yet been committed to SVN
• Fetches all upstream SVN commits from the server
• Reapplies your git commits in order
• Roughly the equivalent of ‘svn update’
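The rewind, fetch and reapply sequence can be seen without an SVN server by using plain git rebase against a local bare repository. This is an analogy only, not git-svn itself: the bare repository stands in for the SVN server, and the clone names alice and bob are invented for the sketch.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Identity for every repository in this demo (hypothetical values)
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# The bare repository plays the role of the remote server
git init -q --bare server.git
git --git-dir=server.git symbolic-ref HEAD refs/heads/master

# First clone publishes a base commit
git clone -q server.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/master
echo base > file.txt
git add file.txt
git commit -q -m "Base"
git push -q origin master
cd ..

# Second clone makes a local commit that is not yet published
git clone -q server.git bob
cd bob
echo "bob local" > bob.txt
git add bob.txt
git commit -q -m "Local work"
cd ../alice

# Meanwhile an upstream change lands on the server
echo upstream > up.txt
git add up.txt
git commit -q -m "Upstream change"
git push -q origin master
cd ../bob

# Fetch upstream, rewind the local commit, reapply it on top
git fetch -q origin
git rebase -q origin/master
rebased_log=$(git log --format=%s)
echo "$rebased_log"
```

The log ends up linear, with the local commit replayed on top of the upstream change, which is the shape of history SVN expects.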
12. Push local changes to remote svn
git svn dcommit
• Same as ‘svn commit’, but will use messages from your
local git commits.
13. When bioconductor updates
• Update your local git-svn repo with:
git svn fetch
• And simply track the new branch when a new
bioconductor release comes out
git checkout -b local_2_9 RELEASE_2_9
From Sean Davis’ blog: http://watson.nci.nih.gov/~sdavis/blog/git-svn_for_bioconductor_repository/
14. Example project rbioc2011
• A very basic bioconductor package
• Has one function `duplicate` which returns a
vector containing two copies of the parameter
`x`
• Contains unit tests
15. ...
Depends: R (>= 2.0), methods
Suggests: RUnit
...
TOP=../..
PKG=${shell cd ${TOP};pwd}
SUITE=doRUnit.R
R=${R_HOME}/bin/R

all: inst test

inst: # Install package
	cd ${TOP}/..;\
	${R} CMD INSTALL ${PKG}

test: # Run unit tests
	export RCMDCHECK=FALSE;\
	export RUNITFILEPATTERN="$(file)";\
	export RUNITFUNCPATTERN="$(func)";\
	cd ${TOP}/tests;\
	${R} --vanilla --slave < ${SUITE}
# .setUp is called before each test method
.setUp = function() {}
# .tearDown is called after each test method
.tearDown = function() { }
# An example test
test.duplicate = function() {
a = duplicate( 'tim' )
checkEquals( a,
c( 'tim', 'tim' ),
"Should be the string tim repeated twice" )
}
duplicate = function( x ) {
c( x, x )
}
16. Two ways of running tests
$ cd inst/unitTests
$ export R_HOME=/usr
$ make
...snip...
rbioc2011 unit testing - 1 test functions, 0 errors, 0 failures
$ R CMD build rbioc2011
$ R CMD check rbioc2011_1.1.tar.gz
...snip...
* checking tests ...
Running ‘doRUnit.R’
OK
* checking for unstated dependencies in vignettes ... OK
...snip...
17. Demo
$ git svn clone -s https://rbioc2011.googlecode.com/svn rbioc2011
Initialized empty Git repository in /Users/tyates/Code/R/gitsvndemo/git/rbioc2011/.git/
r1 = 329badc22dbfda318ed91d56f70baf06d632ae52 (refs/remotes/trunk)
A R/duplicate.R
A R/zzz.R
A tests/doRUnit.R
A DESCRIPTION
A man/rbioc2011-package.Rd
A man/duplicate.Rd
A NAMESPACE
A inst/unitTests/runit.duplicate.R
A inst/unitTests/Makefile
A inst/doc/Rbioc2011.Rnw
r2 = 0e51d0aa048fc55b8bcb0191ef0437585e97bac9 (refs/remotes/trunk)
Found possible branch point: https://rbioc2011.googlecode.com/svn/trunk =>
https://rbioc2011.googlecode.com/svn/branches/RELEASE_2_9, 2
Found branch parent: (refs/remotes/RELEASE_2_9) 0e51d0aa048fc55b8bcb0191ef0437585e97bac9
Following parent with do_switch
Successfully followed parent
r3 = 93493a0306a46be55e8227557f0d7f7f241f16b7 (refs/remotes/RELEASE_2_9)
Checked out HEAD:
https://rbioc2011.googlecode.com/svn/trunk r2
18. Set up local branch per remote branch
you want to track
$ cd rbioc2011
(master) $
$ git checkout -b local_2_9 RELEASE_2_9
Switched to a new branch 'local_2_9'
(local_2_9) $
19. Get an issue in our tracker
this needs to go in devel and the 2.9 branch as it is not a functionality change
20. Workflow for changes to both
branches
• Edit the file on local_2_9 branch
• Commit the change to git (and svn)
• Checkout the master branch
• Cherry-pick the commit into master
• Commit the master branch to svn
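The git side of this workflow can be tried in a throwaway repository. The sketch below leaves out the git svn rebase and git svn dcommit steps, since they need an SVN server, and uses invented file contents. Only the branch names and the RBIOC-1 issue key come from the slides.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

printf 'title: suplicate\n' > duplicate.Rd
git add duplicate.Rd
git commit -q -m "Initial import"

# Release branch starts from the same point as master
git branch local_2_9

# master moves on with development-only work
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Development-only change"

# Fix the typo on the release branch and commit it
git checkout -q local_2_9
printf 'title: duplicate\n' > duplicate.Rd
git commit -q -a -m "RBIOC-1 fix typo in man page title"
fix=$(git rev-parse HEAD)

# Switch to master and cherry-pick just that commit
git checkout -q master
git cherry-pick "$fix" > /dev/null

cat duplicate.Rd
```

The fix lands on both branches, while the development-only change stays on master and never reaches the release branch.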
21. (local_2_9) $ pico man/duplicate.Rd
(local_2_9*) $ git add man/duplicate.Rd
(local_2_9*) $ git commit -m "RBIOC-1 man page for duplicate has suplicate in the title"
[local_2_9 c7183c0] RBIOC-1 man page for duplicate has suplicate in the title
1 files changed, 1 insertions(+), 1 deletions(-)
(local_2_9) $ git svn rebase
Current branch local_2_9 is up to date.
(local_2_9) $ git svn dcommit
Committing to https://rbioc2011.googlecode.com/svn/branches/RELEASE_2_9 ...
Authentication realm: <https://rbioc2011.googlecode.com:443> Google Code Subversion Repository
Password for 'tim.yates@gmail.com':
M man/duplicate.Rd
Committed r5
M man/duplicate.Rd
r5 = 779f6daf2495fe7e83156e3d70ee627217a30da9 (refs/remotes/RELEASE_2_9)
No changes between current HEAD and refs/remotes/RELEASE_2_9
Resetting to the latest refs/remotes/RELEASE_2_9
(local_2_9) $ git checkout master
Switched to branch 'master'
(master) $ git cherry-pick c7183c0
[master 198cacb] RBIOC-1 man page for duplicate has suplicate in the title
1 files changed, 1 insertions(+), 1 deletions(-)
(master) $ git svn rebase
Current branch master is up to date.
(master) $ git svn dcommit
Committing to https://rbioc2011.googlecode.com/svn/trunk ...
M man/duplicate.Rd
Committed r6
M man/duplicate.Rd
r6 = a403a62a4d112f98f2e647ce7de619a53bfd1529 (refs/remotes/trunk)
No changes between current HEAD and refs/remotes/trunk
Resetting to the latest refs/remotes/trunk
22. Get an issue in our tracker
this just goes in devel, as it adds new functionality
23. Create a new branch off of master
(trunk) for the changes
(master) $ git checkout -b RBIOC-2
Switched to a new branch 'RBIOC-2'
(RBIOC-2) $
24. Add a test for our new functionality,
and commit
(RBIOC-2) $ pico inst/unitTests/runit.duplicate.R
# .setUp is called before each test method
.setUp = function() {}
# .tearDown is called after each test method
.tearDown = function() { }
# An example test
test.duplicate = function() {
a = duplicate( 'tim' )
checkEquals( a, c( 'tim', 'tim' ), "Should contain the string tim repeated twice" )
}
# Test for RBIOC-2
test.RBIOC2.duplicate = function() {
a = duplicate( 'tim', 3 )
checkEquals( a, c( 'tim', 'tim', 'tim' ), "Should be three tims after RBIOC-2" )
}
(RBIOC-2*) $ git commit -a -m "Added unit test for RBIOC-2"
[RBIOC-2 401c542] Added unit test for RBIOC-2
1 files changed, 7 insertions(+), 0 deletions(-)
25. Check our test fails
(RBIOC-2) $ cd inst/unitTests
(RBIOC-2) $ make
------------------- UNIT TEST SUMMARY ---------------------
RUNIT TEST PROTOCOL -- Tue Nov 29 14:29:07 2011
***********************************************
Number of test functions: 2
Number of errors: 1
Number of failures: 0
1 Test Suite :
rbioc2011 unit testing - 2 test functions, 1 error, 0 failures
ERROR in test.RBIOC2.duplicate: Error in duplicate("tim", 3) : unused argument(s) (3)
Error:
unit testing failed (#test failures: 0, #R errors: 1)
Execution halted
make: *** [test] Error 1
26. Write the code to make the test pass,
and commit
(RBIOC-2) $ pico R/duplicate.R
duplicate = function( x, n=2 ) {
  rep( x, n )
}
(RBIOC-2*) $ git commit -a -m "Added code for duplicate x, n RBIOC-2"
[RBIOC-2 8f8a4b8] Added code for duplicate x, n RBIOC-2
1 files changed, 2 insertions(+), 2 deletions(-)
27. Check our test passes
(RBIOC-2*) $ cd inst/unitTests
(RBIOC-2*) $ make
------------------- UNIT TEST SUMMARY ---------------------
RUNIT TEST PROTOCOL -- Tue Nov 29 14:43:02 2011
***********************************************
Number of test functions: 2
Number of errors: 0
Number of failures: 0
1 Test Suite :
rbioc2011 unit testing - 2 test functions, 0 errors, 0 failures
28. Write the documentation, and commit
(RBIOC-2) $ pico man/duplicate.Rd
...
\usage{
duplicate( x, n=2 )
}
\arguments{
  \item{x}{ The object to be duplicated. }
  \item{n}{ The number of times to duplicate \code{x} }
}
\details{
  \code{duplicate} returns \code{n} of object \code{x} in a vector.
}
...
(RBIOC-2*) $ git commit -a -m "Added documentation for new function RBIOC-2"
[RBIOC-2 d77d8b0] Added documentation for new function RBIOC-2
1 files changed, 3 insertions(+), 2 deletions(-)
29. Merge into master in an “svn friendly”
way
• Svn is not a ‘commit often’ VCS
• Broken commits are a Bad Thing
• 2 Options:
– git merge, then git rebase -i HEAD~3
– git merge --no-ff --log
Picture Credit Vincent Driessen
http://nvie.com/posts/a-successful-git-branching-model/
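The difference between the two shapes of history can be checked in a scratch repository. In this sketch, git merge --squash is swapped in as a non-interactive stand-in for what the interactive rebase on the slide appears to do (collapsing the feature commits into one). It is not the command from the talk, and the file names and messages are invented.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base"

# Three small commits on a feature branch: test, code, docs
git checkout -q -b RBIOC-2
for part in test code docs; do
  echo "$part" > "$part.txt"
  git add "$part.txt"
  git commit -q -m "Add $part for RBIOC-2"
done

# Option A: explicit merge commit that summarises the branch
git checkout -q master
git checkout -q -b option_no_ff
git merge -q --no-ff --log --no-edit RBIOC-2
no_ff_parents=$(git rev-list --parents -n 1 HEAD | awk '{print NF - 1}')

# Option B: collapse the branch into a single commit
git checkout -q master
git checkout -q -b option_squash
git merge --squash RBIOC-2 > /dev/null
git commit -q -m "RBIOC-2 duplicate takes n"
squash_commits=$(git rev-list --count master..HEAD)

echo "merge commit parents: $no_ff_parents"
echo "commits added by squash: $squash_commits"
```

Either way, the mainline gains one unit of change per feature, so the intermediate commit where the new test still fails does not need to appear as a separate revision on trunk.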
30. (RBIOC-2) $ git checkout master
Switched to branch 'master'
(master) $ git svn rebase
Current branch master is up to date.
(master) $ git merge --no-ff --log RBIOC-2
Merge made by recursive.
R/duplicate.R | 4 ++--
inst/unitTests/runit.duplicate.R | 7 +++++++
man/duplicate.Rd | 5 +++--
3 files changed, 12 insertions(+), 4 deletions(-)
(master) $ git svn dcommit
Committing to https://rbioc2011.googlecode.com/svn/trunk ...
M R/duplicate.R
M inst/unitTests/runit.duplicate.R
M man/duplicate.Rd
Committed r7
M R/duplicate.R
M man/duplicate.Rd
M inst/unitTests/runit.duplicate.R
r7 = 3a30f29de3ff1de46c4062daccfe08f8474ec951 (refs/remotes/trunk)
No changes between current HEAD and refs/remotes/trunk
Resetting to the latest refs/remotes/trunk
31. (master) $ git merge --no-ff --log RBIOC-2
Merge made by recursive.
R/duplicate.R | 4 ++--
inst/unitTests/runit.duplicate.R | 6 ++++++
man/duplicate.Rd | 5 +++--
3 files changed, 11 insertions(+), 4 deletions(-)
(master) $ git svn rebase
First, rewinding head to replay your work on top of it...
Applying: Added unit test for RBIOC-2
Applying: Added code for duplicate x, n RBIOC-2
Applying: Added documentation for new function RBIOC-2
(master) $ git rebase -i HEAD~3
[detached HEAD 1256efe] Merge for RBIOC-2 Unit test
3 files changed, 11 insertions(+), 4 deletions(-)
Successfully rebased and updated refs/heads/master.
(master) $ git reset --hard HEAD~3
HEAD is now at 13f5fe5 RBIOC-1 man page for duplicate has suplicate in the title
33. Take-home message
• Keep your SVN branches clean and “svn-ish”
• Unit tests tell you when you’re wrong
34. Q: Should we mirror Bioconductor
packages on github with an aim to
improving collaboration?
35. Resources
• Sean Davis blog on getting started with git-svn and Bioconductor
– http://watson.nci.nih.gov/~sdavis/blog/git-svn_for_bioconductor_repository/
• Demo SVN repository
– http://code.google.com/p/rbioc2011/
• Demo Github repository
– https://github.com/timyates/rbioc2011
• Git manuals
– http://book.git-scm.com/
– http://www-cs-students.stanford.edu/~blynn/gitmagic/
• Prompt customisation for bash/zsh/cygwin
– https://github.com/git/git/blob/master/contrib/completion/git-completion.bash
Editor's Notes
So, let’s look at how we would fetch a repository from Bioconductor
The three basic git-svn functions you’ll need
First, we need to decide how far back in time we want to look when cloning our repository
Imagine if you wanted to go further back. It can very quickly become an overnight process. I killed it a few times before leaving it running one evening and discovering the next morning that it had worked
I’ve slightly customised the standard Makefile and doRUnit.R file so that I can define a file to run, and optionally a function pattern
When using git-svn, run R CMD build first, as R CMD check complains about the .git folder