Talk given at BalCCon 2013 by Vlatko Kosturjak: Wonderful world of (distributed) SCM or VCS. Ripping and extracting useful info from CVS, Subversion (SVN) and Git repositories publicly exposed on the web.
2. Agenda
● Not covered
● Philosophical issues
● Finding code
● Old school SCM
● New school SCM
● How to get the source when it's not open source
● Questions and Answers
75 minutes
3. Disclaimer
● This is a work of pure fiction
● Any resemblance to anyone, living or dead, is purely
coincidental
● The characters are fictional and of my own creation
● The place, time and incidents are purely fictional
● I don't take any responsibility for your actions; consider
the ethical and legal issues of your actions yourself!
● Look closer - I'm also virtual! :)
4. That source control management is
really really great...
● Versioning
● Blame
● Undo
● Collaboration
● Code review
● Sign off
● Integration
● ...
6. First rule
● If sensitive
● Don't put source code on internet
● Don't put SCM files on the internet
● Don't put sensitive parts in web root
● Don't...
● Don't...
● Don't...
7. Search for specific phrase, file,
function or class
● Just google for it! ;)
● Internet does not forget! ;)
● Instructions
● strings <binary>
● Google the output
@alexsotirov on 4th of Jul 2010:
It's amazing what you can find on
random Chinese sites if you start
googling internal strings from closed-
source applications
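A minimal sketch of the first instruction, for cases where the strings utility is not at hand. It is only an approximation of that tool: it pulls runs of printable ASCII out of raw bytes, and the function name and the minimum length of 4 are assumptions made for illustration.

```python
import re
from typing import List


def printable_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    # 0x20-0x7e is the printable ASCII range (space through tilde).
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]


# Usage (hypothetical file name):
#   with open("application.bin", "rb") as handle:
#       for text in printable_strings(handle.read()):
#           print(text)
```

Distinctive results (internal function names, error messages, build paths) are the phrases worth searching for.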
8. How about configs in repos?
● Software.conf vs Software.conf-dist
● Software.conf
● More dangerous
● Danger of accidentally committing sensitive info
● Software.conf-dist
● Less dangerous
● Still watch out for wildcards “*”
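The wildcard warning can be illustrated with shell-style matching. This is a sketch under an assumption: the pattern Software.conf* stands in for whatever over-broad pattern someone might pass to an add or deploy command. It was meant to pick up the -dist template, but it matches the real config as well.

```python
from fnmatch import fnmatchcase
from typing import List


def files_matching(pattern: str, names: List[str]) -> List[str]:
    """Return the names selected by a shell-style wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical working directory contents.
names = ["Software.conf", "Software.conf-dist", "README"]

# The over-broad pattern also selects the file that holds real credentials.
print(files_matching("Software.conf*", names))
# A precise pattern selects only the template.
print(files_matching("Software.conf-dist", names))
```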
11. CVS
● Concurrent Versions System
● CVS
● Entries
● Entries.Log
● Repository
● Root
● Finding repository source
● Profit if it is Internet accessible
13. What can be extracted?
● Artifacts
● Repository location
● Name of hidden files
– If present in repository
● Repository user
● Just enough for password guessing if online
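A rough sketch of how those artifacts could be read out of downloaded CVS/Root and CVS/Entries files. It assumes the common layouts (a root such as :pserver:user@host:/path, and Entries lines such as /name/revision/timestamp/options/tag, with D/name//// for directories); the function names and the sample values are illustrative only.

```python
import re
from typing import Dict, List, Optional, Tuple

ROOT_RE = re.compile(
    r"^:(?P<method>[^:]+):"             # access method, e.g. pserver or ext
    r"(?:(?P<user>[^@:]+)(?::[^@]*)?@)?"  # optional user (and password)
    r"(?P<host>[^:/]+):?(?P<port>\d*)"  # host and optional port
    r"(?P<path>/.*)$"                   # repository path on the server
)


def parse_root(text: str) -> Optional[Dict[str, str]]:
    """Split a CVS/Root line into method, user, host, port and path."""
    match = ROOT_RE.match(text.strip())
    return match.groupdict() if match else None


def parse_entries(text: str) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return ([(file name, revision)], [directory names]) from CVS/Entries."""
    files, dirs = [], []
    for line in text.splitlines():
        parts = line.split("/")
        if line.startswith("D/") and len(parts) > 1:
            dirs.append(parts[1])
        elif line.startswith("/") and len(parts) > 2:
            files.append((parts[1], parts[2]))
    return files, dirs
```

The user and host from Root are the "repository user" and "repository location" above; the names from Entries include files that are not linked from anywhere on the site.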
19. SVN client 1.7+
● No more .svn directories all around
● Single .svn (just like git!)
● Different format
● Incompatible, of course ;)
● Different files
● wc.db – SQLite database
20. SVN client 1.7+ extraction
● Much easier
● Much faster
● Much more robust
● No more problems extracting interpreted files
– Like PHP
● Thank you SVN developers! ;)
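A sketch of why extraction gets easier with the single .svn layout: once wc.db is downloaded, one query lists every tracked file together with the checksum that names its pristine copy. The sketch assumes the usual working copy schema (a NODES table with local_relpath, kind and checksum columns, checksums stored as $sha1$<hex>, pristine copies under .svn/pristine/<first two hex digits>/<hex>.svn-base); treat those details as assumptions and verify them against a real wc.db.

```python
import sqlite3
from typing import Dict


def pristine_paths(conn: sqlite3.Connection) -> Dict[str, str]:
    """Map each tracked file to the relative path of its pristine copy."""
    rows = conn.execute(
        "SELECT local_relpath, checksum FROM NODES "
        "WHERE kind = 'file' AND checksum IS NOT NULL"
    )
    result = {}
    for relpath, checksum in rows:
        sha1 = checksum.split("$")[-1]  # '$sha1$<hex>' -> '<hex>'
        result[relpath] = ".svn/pristine/%s/%s.svn-base" % (sha1[:2], sha1)
    return result


# Usage (hypothetical local copy of a downloaded database):
#   conn = sqlite3.connect("wc.db")
#   for relpath, url_path in pristine_paths(conn).items():
#       print(relpath, "<-", url_path)
```

Pristine copies are served as static .svn-base files, which is why interpreted files such as PHP come back as source instead of being executed.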
21. Protection
● Make it open source ;)
● Remove SCM files if not needed
● Web server configuration
● Web deployment automation controls
● ...
22. Apache (main configuration file)
● 403 – Forbidden – Move along nothing to see
<DirectoryMatch "/\.svn">
Order allow,deny
Deny from all
</DirectoryMatch>
● 404 – Not found – Pick somewhere else
AliasMatch "/\.svn" /non-existent-page
23. Apache (.htaccess)
● Using mod_rewrite
RewriteEngine On
RewriteRule (^|/)\.svn(/|$) /non-existent-404-page
<IfModule autoindex_module>
IndexIgnore .svn
</IfModule>
29. Git: many ways...
● Find archive of SCM
● Bruteforce SHA1
● Bandwidth
● Time
● Partial SHA1 visible
● different files
● There must be a way...
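A back-of-the-envelope calculation behind the "Time" bullet. Git object names are 40 hex digits, so blind guessing means walking a 160-bit space; the request rate below is a hypothetical figure chosen only to show the scale.

```python
SHA1_HEX_DIGITS = 40
KEYSPACE = 16 ** SHA1_HEX_DIGITS  # number of possible object names (2**160)
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_enumerate(requests_per_second: int) -> float:
    """Years needed to request every possible object name once."""
    return KEYSPACE / (requests_per_second * SECONDS_PER_YEAR)


# Hypothetical rate of 10,000 HTTP requests per second.
print("%.1e years" % years_to_enumerate(10_000))
```

Even at that rate the result is far beyond any practical time frame, so the object names have to come from somewhere else.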
30. Zombie mode on
I MUST GET THE SOURCE
I MUST GET THE SOURCE
I MUST GET THE SOURCE
I MUST GET THE SOURCE
I MUST GET THE SOURCE
I MUST GET THE SOURCE
I MUST GET THE SOURCE
I MUST GET THE SOURCE
I MUST GET THE SOURCE
I MUST GET THE SOURCE
I MUST GET THE SOURCE
I MUST GET THE SOURCE
I MUST GET THE SOURCE
…
31. DVCS-Pillage
● It will rip the .git files when directory browsing
is disabled
● By Adam Baldwin
● Accessible from URL:
● https://github.com/evilpacket/DVCS-Pillage
● Has a few problems
● Hmm...
32. Problems...
● Current methods
● No method downloads the complete tree
– Packed refs
– git ls-files --stage method
● No support for branches
● No support for protocols other than HTTP
● Slooow...
● Hmmm
● Want whole tree / files
● Branches
● Support old protocols
● Bruteforcing not feasible
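Branch support starts with knowing which refs exist. A small sketch, assuming a downloaded .git/packed-refs file in its usual text form (one "<sha1> <refname>" pair per line, "#" comment lines, "^" lines for peeled tags); the function names are illustrative and this is not taken from either tool.

```python
from typing import Dict, List


def parse_packed_refs(text: str) -> Dict[str, str]:
    """Return {ref name: commit sha1} from the contents of packed-refs."""
    refs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, the header comment and peeled-tag lines.
        if not line or line.startswith("#") or line.startswith("^"):
            continue
        sha, _, name = line.partition(" ")
        if len(sha) == 40 and name:
            refs[name] = sha
    return refs


def branch_names(refs: Dict[str, str]) -> List[str]:
    """Return local branch names found among the refs."""
    prefix = "refs/heads/"
    return sorted(name[len(prefix):] for name in refs if name.startswith(prefix))
```

Each sha1 found this way is a starting commit from which the rest of that branch can be fetched. Refs that were never packed live as individual files under .git/refs/ and are not covered by this sketch.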
33. Zombie mode on
I MUST GET THE FULL SOURCE
I MUST GET THE FULL SOURCE
I MUST GET THE FULL SOURCE
I MUST GET THE FULL SOURCE
I MUST GET THE FULL SOURCE
I MUST GET THE FULL SOURCE
I MUST GET THE FULL SOURCE
I MUST GET THE FULL SOURCE
I MUST GET THE FULL SOURCE
I MUST GET THE FULL SOURCE
I MUST GET THE FULL SOURCE
I MUST GET THE FULL SOURCE
I MUST GET THE FULL SOURCE
I MUST GET THE FULL SOURCE
...
35. Solution is...
● RTFM
● git fsck
– it reports which SHA-1 objects are missing
– No partial recovery
● Time to code my own tool
● Want whole tree
● Branches
● Support all protocols
● FAST!!
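The git fsck idea can be sketched as a loop: run git fsck on the partial copy, collect the object names it reports as missing, download exactly those, repeat until nothing is missing. The sketch below covers only the parsing and URL-building steps. It assumes fsck prints lines of the form "missing <type> <sha1>" and that loose objects are served at objects/<first two hex digits>/<remaining 38>; it is an illustration of the idea, not the code of any particular tool.

```python
import re
from typing import List

MISSING_RE = re.compile(r"^missing (blob|tree|commit|tag) ([0-9a-f]{40})$")


def missing_objects(fsck_output: str) -> List[str]:
    """Return the object names that git fsck reports as missing."""
    found = []
    for line in fsck_output.splitlines():
        match = MISSING_RE.match(line.strip())
        if match:
            found.append(match.group(2))
    return found


def object_url(git_url: str, sha1: str) -> str:
    """Build the URL of a loose object below an exposed .git/ directory."""
    return "%s/objects/%s/%s" % (git_url.rstrip("/"), sha1[:2], sha1[2:])
```

Objects that exist only inside pack files cannot be fetched this way and need separate handling.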
36. DVCS-rip
● It will rip the .git files when directory browsing is disabled
● It will rip ALL files and checkout repository for you
● Not partial
● git fsck trick
● Support for
● Branches
● Any protocol (http/https/...)
● Accessible from URL:
● https://github.com/kost/dvcs-ripper
37. DVCS-rip
● How to run?
● Example run:
● rip-git.pl -v -u http://www.example.com/.git/
● It will automatically do "git checkout -f"
● Profit!
38. Protection
● Make it open source ;)
● Remove SCM files if not needed
● Web server configuration
● Web deployment automation controls
● ...
39. Apache (main configuration file)
● 403 – Forbidden – Move along nothing to see
<DirectoryMatch "/\.git">
Order allow,deny
Deny from all
</DirectoryMatch>
● 404 – Not found – Pick somewhere else
AliasMatch "/\.git" /non-existent-page
40. Apache (.htaccess)
● Using mod_rewrite
RewriteEngine On
RewriteRule (^|/)\.git(/|$) /non-existent-404-page
<IfModule autoindex_module>
IndexIgnore .git
</IfModule>
41. How about others?
● Mercurial
● Bazaar
● Check out DVCS-Pillage
● It will handle git, hg and bzr
● Accessible from URL:
– https://github.com/evilpacket/DVCS-Pillage
42. No tool available to detect
● Most of the web/network scanners will not find this
● No awareness
● Tools only look for this
● .git/ => 403
● They should actually look for
● .git/logs/HEAD => 200
● .git/config => 200
● .git/index => 200
● ...
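A minimal sketch of the check scanners should be doing: ignore the status of .git/ itself and request the well-known files inside it. The three paths are the ones listed above; the function names are illustrative, and the status lookup is injectable so the logic can be tried without touching a network.

```python
from typing import Callable, List, Optional
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

GIT_PROBE_PATHS = [".git/logs/HEAD", ".git/config", ".git/index"]


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if unreachable."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code
    except URLError:
        return None


def exposed_git_files(
    site_url: str,
    get_status: Callable[[str], Optional[int]] = http_status,
) -> List[str]:
    """Return the probe paths that answer 200 below site_url."""
    base = site_url.rstrip("/") + "/"
    return [path for path in GIT_PROBE_PATHS if get_status(base + path) == 200]


# Usage against a site you are authorized to test (hypothetical host):
#   print(exposed_git_files("http://www.example.com"))
```

A non-empty result means the repository can be ripped even though .git/ answers 403.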
43. Nmap NSE comes to the rescue
● Have to use the latest Nmap version
● Script is not in 6.01
● It was broken in some previous Nmap versions
● It looks for all relevant git files
● .git/logs/HEAD
● .git/config
● ...
● nmap -sS -PS80,81,443,8080,8081 -p80,81,443,8080,8081 --script=http-git <target>
PORT STATE SERVICE
80/tcp open http
| http-git:
| Potential Git repository found at XX.XX.XX.XX:XX/.git/ (found 5 of 6 expected files)
46. Google dorks
● ".git" intitle:"index of"
● ".svn" intitle:"index of"
● "CVS" intitle:"index of"
● ".hg" intitle:"index of"
● ".bzr" intitle:"index of"
● … (I guess you get the idea already)...
47. Searching for standard interfaces
● Interfaces
● Redmine
● ViewVC
● ViewCVS
● Gitweb
● ...
● Google Dorks
● "Powered by ViewVC"
● Bing as well...
48. Recommendations for developers
● Do not store passwords and API keys in SCM
● config.php vs config.php-dist
● Do not store sensitive info in SCM
● Separate test and production data
● Being paranoid is a good feeling
49. Recommendations for system
administrators
● Proactively forbid serving all SCM files on web
servers
● Periodically check for standard SCM directories,
e.g.:
● find /web -name .svn
● find /web -name .git
● wget http://www.site.com/svn/
● Is there any need to have source code available at
all?
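The find commands above cover .svn and .git; a periodic check can cover the other SCMs from this talk in one pass. A small sketch, assuming the web root is a local directory; the set of directory names and the function name are choices made for illustration.

```python
import os
from typing import List

SCM_DIR_NAMES = {".svn", ".git", ".hg", ".bzr", "CVS"}


def find_scm_dirs(web_root: str) -> List[str]:
    """Return paths of SCM metadata directories found below web_root."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(web_root):
        for name in dirnames:
            if name in SCM_DIR_NAMES:
                found.append(os.path.join(dirpath, name))
        # Do not descend into the metadata directories themselves.
        dirnames[:] = [d for d in dirnames if d not in SCM_DIR_NAMES]
    return sorted(found)


# Usage (the /web path is taken from the find examples above):
#   for path in find_scm_dirs("/web"):
#       print("SCM metadata inside web root:", path)
```

Running this from cron and alerting on any output gives a cheap safety net next to the web server rules.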
50. Recommendations for management
and auditors
● Ask how source code management is done
● Ask what security controls are there to protect
source code
● What controls are there to protect against source code
leaks?
● What controls are there to protect against password and
key leaks?
● What controls are there to protect sensitive
information in source code and configurations?