3. The Context
• A large library of CPAN distributions
- In a local::lib style dir .../cpan-5.008/{man,bin,lib}/
- Installed over many years
- No external record of what has been installed
- Almost 5000 modules
- In production in many systems on many machines
4. The Itch
• Want to upgrade from perl 5.8
- so need to clone our local library of CPAN modules
- to .../cpan-5.012/{man,bin,lib}/
- with recompiled perl extensions
• Want the exact set of distribution versions
- so that, when testing, “nothing but perl changed”
6. Innocence and Hope
• Vague memory of something called ‘packlists’
• Vague memory of perllocal.pod install log
• Vague memory of some work by brian d foy
• Usual hope that someone’s already done this
• “How hard can it be?”
7. .packlist
• Records only what files were installed
• Doesn’t record the origin distribution
• Useless for my needs
8. what_dists.pl
• Chris Williams’s github.com/bingos/throwaway
• Matches installed modules to distributions
• Only matches to the latest distributions
• Looked like a good place to start
• I hacked it to use perllocal.pod data and a
bunch of heuristics.
• It worked, mostly. Annoying edge cases.
• Lots of hacks, heuristics, and blind luck.
9. perllocal.pod
• Records a “name” and “version”
• Name is the Makefile.PL NAME
- can be the module or distribution name
- or something else entirely
• Version is the Makefile.PL VERSION
- not always the version in the distribution filename
• Incomplete!
- Not written by Module::Build based distributions
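A minimal sketch of pulling the “name” and “version” pairs out of perllocal.pod, assuming the entry layout that ExtUtils::MakeMaker appends (a `=head2` line naming the module, followed by `C<VERSION: ...>` items). The sample entry, its path, and the function name `parse_perllocal` are made up for illustration; real files may contain variations this does not handle.

```python
import re

# Illustrative entry in the shape MakeMaker writes; the values are invented.
SAMPLE = """\
=head2 Tue Mar  2 10:15:00 2010: C<Module> L<DBI|DBI>

=over 4

=item *

C<installed into: /opt/cpan-5.008/lib>

=item *

C<LINKTYPE: dynamic>

=item *

C<VERSION: 1.609>

=item *

C<EXE_FILES: dbiproxy dbiprof>

=back

"""

# The heading carries the Makefile.PL NAME inside L<name|name>.
HEAD_RE = re.compile(r"^=head2 .*?: C<Module> L<([^|>]+)\|[^>]*>", re.M)
# The version is one of the C<...> items that follow the heading.
VERSION_RE = re.compile(r"^C<VERSION: ([^>]*)>", re.M)


def parse_perllocal(text):
    """Return a list of (name, version) pairs, one per install entry."""
    entries = []
    heads = list(HEAD_RE.finditer(text))
    for i, head in enumerate(heads):
        # An entry's body runs up to the next heading (or end of file).
        end = heads[i + 1].start() if i + 1 < len(heads) else len(text)
        body = text[head.end():end]
        match = VERSION_RE.search(body)
        entries.append((head.group(1), match.group(1) if match else None))
    return entries


entries = parse_perllocal(SAMPLE)
```

The name recovered this way is only the Makefile.PL NAME, so it still has to be mapped to a distribution by some other means.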
10. BackPAN::Version::Discover
• “Figure out exactly which dist versions you
have installed”
• Based on BackPAN::Index
• Incomplete and “very alpha”
• Matching logic not very robust
• Just doesn’t work very well for us
11. DPAN
• “start with an existing Perl distribution and
work backward to the MiniCPAN that would
re-install the same thing” - brian d foy
• Indexes MD5 and other metadata for all
BackPAN modules and scripts
• Incomplete: doesn’t yet work out what
distribution versions are installed.
12. GitPAN
• Git repo for every distribution on CPAN
• Includes all distro versions on BackPAN
• Pondered using git hashes and the github API
• But GitPAN isn’t being maintained
15. MetaCPAN
• Repository for CPAN metadata
- ElasticSearch distributed database (Lucene)
- RESTful API
• CPAN and entire BackPAN fully indexed
• Very detailed metadata
• Full Of Awesome
16. MetaCPAN
• Find all releases that contain a particular
version of a module:
  curl -XPOST api.metacpan.org/v0/file/_search -d '{
    "query": { "filtered": {
      "query": { "match_all": {} },
      "filter": { "and": [
        { "term": { "file.module.name": "DBI::Profile" } },
        { "term": { "file.module.version": "2.014123" } }
      ]}
    }},
    "fields": ["release"]
  }'
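The same request body can be built programmatically, so that one query is generated per installed module. This is a sketch only: the helper name `release_query` is hypothetical, and the query shape simply mirrors the curl example above. It does not send anything over the network and makes no claim about what the API accepts today.

```python
import json


def release_query(module_name, module_version):
    """Build the search body shown in the curl example for one module/version."""
    return {
        "query": {"filtered": {
            "query": {"match_all": {}},
            "filter": {"and": [
                {"term": {"file.module.name": module_name}},
                {"term": {"file.module.version": module_version}},
            ]},
        }},
        "fields": ["release"],
    }


# Serialise to the JSON string that would be passed as the POST body.
body = json.dumps(release_query("DBI::Profile", "2.014123"))
```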
18. The Method
• Get installed module names, versions, file sizes
• For every module:
- find “candidate distributions” that included that
module version, ideally also matching the file size.
• For every candidate distribution:
- get all modules and versions shipped in that distro
- score each candidate by the proportion of its
modules and versions which match what’s installed
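The scoring step can be sketched as follows. This is an illustration under simple assumptions, not the script's actual code: installed modules and each candidate's contents are plain `{module: version}` dictionaries, file sizes are ignored, and the distribution and module names are hypothetical.

```python
def score_candidates(installed, candidates):
    """Rank candidate distributions by the fraction of their modules
    whose version matches what is installed.

    installed:  {module_name: version}
    candidates: {dist_name: {module_name: version}}
    Returns a list of (dist_name, score) pairs, best score first.
    """
    scores = {}
    for dist, shipped in candidates.items():
        if not shipped:
            continue  # nothing to compare against
        matched = sum(
            1 for module, version in shipped.items()
            if installed.get(module) == version
        )
        scores[dist] = matched / len(shipped)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: two releases of one distribution compete.
installed = {"Foo": "1.00", "Foo::Util": "0.50"}
candidates = {
    "Foo-1.00": {"Foo": "1.00", "Foo::Util": "0.50"},
    "Foo-1.01": {"Foo": "1.01", "Foo::Util": "0.50"},
}
ranked = score_candidates(installed, candidates)
```

Here `Foo-1.00` scores 1.0 and `Foo-1.01` scores 0.5, so the older release is the better explanation of what is on disk.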
21. Cloning From The List
• Can’t simply feed results to cpanm
- It’ll fetch the latest version of any prereqs
• Tried to put the list in dependency order
• Tried to use MiniCPAN::Inject
• Finally added a --makecpan dir option
- Fetches distro tarballs and writes index
- can be used as CPAN repo by cpanm
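A rough sketch of the “writes index” step, assuming the conventional CPAN mirror layout in which the package index lives at `modules/02packages.details.txt.gz` and each row lists a package name, a version, and a tarball path relative to `authors/id/`. The function name, the reduced header, and the sample rows are assumptions for illustration; the real script may write more header fields.

```python
import gzip
import os


def write_packages_index(repo_dir, packages):
    """Write a minimal package index under repo_dir.

    packages: iterable of (module, version, dist_path) tuples, where
    dist_path looks like 'A/AU/AUTHOR/Foo-1.00.tar.gz' (hypothetical).
    Returns the path of the file written.
    """
    rows = sorted(packages, key=lambda row: row[0])
    lines = [
        "File:         02packages.details.txt",
        "Description:  Package names found in directory $CPAN/authors/id/",
        "Columns:      package name, version, path",
        "Line-Count:   %d" % len(rows),
        "",  # a blank line separates the header from the rows
    ]
    for module, version, dist_path in rows:
        # Modules without a version are conventionally listed as 'undef'.
        shown = version if version is not None else "undef"
        lines.append("%-40s %-10s %s" % (module, shown, dist_path))

    index_dir = os.path.join(repo_dir, "modules")
    os.makedirs(index_dir, exist_ok=True)
    index_path = os.path.join(index_dir, "02packages.details.txt.gz")
    with gzip.open(index_path, "wt", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return index_path
```

The tarballs themselves would sit under `authors/id/` at the paths named in the index, which is what lets a client resolve every prerequisite to the pinned version rather than the latest one.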
22. Typical Usage
Survey what distributions are installed in a library:
$ dist_surveyor.pl --makecpan my_cpan \
    /a/perl/lib/dir > installed_dists.txt
Install exactly those distributions in a new library:
$ cpanm --mirror file:$PWD/my_cpan --mirror-only \
    -l new_lib < installed_dists.txt
Bonus: re-tests all distros with current prereqs
23. Status
• Currently a single script
• Ought to be turned into a module
• Looking for a maintainer