Finding out about the preservation of e-journals: the PEPRS Project
1. Finding out about the preservation of e-journals: the PEPRS Project
Piloting an E-journals Preservation Registry Service
Fred Guy, Project Manager, EDINA,
University of Edinburgh
f.guy@ed.ac.uk
Internet Librarian International Conference 2010
15th October 2010
2. So What’s the Problem with E-journals?
• 96.1% of Science journals are online
• 86.5% of Arts and Humanities are online
• 2006-2007 – 102,000,000 downloads
– Up 21% from previous year
• 17% usage is at the weekend
Source: E-journals: their use, value and impact. Research Information Network, UK, April 2009.
3. Trends in the finances of UK higher education libraries: 1999-2009. RIN 2010, p. 17.
4. Why Worry About Digital Preservation?
• Worries that all that is now digital may not always
be available, for a variety of reasons.
* Publisher ceases publication with no transfer
* Publisher goes out of business with no transfer
* Publisher taken over
5. Legal Deposit
• Works well with print via legislation and national
libraries.
• Countries with legislation enacted (or ‘in train’)
for e-materials include: Canada, Denmark,
Finland, France, Germany, Iceland, New
Zealand, Norway, South Africa, Sweden, UK
• But not all countries have such legislation (notably the USA), and in the UK the legislation supports voluntary deposit, with restrictions on mode of access
6. Why a Preservation Registry?
• Many schemes emerging to meet challenge
• But who is doing what?
– How can libraries & policy-makers assess which e-journals are being archived, by what methods, and under what terms of access?
• JISC commissioned a scoping study for an
e-journals preservation registry
– the idea had been mentioned in the literature
8. Scoping Study Report Precedes PEPRS
• Rightscom / Loughborough University, 2007
– Confirmed expressed need among libraries and
policy makers
– Warned of potential burden on digital
preservation agencies
– Recommended:
* an e-journals preservation registry should be built
* UK Union Catalogue of Serials (SUNCAT)
or SHERPA (Open Access) get involved
– SUNCAT is hosted and managed at EDINA
9. PROJECT DETAILS
• Phase 1 funded by JISC (Preservation
Programme) from August 2008 – July 2010
• EDINA, University of Edinburgh, grant
recipient
• Project partner – ISSN International Centre,
Paris
• Evaluation carried out by Charles Beagrie
Limited for the JISC in February 2010
10. Digital Preservation Agencies in the Pilot
* Two third-party organisations
– CLOCKSS (Controlled Lots Of Copies Keep Stuff Safe)
– Portico
* Two national libraries (cf. legal deposit)
– British Library (BL)
British Library e-Journal Digital Archive
– Koninklijke Bibliotheek (KB e-Depot)
KB, National Library of the Netherlands
* One library cooperative
– UK LOCKSS Alliance (LOCKSS: Lots Of Copies Keep Stuff Safe)
11. Data from the agencies
• e-Depot: XML
• UK LOCKSS: sourceforge.net + spreadsheet
• CLOCKSS: sourceforge.net + spreadsheet
• Portico: spreadsheet
• A Perl script parses the data from the agencies
• The parsed data and the ISSN Register feed the PEPRS Database
12. ISSN Register - steps
• Step 1. Extract a Register record for each record from an agency
• Step 2. Take the ISSN-L from each record
• Step 3. Parse the Register to map from the
ISSN-L to the associated ISSNs
• Step 4. Load the records into a PEPRS
database and link using the ISSN-L to the
table with the records from the agencies.
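The four steps above can be sketched roughly as follows. This is a minimal in-memory illustration, not the project's Perl script: the ISSNs are made-up placeholders, and the row layouts and function names (`build_issn_to_issnl`, `build_issnl_to_issns`, `link_agency_records`) are assumptions introduced for the example.

```python
from collections import defaultdict

# Hypothetical extract of the ISSN Register as (ISSN, ISSN-L) pairs.
# All ISSNs here are placeholders, not real titles.
REGISTER_ROWS = [
    ("1111-1111", "1111-1111"),  # print ISSN, also the linking ISSN
    ("2222-2222", "1111-1111"),  # electronic ISSN of the same title
    ("3333-3333", "3333-3333"),
]

# Hypothetical agency records as (agency, ISSN supplied by the agency).
AGENCY_ROWS = [
    ("Portico", "2222-2222"),
    ("e-Depot", "1111-1111"),
    ("CLOCKSS", "3333-3333"),
    ("UK LOCKSS Alliance", "9999-9999"),  # ISSN not found in the Register
]


def build_issn_to_issnl(register_rows):
    """Steps 1-2: look up the ISSN-L for any ISSN in the Register."""
    return {issn: issnl for issn, issnl in register_rows}


def build_issnl_to_issns(register_rows):
    """Step 3: map each ISSN-L to all of its associated ISSNs."""
    mapping = defaultdict(set)
    for issn, issnl in register_rows:
        mapping[issnl].add(issn)
    return dict(mapping)


def link_agency_records(agency_rows, issn_to_issnl):
    """Step 4: group agency records under their ISSN-L.

    Records whose ISSN is not in the Register are returned separately
    so they can be reviewed rather than silently dropped.
    """
    linked = defaultdict(list)
    unmatched = []
    for agency, issn in agency_rows:
        issnl = issn_to_issnl.get(issn)
        if issnl is None:
            unmatched.append((agency, issn))
        else:
            linked[issnl].append(agency)
    return dict(linked), unmatched


issn_to_issnl = build_issn_to_issnl(REGISTER_ROWS)
issnl_to_issns = build_issnl_to_issns(REGISTER_ROWS)
linked, unmatched = link_agency_records(AGENCY_ROWS, issn_to_issnl)
```

Here `linked` groups Portico and e-Depot under the same ISSN-L even though they supplied different ISSNs, while the record with an unknown ISSN lands in `unmatched`.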
13. Example: a search on ISSN*
‘International Journal of High
Performance Computing Applications’
* ISSN-L is used within the system to
allow entry of either e-ISSN or p-ISSN
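One way to picture how a search on either ISSN can reach the same title is a join through the ISSN-L. The sketch below uses an in-memory SQLite database; the table and column names (`register`, `agency_record`, `issn_l`), the placeholder ISSNs and the sample statuses are all assumptions for illustration, not the PEPRS schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE register (issn TEXT PRIMARY KEY, issn_l TEXT NOT NULL);
    CREATE TABLE agency_record (agency TEXT, issn_l TEXT, status TEXT);
    """
)

# Placeholder ISSNs: one print and one electronic ISSN sharing an ISSN-L.
conn.executemany(
    "INSERT INTO register VALUES (?, ?)",
    [("1111-1111", "1111-1111"), ("2222-2222", "1111-1111")],
)
conn.executemany(
    "INSERT INTO agency_record VALUES (?, ?, ?)",
    [
        ("Portico", "1111-1111", "Archived"),
        ("e-Depot", "1111-1111", "Archived"),
    ],
)


def search_by_issn(connection, issn):
    """Return (agency, status) rows for a title, given any of its ISSNs."""
    return connection.execute(
        """
        SELECT a.agency, a.status
        FROM register AS r
        JOIN agency_record AS a ON a.issn_l = r.issn_l
        WHERE r.issn = ?
        ORDER BY a.agency
        """,
        (issn,),
    ).fetchall()


by_print = search_by_issn(conn, "1111-1111")
by_electronic = search_by_issn(conn, "2222-2222")
```

Both calls return the same agency rows, because the user's ISSN is first resolved to its ISSN-L and the agency records hang off that.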
17. Issues identified in Phase 1
• ISSNs used by agencies
• Holdings information supplied by the
agencies
• Vocabulary used by the agencies
18. ISSN issues
• ISSNs missing from some agency records, and some ISSNs not found in the ISSN Register
• Some duplicate records
• Some p-ISSNs used as e-ISSNs
• Some p-ISSNs linked via a common ISSN-L
to a number of e-ISSNs but which one is
correct?
• Some were incorrect
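Some incorrect ISSNs can be caught mechanically, because the last character of an ISSN is a check digit computed modulo 11 from the first seven digits (weights 8 down to 2, with 10 written as X). The function below is a minimal sketch of that check; the name `is_valid_issn` is introduced here for illustration and is not taken from the project.

```python
import re

# Four digits, optional hyphen, three digits, then a check character.
ISSN_PATTERN = re.compile(r"([0-9]{4})-?([0-9]{3})([0-9Xx])")


def is_valid_issn(text):
    """Return True if text is a well-formed ISSN with a correct check digit."""
    match = ISSN_PATTERN.fullmatch(text.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the seven digits 8, 7, ..., 2 and sum the products.
    total = sum(int(d) * w for d, w in zip(digits, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return match.group(3).upper() == expected
```

A check like this only flags mistyped or malformed ISSNs. It cannot detect a valid p-ISSN supplied in place of an e-ISSN, which still needs comparison against the ISSN Register.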
19. Holdings information - variation
e-Depot: Preserved: v. 1 - 36, 38 - 46.
UK LOCKSS Alliance: Preserved: v. 42 - 45. In progress: v. 46, 47.
Portico: Preserved: (2002-2009) v.40, v.41, v.42, v.43, v.44, v.45, v.46, v.47.
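To compare statements written in such different styles, one option is to reduce each to a set of volume numbers. The parser below is a rough sketch that handles only the three "Preserved" patterns shown above (ranges, comma-separated volumes, and a parenthesised year span); `parse_volumes` is a hypothetical helper, and real agency statements would likely need more cases.

```python
import re

# A parenthesised year span such as "(2002-2009)", which is not a volume range.
YEAR_SPAN = re.compile(r"\(\s*\d{4}\s*-\s*\d{4}\s*\)")
# A single volume ("47") or a range of volumes ("38 - 46").
VOLUME_RANGE = re.compile(r"(\d+)(?:\s*-\s*(\d+))?")


def parse_volumes(statement):
    """Expand a holdings statement into a set of volume numbers."""
    text = YEAR_SPAN.sub(" ", statement)
    volumes = set()
    for start, end in VOLUME_RANGE.findall(text):
        first = int(start)
        last = int(end) if end else first
        volumes.update(range(first, last + 1))
    return volumes


# The "Preserved" parts of the three statements on this slide.
e_depot = parse_volumes("v. 1 - 36, 38 - 46.")
uk_lockss = parse_volumes("v. 42 - 45.")
portico = parse_volumes(
    "(2002-2009) v.40, v.41, v.42, v.43, v.44, v.45, v.46, v.47."
)

# Volumes preserved by all three agencies.
common = e_depot & uk_lockss & portico
```

Once the statements are sets, gaps (such as the volume missing from the e-Depot range) and overlaps between agencies fall out of ordinary set operations.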
20. Terms used by preservation agencies
Agencies: CLOCKSS, LOCKSS, e-Depot, Portico, BL
• No action: The agency has no relationship with this title at present. (Or is it best simply not to mention your agency with regard to this title?) √ √ √
• Committed: The publisher has agreed that the agency may preserve the title but the ingest process has not yet begun. √ √ √ (with qualifications)
• Queued: Publisher technical work is complete, but the preservation agency has not yet processed the title. √ √ (use term with different meaning)
• Archived: The title has been ingested into the archive. √ √ √ √
• Available for Library Archiving: The title has been made available for preservation by a library, subject to a library’s subscription rights. √
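If the five terms above were adopted as a shared vocabulary, incoming agency data could be checked against it before display. The sketch below encodes the terms as an enumeration and rejects anything outside the list; the class and function names are assumptions for illustration, and no agency-specific synonyms are mapped because the slide does not give them.

```python
from enum import Enum


class ArchivingStatus(Enum):
    """The five candidate terms listed on this slide."""

    NO_ACTION = "No action"
    COMMITTED = "Committed"
    QUEUED = "Queued"
    ARCHIVED = "Archived"
    AVAILABLE_FOR_LIBRARY_ARCHIVING = "Available for Library Archiving"


# Lookup table keyed on the case-folded label.
_BY_LABEL = {status.value.casefold(): status for status in ArchivingStatus}


def normalise_status(label):
    """Map a label to the shared vocabulary, ignoring case and extra spaces."""
    key = " ".join(label.split()).casefold()
    try:
        return _BY_LABEL[key]
    except KeyError:
        raise ValueError(f"Unrecognised archiving status: {label!r}") from None
```

Raising on unknown labels, rather than guessing, matters here because at least one agency uses a listed term with a different meaning.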
21. Key recommendations from evaluation carried out in February 2010
• Funding should be provided for 2 further years: an initial 6-month phase and then, if reviewed successfully, another 18 months
• Need to resolve with the agencies the currency and updating of agency statements, the archiving status, and the fields and terms to use in display
• Continue with the development platform until the end of 2010
• Establish a governance structure
22. PEPRS Phase 2
• Funding provided from August 2010 – July
2012
• Beta service – late 2010
• Full service – late 2011?
• Involve international users in testing
23. PEPRS Phase 2: key stages
Timeline chart (Aug-10 to Dec-12, in four-month steps) covering these activities:
• Set up team of users
• User testing and feedback
• Beta service - preparation
• Beta service - operation
• Full service (?)
• Advisory group on governance
• Governance in operation
24. Involvement with international initiatives
• Print Archives Program of the Center for Research Libraries – “CRL is working with consortial partners to plan a prototype print archives framework to link existing print archiving efforts. [CRL] has developed a searchable Print Archives Registry of information about print-archiving initiatives, including:
– Projects
– Serial Holdings.”
• HathiTrust – “… is committed to preserving the intellectual content and in many cases the exact appearance and layout of materials digitized for deposit. HathiTrust stores and preserves metadata detailing the sequence of files for the digital object”.
25. PEPRS: Further information and contact details
http://edina.ac.uk/projects/peprs/index.html
Fred Guy, EDINA, University of Edinburgh
f.guy@ed.ac.uk
Editor's notes
Many preservation schemes emerging, including:
Collaborative and Third Party Schemes: CLOCKSS & Portico
National Libraries & Legal Deposit
Libraries and groups of libraries: LOCKSS Alliance