4. Preservation of digital materials –
A. Heritage is obligatory for social continuity.
B. Maintenance of assets is a business requisite.
5. The Library of Alexandria continues to be a platform for
democracy and has, since its inception opened its doors to
all Egyptians to participate in open dialogues about
reform […]. …
I have no doubt that once life returns to its norms, the
Library of Alexandria will continue its peaceful dialogue
and civic responsibility as an institution of learning.
Ismail Serageldin (via telephone)
Director of the Library of Alexandria
29 January 2011
6.
7. • Myriad ebook formats
• Digital rights management
• No national policy in place
No one is in charge of the preservation of our
growing cultural heritage in digital books.
8. Digital storage means it is easy to preserve in
multiple locations, in different formats. More
redundancy than paper could ever afford.
Storage costs trending toward insignificant
on a per-byte basis.
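Redundancy only helps if the copies are checked. A minimal sketch, assuming each "location" is just a directory and SHA-256 is an acceptable fixity check; the function names are illustrative, not from any particular preservation system:

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(source: Path, locations: list[Path]) -> list[Path]:
    """Copy source into each location directory and return the copy paths."""
    copies = []
    for location in locations:
        location.mkdir(parents=True, exist_ok=True)
        copies.append(Path(shutil.copy2(source, location / source.name)))
    return copies


def damaged_copies(expected: str, copies: list[Path]) -> list[Path]:
    """Return the copies that are missing or no longer match the digest."""
    return [c for c in copies if not c.exists() or sha256_of(c) != expected]
```

Record the digest at deposit time, then run damaged_copies on a schedule; any path it returns should be restored from a copy that still matches.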
10. A book is a complex assembly of content –
Text, video, foreword, illustrations, photos.
Each item needs careful management.
11. Digital production workflow requires:
- content management system
- sophisticated rights / use tracking
- skilled personnel, esp. in engineering
12. Possibility of loss should motivate preservation.
Loss can arise for oneself, or one’s partners,
via:
1. complex, tightly coupled systems
2. struggle of memory vs. forgetting
13.
14. From an engineering perspective, complexity
in workflow systems raises the chance of
catastrophic loss. Interactions become
“tightly-coupled” as system efficiency
increases.
Tightly coupled systems are prone to routine
accidents with unforeseen cascading effects.
15. - Charles Perrow
Sociology Dept, Yale University
Ex.: Apollo 13
“[The] accident was not the result of a chance
malfunction in a statistical sense, but rather
resulted from an unusual combination of
mistakes, coupled with a somewhat deficient
and unforgiving design.” source: Wikipedia.
16. Traditional (print) publishers are constructing
baroque and complex systems to maintain
heritage production while transitioning to
newer business systems.
Potential trigger events are ubiquitous.
17. “We're working quickly to recover from a major
issue in one of our database clusters. We're
incredibly sorry for the inconvenience.”
18.
“Unfortunately, I have mixed up the accounts and
accidentally deleted yours. I am terribly sorry for
this grave error and hope that this mistake can be
reconciled.” …
“Our teams are currently working hard to try to
restore the contents of this user's account. We are
working on a process that would allow us to easily
restore deleted accounts and we plan on rolling this
functionality out soon.” (em. added)
19.
20. Many archives are lost by simply forgetting
where or how they were stored. Canisters of
film, shelves of books, or backup tapes.
“Those books are in a warehouse in Jersey.”
“The servers are in one of our datacenters.”
21. Commonly, companies get acquired or
shutter business operations; records are
abandoned.
Servers are redeployed without an audit.
Databases run with no backup routines.
22. Preserving without metadata is not helpful.
Metadata provides necessary context.
Need to point backwards (provenance) and
forward (to find superseded content)
Without necessary metadata infrastructure,
preservation architectures are rickety.
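A minimal sketch of backward and forward pointers, assuming a simple in-memory catalog keyed by identifier. The record fields (derived_from, superseded_by) are hypothetical names, not drawn from any metadata standard:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ArchiveRecord:
    identifier: str
    title: str
    derived_from: Optional[str] = None   # backward pointer (provenance)
    superseded_by: Optional[str] = None  # forward pointer (newer content)


def latest_version(catalog: dict[str, ArchiveRecord], identifier: str) -> ArchiveRecord:
    """Follow superseded_by pointers until the current record is reached."""
    seen: set[str] = set()
    record = catalog[identifier]
    while record.superseded_by is not None:
        if record.identifier in seen:
            raise ValueError(f"cycle in superseded_by chain at {record.identifier}")
        seen.add(record.identifier)
        record = catalog[record.superseded_by]
    return record


def provenance_chain(catalog: dict[str, ArchiveRecord], identifier: str) -> list[str]:
    """Walk derived_from pointers back to the original source."""
    chain: list[str] = []
    current: Optional[str] = identifier
    while current is not None:
        if current in chain:
            raise ValueError(f"cycle in derived_from chain at {current}")
        chain.append(current)
        current = catalog[current].derived_from
    return chain
```

With both pointers stored, an archive can answer "where did this come from?" and "is this still the current edition?" without relying on anyone's memory.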
23. If you are a publisher, do you safeguard your
digital assets? Has that process been audited
against security threats? Technical accidents?
Are your workflows well understood? Have
you conducted trial asset recovery exercises?
Does your insurance company know?
Do you mind if I give them a call?
24.
25. For hundreds of years, libraries preserved books.
Sort of, anyway (witness Alexandria).
By-product of the inefficiency of distribution:
Lots of libraries wound up with the same books.
Lots of copies keep stuff safe.
26. Books were once bespoke – monks with pens.
Over around 500 years, books became
industrialized – mass produced. Although
ebooks are the ultimate industrial product,
we come full circle.
Digital books stand on the threshold of
tremendous mutability.
27. It is easy to store objects now, arguably, but …
It is also easy to save objects that cannot be
easily recovered.
E.g., PDF can be a stew of non-standard
formats and arbitrary data. EPUB with DRM
locks data, with the risk that the key will be lost.
28. We may be able to preserve digital editions.
Can we preserve the book of the future?
• Scripted and Interactive
• Networked and Distributed
• Personalized and Mobile
29.
30. In the U.S., beyond public and academic library
collections, the Library of Congress plays a
unique role:
§ 407. The required copies or phonorecords shall
be deposited in the Copyright Office for the use
or disposition of the Library of Congress.
31. Current U.S. code privileges print as the best
edition for preservation. Slowly being
changed.
(Congress approved e-journal demand
deposit for LoC in the 2010 legislative
session.)
32. The Copyright Office historically held that
transmission to the LoC of deposited digital
books would be an infringing faithful copy,
despite:
§ 704. In the case of published works, all copies,
phonorecords, and identifying material
deposited are available to the Library of
Congress for its collections ….
33.
34. I led an initiative to confront this issue in
Summer 2008. Fizzled.
Convened meeting in NYC via Digital Library
Federation with LoC, Mellon, Portico, NISO,
IDPF, BISG, and publisher consultants and
representatives to discuss digital archives
that would also serve as escrow.
35. The group decided that we should attempt a
trial project with a small set of publishers who
would deposit sample books into Portico’s
repository; the pilot would shed light on
business, legal, and policy issues.
With that in hand, LoC would be in a stronger
position to solicit rule changes by Congress.
36. Portico is a not-for-profit jointly representing
publishers and libraries, preserving a growing
volume of digital content with high reliability in a
rights-respecting, secure archive. Serving as an
escrow, access to the archive by its members is
triggered only under carefully-defined and
contractually-specified circumstances.
Initial funding came from Mellon and LoC.
37.
38. The AAP expressed support.
Ed McCoyd, Director of Digital Policy:
“Thank you for speaking at the AAP Digital Issues
Working Group meeting on [11 June 2008],
regarding your interest in bringing parties together
to develop digital book archives for preservation.
39. “Your points about preserving cultural patrimony,
and assuring permanent access by libraries and other
digital content customers as well as by the publishers
themselves, certainly resonated with the group.
“I look forward to talking further about your initiative
in the coming months, and will keep the publishers
apprised of the additional details as they develop.”
40. It fizzled due to the inability of the LoC to
determine whether it had the capacity to
pursue a deposit initiative, and if so, what
department of the Library should proceed.
41.
42. Portico decided to continue the pursuit of
ebook deposits on behalf of its members.
Members are primarily research libraries –
’cuz who else has a mandate to care about
preservation?
With inevitable focus on academic ebook
publishers, e.g. Elsevier, starting in 2008.
43. Portico has been unable to penetrate the trade
publishing sector through private initiative.
No national policy requiring digital deposit.
44.
45. Google Book Search (GBS) has emerged in
the absence of international rulemaking as
the default archive for some institutions.
GBS does not do preservation-quality
imaging, and there is requirement for
comprehensive publisher participation.
This is NOT a good solution.
46. The Europeana digital library is seeking to build a
collection preserving Europe’s vast cultural
heritage.
In September 2010, Ghent Univ. Library
became the first in Europe to deposit public
domain books scanned by Google into
Europeana.
47. GBS partners’ collection management of
older public domain books is a pale shadow of
the comprehensive international policy
framework needed to mandate preservation
of our cultural heritage in digital books.
Preservation must mandate participation for
libraries and publishers in a legal framework.
48. Widespread recognition of the need for a new
copyright regime supporting digital use.
An internationally interoperable network of
rights registries, capable of automated status
queries on rights assertions by geographic and
national domain, would be a very useful support.
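A minimal sketch of what an automated status query across registries could look like. Everything here is hypothetical: no such interoperable network is described in these slides, and the class names, status values, and use of two-letter country codes are assumptions for illustration only:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class RightsAssertion:
    work_id: str
    territory: str  # assumed: two-letter country code, e.g. "US"
    status: str     # assumed vocabulary, e.g. "in_copyright", "public_domain"


@dataclass
class RightsRegistry:
    name: str
    assertions: list[RightsAssertion] = field(default_factory=list)

    def status(self, work_id: str, territory: str) -> Optional[str]:
        """Return this registry's asserted status, or None if it has none."""
        for assertion in self.assertions:
            if assertion.work_id == work_id and assertion.territory == territory:
                return assertion.status
        return None


def query_network(registries: list[RightsRegistry], work_id: str, territory: str) -> dict[str, str]:
    """Ask every registry and collect {registry name: status} answers."""
    results: dict[str, str] = {}
    for registry in registries:
        answer = registry.status(work_id, territory)
        if answer is not None:
            results[registry.name] = answer
    return results
```

Returning every registry's answer, rather than the first match, keeps conflicting assertions visible so they can be resolved by policy rather than silently by lookup order.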
49.
50. Mandating the deposit of digital works for
preservation purposes to ensure adequate
representation of copyright assertions for a
content manifest would be one approach.
Arguably avoids Berne / TRIPS.
51. Most publishers are demonstrably willing to
engage in discussion of efforts supporting the
development of persistent heritage archives.
Requires hard political work conjoining some
subset of copyright policies, the imperatives
of cultural heritage, and technical architecture.
52. If nothing is done, and we cannot solve this
problem for the simple digital books of the
20th Century, what are we going to do with
the books of the 21st Century?
53.
54. peter brantley
director, bookserver project
internet archive
san francisco, ca
(twitter) @naypinya (slideshare)
peter at archive.org