Presented by Sarah Grimm (Wisconsin Historical Society) and Emily Pfotenhauer (WiLS) for the Wisconsin Association of Academic Librarians (WAAL) conference, Elkhart Lake, Wisconsin, April 25, 2013. Content based on Modules 1 & 2 of the Digital Preservation Outreach and Education (DPOE) Baseline Digital Preservation Curriculum developed by the Library of Congress.
Managing Digital Content Over Time: Identify and Select
1. Managing Digital Content
Over Time
Sarah Grimm, WHS
Emily Pfotenhauer, WiLS
Slides and handouts:
recollectionwisconsin.org/waal2013
Supported by WHRAB
3. DPOE Mission
The mission of the Digital Preservation
Outreach and Education (DPOE)
program of the Library of Congress
is to encourage individuals
and organizations to actively
preserve their digital content, building
on a collaborative network of instructors,
contributors, and institutional partners.
4. Six Training Modules
Identify - what digital content do you have?
Select - what portion of that content will be
preserved?
Store - how should your content be stored
for the long term?
Protect - what steps are needed to protect
your digital content?
Manage - what provisions are needed for
long-term management?
Provide - how should your content be made
available over time?
5. What is Digital Content?
Digital content is any content that is
published or distributed in a digital
form, including text, data, sound
recordings, photographs and images,
motion pictures, and software.
◦ Digital materials created from analog
sources
◦ Born-digital content
Digital materials you currently have or
create – or expect to have – that you
want to preserve.
6. What’s the Problem?
Increasing amounts of digital assets
are arriving on our doorstep
The digital assets arrive in all file formats
and on all types of media
Time sensitive - the longer we or
our donors wait, the greater the
chance that something will
be unreadable
7. Digital Reality in 2013
Everyone is
◦ creating digital content
◦ distributing digital content
◦ using digital content
And we are responsible for
managing digital content now, or
expect to be in the near future
8. What are the Challenges?
Who takes the lead?
What can I do?
Where do I start?
The impediments
Too complex (I don’t understand...)
Too daunting (I don’t have time...)
Too technical, etc. (Computers scare me...)
10. Digital Preservation
Digital preservation combines policies,
strategies and actions to ensure
access to reformatted and born digital
content regardless of the challenges of
media failure and technological change.
The goal of digital preservation is the
accurate rendering of authenticated
content over time.
Working group on Defining Digital Preservation, ALA Annual Conference, 6/24/2007
11. Why Do We Identify Content?
Not all digital content can or should be
preserved
Preservation requires an explicit
commitment of resources
Good preservation decisions are based
on an understanding of the possible
content to be preserved
12. First Steps
• Identifying content is a first step to planning
for current and future preservation needs
• Ask: what content
do I have,
will I have,
might I have,
must I have?
An inventory is the best way to identify what
content you have now – and raise awareness
in your institution.
14. If not, do you need permission
to begin an inventory project?
15. Inventory Considerations
Inventory content is more important than style
and format
Inventory results should be:
◦ Documented: an inventory should
actually exist
◦ Usable: use a simple format to sort, list,
etc.
◦ Available: accessible to others
◦ Scalable: content will be added during
Select
◦ Current: update periodically
16. Inventory Tips
Don’t let implementing the software
become the focus.
Use software you know and have
available
Stick with a single format; don't
change once you've decided on it.
Be consistent, comprehensive, and
concise
17. How Much Detail to Include
Inventories can range from general to detailed
Determine appropriate level of detail for you
Factors in determining level of detail:
◦ Extent of content to be inventoried
◦ Nature & location of content
◦ Resources available to complete
inventory
◦ Timeframe & deadlines for completion
18. What Do You Have?
Identify collections of digital materials.
Provide a brief title and description
Estimate growth over time ***
19. Who Manages It?
Department – currently managing the
collection/digital content
Staff – primary people responsible
Creator (Internal or External) – who
created the digital content
20. What does it consist of?
Medium (6 CDs, 1 hard drive)
Extent = Format + Amount
(600 .pdf, 30 .doc)
File Size – (MB, GB, TB)
http://www.csgnetwork.com/memconv.html
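For content that already sits on a drive, the extent and file size can be tallied with a short script rather than by hand. The following is a minimal sketch, not part of the DPOE curriculum: the function names `summarize_extent` and `human_size` are made up for illustration, and the size formatting assumes binary units (1 KB = 1024 bytes).

```python
import os
from collections import Counter


def summarize_extent(root):
    """Walk a folder and return (file counts by extension, total bytes)."""
    counts = Counter()
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Lowercase the extension so .PDF and .pdf are counted together.
            ext = os.path.splitext(name)[1].lower() or "(no extension)"
            counts[ext] += 1
            total_bytes += os.path.getsize(path)
    return counts, total_bytes


def human_size(num_bytes):
    """Format a byte count in bytes, KB, MB, GB or TB (1 KB = 1024 bytes)."""
    size = float(num_bytes)
    for unit in ("bytes", "KB", "MB", "GB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TB"
```

Calling `summarize_extent` on a collection folder gives the numbers for the Extent and File Size entries; for example, `human_size(1536)` returns `"1.5 KB"`.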
21. Date Considerations
Inventories should note:
• Date of inventory and updates to it
• Dates associated with the content
(1872-1901)
• Date of files – created or modified (2009)
• Date received – if relevant / possible (2010)
22. Content Location
Locations of content are important:
• List primary locations (Network drive
location, Storage device, Bob’s shelf)
• List locations of all backups/copies (CDs
in the storage room, weekly backup
tapes)
Remember to change locations as content
moves
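The inventory fields covered so far (what you have, who manages it, what it consists of, dates, locations) fit comfortably in a simple spreadsheet. As one possible illustration, the sketch below writes them to a CSV file that any spreadsheet program can open. The column names and the `InventoryEntry` class are assumptions made for this example, not a prescribed DPOE template, and the sample values are hypothetical.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class InventoryEntry:
    # Column names are illustrative; adapt them to your institution.
    title: str
    description: str
    department: str
    staff: str
    creator: str
    medium: str
    extent: str
    file_size: str
    content_dates: str
    file_dates: str
    date_received: str
    primary_location: str
    backup_locations: str
    inventory_date: str
    estimated_growth: str = ""


def write_inventory(entries, stream):
    """Write inventory entries as CSV, one row per collection."""
    columns = [f.name for f in fields(InventoryEntry)]
    writer = csv.DictWriter(stream, fieldnames=columns)
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))


# Hypothetical sample row.
example = InventoryEntry(
    title="Example collection",
    description="Scanned correspondence and reports",
    department="Archives",
    staff="Collection manager",
    creator="External donor",
    medium="6 CDs, 1 hard drive",
    extent="600 .pdf, 30 .doc",
    file_size="2 GB",
    content_dates="1872-1901",
    file_dates="2009",
    date_received="2010",
    primary_location="Network drive",
    backup_locations="CDs in the storage room",
    inventory_date="2013-04-25",
)

buffer = io.StringIO()
write_inventory([example], buffer)
```

To save to disk instead of an in-memory buffer, pass a file opened with `open("inventory.csv", "w", newline="")`. A plain CSV keeps the inventory usable, available and easy to update, whatever software you settle on.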
23. Analyze the Results
When the inventory is complete, ask
yourselves what digital content
◦ do we have that we didn’t know about?
◦ should we be keeping that we aren’t
now?
◦ will we create or likely acquire in the
future?
◦ are we required to keep?
◦ do we need to review?
24. Goals
Identify potential digital content you
may need to preserve
Treat the inventory as a management
tool that grows as your preservation
program grows
Use it as a planning tool – e.g., to
prepare staff, training, annual growth
Use as a basis for acquiring
content, defining submission
agreements, plans
26. Six Training Modules
Identify - what digital content do you have?
Select - what portion of that content
will be preserved?
Store - how should your content be stored
for the long term?
Protect - what steps are needed to protect
your digital content?
Manage - what provisions are needed for
long-term management?
Provide - how should your content be made
available over time?
27. Why select content to
preserve?
Log jam on the St. Croix River, 1886
Wisconsin Historical Society WHi-2364
28. Why select content to
preserve?
● Cost: storage may be cheap,
management is not… especially over
time
● Discovery and dissemination
services: scale, scope, performance,
sustainability
● Quality of content may be variable
● Matching mission to content
29. Basic Steps
Review your potential digital
content (go back to inventory)
Define - then apply - selection
criteria
Document (and preserve)
selection decisions
Implement your decisions
(Store, Protect, Manage, and
Provide modules)
Picking fruit
Wisconsin Historical Society WHi-67733
30. What criteria should be used to
select digital content for preservation?
Postal workers sorting mail, 1955
Wisconsin Historical Society WHi-36392
31. Selection Criteria
Mission: Scope of Collections, Collecting
Policies
Records retention manuals/policies (internal
or externally mandated)
Legal & ethical requirements (professional
bodies; your stakeholders; future users)
Uniqueness (only source or preserved
elsewhere? Avoid duplication)
Value (historical, evidential, can’t
reproduce?)
32. Practical Considerations
Stop if or when the answer is NO
● Content
– Does the content have long-term value?
– Does it fit your scope and mission?
● Technical
– Is it feasible for you to preserve the
content?
● Access
– Is it possible to make the content
available?
– Are you the only holder of this content?
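The "stop if or when the answer is NO" rule can be pictured as a checklist that ends at the first NO. The sketch below is only an illustration of that rule: the `review` function and the idea of passing in an `ask` callback are assumptions for this example, and the questions are copied from the list above.

```python
QUESTIONS = [
    ("Content", "Does the content have long-term value?"),
    ("Content", "Does it fit your scope and mission?"),
    ("Technical", "Is it feasible for you to preserve the content?"),
    ("Access", "Is it possible to make the content available?"),
    ("Access", "Are you the only holder of this content?"),
]


def review(ask):
    """Ask each question in order and stop at the first NO.

    `ask` is any function taking (area, question) and returning True or False.
    Returns the (area, question) that was answered NO, or None if every
    answer was YES.
    """
    for area, question in QUESTIONS:
        if not ask(area, question):
            return (area, question)
    return None
```

Because the loop returns at the first NO, later questions are never asked, which mirrors the time-saving intent of the rule. The returned question is also worth recording as the documented reason for a selection decision.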
33. Setting Priorities
Ask yourself which digital content is
● most significant to your organization?
● most extensive?
● most requested/used?
● easiest?
● oldest?
● newest?
● mandated?
● at risk?
34. Include Creators in the
Process
● Communication is key, particularly when
content comes from external creators
● Keep content creators in the conversation
● Arrange a convenient time for them to
talk about your preservation plans
● Identify list of materials to review with
them
● Document the results and send them a
copy
35. Selection Documentation
Supplement your inventory with more
detailed information about the material
you plan to preserve over the long term.
Use
◦ What’s the lifespan of the content?
◦ Will its value/use change over time?
◦ Retention period
36. Access and rights
Access
◦ How will the public access the content?
◦ Is access restricted? How? For how
long?
Rights
◦ Who owns the rights to preserve and
disseminate?
37. Prioritizing
Data criticality
◦ Is it only in digital form? Do we hold the
only copy?
Business/mission criticality
◦ If we lose it, what’s the damage to our
reputation? How will it impact our
function or services?
38. Goals/Outcomes
• Expanded inventory of content to
preserve
…and what you can delete (gray areas
identified)
• Agreements with content creators e.g.
submission agreements, retention
schedules
• Well-defined and documented selection
criteria, policies and procedures
• Better understanding of content for
future planning and growth
Greater knowledge = greater control!
39. Identify and Select in Practice
“You’ve Got to Walk Before You Can
Run: First Steps for Managing Born-
Digital Content Received on Physical
Media”
Ricky Erway, OCLC Research
http://www.oclc.org/content/dam/research/publications/library/2012/2012-06.pdf
40. Four Essential Principles
Do no harm
Don’t do anything that prevents future
action and use
Take action
Document what you do
41. A Typical Scenario
Digital materials on physical media
(CDs, flash drives, floppy disks, etc.)
have been stored along with other
collection materials without having been
copied, preserved, or made accessible.
42. Inventory
1. Survey your holdings
2. Count and describe digital media within
collection
3. Remove media from collection (retain
order with photographs or separator
sheets)
4. Assign inventory numbers
5. Calculate amount of data
6. Re-house physical media in suitable
storage
43. Select
Prioritize for further treatment (e.g.
migration, online access) based on:
Significance and use of overall
collection
Danger of loss of content
(degradation)
Replication in analog form
Value of digital vs. analog format
Quantity of digital content