3. Metadata = documentation
Physical components of a work
Technical components of a work
Source code
Dependencies
Artist’s intent
Significant properties
The usual aspects of conservation documentation:
who, what, when, where, why, how.
6. PREMIS
PREservation Metadata: Implementation Strategies
Library of Congress metadata standard
http://www.loc.gov/standards/premis/
Includes elements for:
Significant properties
Hardware and software dependencies
Object and environment characteristics
Actions performed on objects
And more!
7. Metadata guidelines
Bare minimum:
Record all the documentation. Inventory everything,
including the documentation.
Keep the inventory, metadata, etc. backed up, including
one copy that is NOT co-located with the objects.
Best practice:
Use standards-based, machine-actionable metadata.
Consider PREMIS (PREservation Metadata: Implementation
Strategies).
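A minimal sketch of what a machine-actionable inventory could look like, assuming Python and a JSON file as the output. The function names and field names (identifier, size_bytes, fixity, recorded_at) are illustrative assumptions loosely inspired by PREMIS concepts; they are not the PREMIS schema itself.

```python
import hashlib
import json
import os
from datetime import datetime, timezone


def describe_file(path, root):
    """Build one inventory record for a file.

    Field names are hypothetical, chosen for illustration only.
    """
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large disk images do not fill memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            hasher.update(chunk)
    return {
        "identifier": os.path.relpath(path, root),
        "size_bytes": os.path.getsize(path),
        "fixity": {"algorithm": "SHA-256", "digest": hasher.hexdigest()},
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def build_inventory(root):
    """Walk a directory tree and describe every file found in it."""
    records = []
    for folder, _subfolders, filenames in os.walk(root):
        for name in sorted(filenames):
            records.append(describe_file(os.path.join(folder, name), root))
    return records


def save_inventory(records, destination):
    """Write the inventory as JSON so other tools can read it."""
    with open(destination, "w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=2)
```

In line with the bare-minimum guideline, the saved JSON file would itself need a backup copy that is not co-located with the objects.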
8. Interlude
NDSA Levels of Digital Preservation
http://www.digitalpreservation.gov/ndsa/activities/levels.html
Four levels ranging from minimum accepted practice
to best practice
12. PHYSICAL storage media
“All digital is physical. They aren't literal clouds, folks.”
- Stephanie Gowler, Project Conservator at Northwestern University
14. Common default storage habits
Leave it sitting on the computer you used for disk
imaging.
Put it back on an external hard drive and store it
in your desk drawer.
Transfer it to your repository and forget about it.
ALL OF THESE ARE BAD OPTIONS!
15. Physical storage media options
Media type | Long-term sustainability | Cost
Optical media (CDs, DVDs) | Terrible! Very short shelf life, high rate of data loss | Dirt cheap
Removable/offline storage media (external hard drives, tape) | Pretty good, low rate of data loss | Fairly cheap
Online local disks (internal computer hard drives) | Pretty good, low rate of data loss | Moderate cost
Redundant disk arrays (RAID) | Very good, super low rate of data loss | Getting pricey
Local network servers (NAS, SAN) | Great! Super low rate of data loss and experts are managing it | Expensive
Cloud servers | Great (probably)! Super low rate of data loss (probably) and experts are managing it (probably) | Expensive, but varies based on type
16. Storage guidelines
Bare minimum:
Two complete, identical copies on different types of
storage media
Ex: one copy on a local desktop computer, one copy on
tape backups
17. Location, location, location
Best practice:
Three complete, identical copies stored on different types
of storage media, in different geographic locations.
Ex: one copy on a local RAID array in NYC, one copy in
cloud server storage based in Utah, one copy on LTO tapes
in a vault in Michigan.
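The storage guidelines can be expressed as a small check. This is a sketch only: the Copy class and storage_level function are hypothetical names, and it reads "different" as "every copy on a distinct medium and in a distinct location", which is one interpretation of the guideline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    medium: str    # e.g. "RAID", "cloud", "LTO tape"
    location: str  # e.g. "NYC", "Utah", "Michigan"


def storage_level(copies):
    """Classify a set of copies against the storage guidelines."""
    media = {c.medium for c in copies}
    locations = {c.location for c in copies}
    # Best practice: three copies, three media types, three locations.
    if len(copies) >= 3 and len(media) >= 3 and len(locations) >= 3:
        return "best practice"
    # Bare minimum: two copies on two different media types.
    if len(copies) >= 2 and len(media) >= 2:
        return "bare minimum"
    return "below minimum"


example = [
    Copy("RAID", "NYC"),
    Copy("cloud", "Utah"),
    Copy("LTO tape", "Michigan"),
]
print(storage_level(example))  # best practice
```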
18. Some thoughts on backup
Continuous backup means file corruption gets
duplicated.
What’s the retention period?
Know the policies around this. Your IT staff are
focused on business continuity, NOT long-term
preservation. That’s your job.
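Why the retention period matters can be shown with a small date calculation. This sketch assumes a simplified model that is not from the slides: one backup per day, each kept for a fixed number of days, and corruption that is copied into every backup made from the day it occurs. The function name is hypothetical.

```python
from datetime import date, timedelta


def clean_copy_survives(corrupted_on, noticed_on, retention_days):
    """Return True if an uncorrupted backup still exists when the
    corruption is finally noticed.

    Assumes daily backups, each deleted after retention_days.
    """
    # The last good backup was taken the day before the corruption.
    last_clean_backup = corrupted_on - timedelta(days=1)
    expires_on = last_clean_backup + timedelta(days=retention_days)
    return noticed_on <= expires_on


# Corruption on 10 January, noticed on 1 March, 30-day retention:
print(clean_copy_survives(date(2015, 1, 10), date(2015, 3, 1), 30))  # False
```

Under these assumptions, any corruption that goes unnoticed for longer than the retention period has already replaced every clean backup, which is why periodic verification matters.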
20. Management
Who’s in charge of this stuff, anyway?
21. Why manage?
To ensure that files don’t get corrupted, lost,
damaged, or otherwise altered.
To ensure that files can still be used.
22. Management components
Metadata: Do you know what you have?
File fixity: Do you actually have what you think you
have?
Information security: What’s happening to what you
have?
Preventing obsolescence: Can you use what you
have?
23. File fixity
Checksums vs cryptographic hash algorithms
MD5 and SHA-1
Verification is key!
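A minimal sketch of generating fixity values with Python's standard library, using CRC-32 as an example of a plain checksum alongside the MD5 and SHA-1 hashes named above. The function name fingerprints is an assumption made for illustration. MD5 and SHA-1 are no longer considered safe against deliberate tampering, but they still detect accidental corruption.

```python
import hashlib
import zlib


def fingerprints(path, chunk_size=1024 * 1024):
    """Compute a CRC-32 checksum plus MD5 and SHA-1 hashes in one pass."""
    crc = 0
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large disk images do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
            md5.update(chunk)
            sha1.update(chunk)
    return {
        "crc32": format(crc, "08x"),
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
    }
```

Storing the values is only half the job: they are useful only if the files are re-hashed later and the results compared.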
24. File fixity guidelines
Bare minimum:
Create checksums of everything on ingest.
(See: practical session)
Best practice:
VERIFY checksums periodically, and always when
performing these tasks:
- Ingesting files into a repository.
- Transferring files from one storage medium to another.
Replace damaged files with copies from another
location/storage medium.
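The verify-and-replace routine can be sketched as follows. The assumptions here are not from the slides: the manifest is a dictionary mapping relative paths to expected digests, there is exactly one replica directory, and SHA-256 is the default algorithm. All function names are hypothetical.

```python
import hashlib
import os
import shutil


def digest_of(path, algorithm="sha256"):
    """Hash a file in chunks with the named algorithm."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_and_repair(manifest, primary_dir, replica_dir):
    """Check each file against the manifest; restore bad or missing
    files from the replica when the replica still matches.

    Returns a dict of relative path -> status string.
    """
    report = {}
    for rel_path, expected in manifest.items():
        primary = os.path.join(primary_dir, rel_path)
        if os.path.exists(primary) and digest_of(primary) == expected:
            report[rel_path] = "ok"
            continue
        replica = os.path.join(replica_dir, rel_path)
        if os.path.exists(replica) and digest_of(replica) == expected:
            os.makedirs(os.path.dirname(primary), exist_ok=True)
            shutil.copyfile(replica, primary)
            # Verify again after the transfer, as the guideline says.
            ok = digest_of(primary) == expected
            report[rel_path] = "repaired" if ok else "repair failed"
        else:
            report[rel_path] = "unrecoverable"
    return report
```

A real workflow would also log each repair, so that the action appears in the object's history.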
25. Information security for non-IT folks
Manage access – particularly delete permissions
Log access and actions
Audit the logs periodically (or automate)
26. Information security guidelines
Bare minimum:
Know and document who has access to files.
Restrict access (particularly write/delete permissions) to as
few people as possible, and require multi-person authorization
to delete.
Best practice:
Maintain logs of all actions performed on files.
Audit those logs regularly to ensure no unintended actions
were performed.
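Automating a log audit might look like the sketch below. The log format is an assumption: a CSV file with the columns timestamp, user, action, and path. Real systems log in many different formats, and the sample rows are invented for illustration.

```python
import csv
import io


def flag_unexpected_deletes(log_text, authorized_deleters):
    """Return the log rows where someone outside the authorized
    group performed a delete.

    Expects CSV text with a header row: timestamp,user,action,path
    (a hypothetical layout).
    """
    reader = csv.DictReader(io.StringIO(log_text))
    return [
        row
        for row in reader
        if row["action"] == "delete" and row["user"] not in authorized_deleters
    ]


# Invented sample data, for illustration only.
sample_log = (
    "timestamp,user,action,path\n"
    "2015-01-05T09:00:00,alice,read,img/disk01.img\n"
    "2015-01-06T10:30:00,bob,delete,img/disk02.img\n"
    "2015-01-07T11:00:00,carol,delete,img/disk03.img\n"
)

for row in flag_unexpected_deletes(sample_log, {"carol"}):
    print(row["timestamp"], row["user"], row["path"])
```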
27. Preventing obsolescence
Bare minimum:
Document information such as file formats, dependencies,
etc. (See: metadata)
Best practice:
Review format and dependency information regularly to
identify high-risk objects.
Perform migration, emulation tests, etc. as obsolescence
issues arise.
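A regular format review could start from something as simple as the sketch below. It relies on file extensions, which are a weak proxy for the actual format, and the HIGH_RISK_EXTENSIONS set is a made-up placeholder; each institution would maintain its own list, ideally based on a format identification tool rather than on file names.

```python
import os
from collections import Counter

# Placeholder list for illustration only; not an authoritative risk register.
HIGH_RISK_EXTENSIONS = {".swf", ".dir", ".wpd"}


def extension_of(path):
    """Return the lower-cased file extension, e.g. '.swf'."""
    return os.path.splitext(path)[1].lower()


def format_counts(paths):
    """Count how many files there are of each extension."""
    return Counter(extension_of(p) for p in paths)


def high_risk_objects(paths, risky=HIGH_RISK_EXTENSIONS):
    """List the files whose extension is on the high-risk list."""
    return sorted(p for p in paths if extension_of(p) in risky)


holdings = ["work1/intro.SWF", "work1/readme.txt", "work2/piece.dir"]
print(format_counts(holdings))
print(high_risk_objects(holdings))
```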
29. Image credits
Windows XP (slide 4): 2K Networking, Inc.
NDSA levels (slide 9): NDSA
Vault (slide 11): The Mark Consulting
Panicked cat (slide 13): Run Salt Run
Hard drive (slide 12): x2element
Computer (slide 16): Johan Larsson via Compfight cc
Wire tower (slide 16): tanakawho via Compfight cc
Hard drive (slide 19): Tech-addict
File not found (slide 21): Ragha’s Siebel Blog
No change (slide 23): Return on Focus
Delete (slide 25): The Ramblin Professor
Awake cat (slide 28): Desktop Nexus