Disk image! … and then what?
Strategies for sustainable long-term storage and access
Document stuff about your stuff
Metadata
Metadata = documentation
• Physical components of a work
• Technical components of a work
• Source code
• Dependencies
• Artist’s intent
• Significant properties
• The usual aspects of conservation documentation: who, what, when, where, why, how.
Example
• Environment: Windows XP operating system
• Simple! Straightforward!
Metadata ≠ documentation
• Documentation:
  • Scattered
  • Free-form
  • Human-readable
• Metadata:
  • Unified
  • Standards-based
  • Machine-actionable
  • Interoperable
PREMIS
• PREservation Metadata: Implementation Strategies
• Library of Congress metadata standard
• http://www.loc.gov/standards/premis/
• Includes elements for:
  • Significant properties
  • Hardware and software dependencies
  • Object and environment characteristics
  • Actions performed on objects
  • And more!
Metadata guidelines
• Bare minimum:
  • Record all the documentation. Inventory everything, including the documentation.
  • Keep the inventory, metadata, etc. backed up, including one copy that is NOT co-located with the objects.
• Best practice:
  • Use standards-based, machine-actionable metadata.
  • Consider PREMIS (PREservation Metadata: Implementation Strategies).
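To make "machine-actionable" a bit more concrete, here is a minimal Python sketch of one object's record kept as structured data instead of free-form notes. The field names are assumptions loosely inspired by the kinds of things PREMIS covers (environment, significant properties, actions performed); they are NOT the PREMIS schema, and the identifier, dates, and values are made up for illustration.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ObjectRecord:
    """One preserved object, described in fields a program can query.

    Field names are hypothetical, not PREMIS element names.
    """
    identifier: str
    file_format: str
    size_bytes: int
    environment: dict = field(default_factory=dict)            # what it needs to run
    significant_properties: list = field(default_factory=list)
    events: list = field(default_factory=list)                 # actions performed on it


def needs_operating_system(records, name):
    """Return identifiers of records whose environment names this operating system."""
    return [r.identifier for r in records
            if r.environment.get("operating_system") == name]


# Illustrative values only.
record = ObjectRecord(
    identifier="work-0001/disk-image-01",
    file_format="raw disk image",
    size_bytes=4_294_967_296,
    environment={"operating_system": "Windows XP"},
    significant_properties=["original screen resolution"],
    events=[{"type": "ingest", "date": "2014-05-01", "agent": "conservator"}],
)

print(json.dumps(asdict(record), indent=2))         # interoperable, backup-friendly
print(needs_operating_system([record], "Windows XP"))
```

The point of the structure is the last line: once the environment is a field and not a sentence in a notebook, "which works depend on Windows XP?" becomes a query.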
Interlude
• NDSA Levels of Digital Preservation
• http://www.digitalpreservation.gov/ndsa/activities/levels.html
• Four levels ranging from minimum accepted practice to best practice
Stuff’s gotta go somewhere
Storage
What is a repository?
PHYSICAL storage media
“All digital is physical. They aren't literal clouds, folks.”
- Stephanie Gowler, Project Conservator at Northwestern University
Don’t panic!
Common default storage habits
• Leave it sitting on the computer you used for disk imaging.
• Put it on an external hard drive and store it in your desk drawer.
• Transfer it to your repository and forget about it.
ALL OF THESE ARE BAD OPTIONS!
Physical storage media options
Media Type | Long-Term Sustainability | Cost
Optical media (CDs, DVDs) | Terrible! Very short shelf life, high rate of data loss | Dirt cheap
Removable/offline storage media (External hard drives, tape) | Pretty good, low rate of data loss | Fairly cheap
Online local disks (Internal computer hard drives) | Pretty good, low rate of data loss | Moderate cost
Redundant disk arrays (RAID) | Very good, super low rate of data loss | Getting pricey
Local network servers (NAS, SAN) | Great! Super low rate of data loss and experts are managing it | Expensive
Cloud servers | Great (probably)! Super low rate of data loss (probably) and experts are managing it (probably) | Expensive, but varies based on type
Storage guidelines
• Bare minimum:
  • Two complete, identical copies on different types of storage media
  • Ex: one copy on a local desktop computer, one copy on tape backups
Location, location, location
• Best practice:
  • Three complete, identical copies stored on different types of storage media, in different geographic locations.
  • Ex: one copy on a local RAID array in NYC, one copy in cloud server storage based in Utah, one copy on LTO tapes in a vault in Michigan.
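The two storage guidelines can be written down as a small check. This is a sketch under assumptions: the `StoredCopy` type and `storage_problems` function are hypothetical names, and "different types of storage media" is read here as "no two copies share a media type".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredCopy:
    media_type: str   # e.g. "RAID", "cloud", "LTO tape"
    location: str     # e.g. "NYC"


def storage_problems(copies, min_copies=3, distinct_locations=True):
    """Return a list of ways the copies fall short; an empty list means OK."""
    problems = []
    if len(copies) < min_copies:
        problems.append(f"only {len(copies)} of {min_copies} required copies")
    if len({c.media_type for c in copies}) < len(copies):
        problems.append("some copies share a media type")
    if distinct_locations and len({c.location for c in copies}) < len(copies):
        problems.append("some copies share a location")
    return problems


# The best-practice example from the slide.
best_practice = [
    StoredCopy("RAID", "NYC"),
    StoredCopy("cloud", "Utah"),
    StoredCopy("LTO tape", "Michigan"),
]
# A made-up "two drives in the same office" setup.
same_office = [
    StoredCopy("external hard drive", "office"),
    StoredCopy("external hard drive", "office"),
]

print(storage_problems(best_practice))
print(storage_problems(same_office))
```

For the bare-minimum rule, call it with `min_copies=2, distinct_locations=False`.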
Some thoughts on backup
• Continuous backup means file corruption gets duplicated.
• What’s the retention period?
• Know the policies around this. Your IT staff are focused on business continuity, NOT long-term preservation. That’s your job.
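One way to keep a backup from faithfully copying corruption is to compare the source against a checksum recorded earlier before overwriting anything. The sketch below assumes you recorded a SHA-256 checksum at ingest; `guarded_backup` is a hypothetical helper, not a feature of any particular backup product, and the demo files are throwaway temp files.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large disk images do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def guarded_backup(source, backup, expected_sha256):
    """Copy source over backup only if source still matches its recorded checksum."""
    if sha256_of(source) != expected_sha256:
        return False              # source has changed: leave the good backup alone
    shutil.copy2(source, backup)
    return True


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "image.dd"
    backup = Path(tmp) / "image.backup.dd"
    source.write_bytes(b"original bytes")
    recorded = sha256_of(source)                            # recorded at ingest

    first_run = guarded_backup(source, backup, recorded)    # source intact: copies
    source.write_bytes(b"corrupted bytes")                  # simulate silent damage
    second_run = guarded_backup(source, backup, recorded)   # mismatch: refuses
    backup_still_good = backup.read_bytes() == b"original bytes"

print(first_run, second_run, backup_still_good)   # True False True
```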
Replacement
• Replace storage media roughly every five years
• BEFORE failure occurs!
Who’s in charge of this stuff, anyway?
Management
Why manage?
• To ensure that files don’t get corrupted, lost, damaged, or otherwise altered.
• To ensure that files can still be used.
Management components
• Metadata: Do you know what you have?
• File fixity: Do you actually have what you think you have?
• Information security: What’s happening to what you have?
• Preventing obsolescence: Can you use what you have?
File fixity
• Checksums vs cryptographic hash algorithms
• MD5 and SHA-1
• Verification is key!
File fixity guidelines
• Bare minimum:
  • Create checksums of everything on ingest.
  • (See: practical session)
• Best practice:
  • VERIFY checksums periodically, and always when performing these tasks:
    o Ingesting files into a repository.
    o Transferring files from one storage medium to another.
  • Replace damaged files with copies from another location/storage medium.
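The create-then-verify cycle can be sketched in a few lines of Python. Assumptions: the slides mention MD5 and SHA-1, while this sketch swaps in SHA-256 from the standard `hashlib` module; the manifest is a plain dictionary of relative path to checksum; `create_manifest` and `verify_manifest` are hypothetical names; the demo files are throwaway temp files.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large disk images do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def create_manifest(root):
    """On ingest: map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {p.relative_to(root).as_posix(): sha256_of(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def verify_manifest(root, manifest):
    """Later: report every file that is missing or no longer matches."""
    root = Path(root)
    failures = {}
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file():
            failures[name] = "missing"
        elif sha256_of(path) != expected:
            failures[name] = "checksum mismatch"
    return failures


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "image.dd").write_bytes(b"disk image bytes")
    (root / "notes.txt").write_text("artist interview notes")

    manifest = create_manifest(root)                       # bare minimum: on ingest
    clean_check = verify_manifest(root, manifest)          # best practice: verify

    (root / "image.dd").write_bytes(b"disk image bytez")   # simulate bit rot
    (root / "notes.txt").unlink()                          # simulate a lost file
    damaged_check = verify_manifest(root, manifest)

print(clean_check)     # {}
print(damaged_check)   # {'image.dd': 'checksum mismatch', 'notes.txt': 'missing'}
```

Anything that shows up in the failure report is a candidate for replacement from a copy held on another storage medium or in another location, after that copy has been verified the same way.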
Information security for non-IT folks
• Manage access – particularly delete permissions
• Log access and actions
• Audit the logs periodically (or automate)
Information security guidelines
• Bare minimum:
  • Know and document who has access to files.
  • Try to restrict access (particularly write/delete permissions) to as few people as possible, and require multi-person authorization to delete.
• Best practice:
  • Maintain logs of all actions performed on files.
  • Audit those logs regularly to ensure no unintended actions were performed.
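An automated log audit might look roughly like this. Everything here is an assumption for illustration: the `LogEntry` shape, the permissions table, the user names, and the rule that a delete needs at least one approver other than the person deleting. Real repository software will have its own log format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    timestamp: str
    user: str
    action: str               # "read", "write", or "delete"
    path: str
    approved_by: tuple = ()   # other people who signed off on the action


def audit(entries, permissions, people_required_to_delete=2):
    """Return (entry, reason) pairs for every action that should not have happened."""
    findings = []
    for entry in entries:
        if entry.action not in permissions.get(entry.user, set()):
            findings.append((entry, "user lacks permission"))
        elif entry.action == "delete":
            others = set(entry.approved_by) - {entry.user}
            if len(others) + 1 < people_required_to_delete:
                findings.append((entry, "delete without a second approver"))
    return findings


# Made-up users and log lines.
permissions = {"alice": {"read", "write", "delete"}, "bob": {"read"}}
entries = [
    LogEntry("2014-05-01T10:00", "bob", "read", "work-0001/image.dd"),
    LogEntry("2014-05-02T09:30", "bob", "write", "work-0001/image.dd"),
    LogEntry("2014-05-03T16:45", "alice", "delete", "work-0001/old.dd"),
    LogEntry("2014-05-04T11:15", "alice", "delete", "work-0001/dup.dd",
             approved_by=("carol",)),
]

findings = audit(entries, permissions)
for entry, reason in findings:
    print(entry.timestamp, entry.user, entry.action, entry.path, "->", reason)
```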
Preventing obsolescence
• Bare minimum:
  • Document information such as file formats, dependencies, etc. (See: metadata)
• Best practice:
  • Review format and dependency information regularly to identify high-risk objects.
  • Perform migration, emulation tests, etc. as obsolescence issues arise.
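If formats and dependencies are recorded as structured metadata, the regular review can be a simple lookup. In this sketch the inventory layout, the `high_risk_objects` name, and the watch list are all hypothetical; which formats and dependencies count as high-risk is a judgment your institution has to make and keep up to date.

```python
def high_risk_objects(inventory, watch_list):
    """Map object id -> the formats/dependencies that appear on the watch list."""
    flagged = {}
    for item in inventory:
        reasons = [name for name in [item["format"], *item["dependencies"]]
                   if name in watch_list]
        if reasons:
            flagged[item["id"]] = reasons
    return flagged


# Placeholder inventory and watch list, for illustration only.
inventory = [
    {"id": "work-0001", "format": "raw disk image", "dependencies": ["Windows XP"]},
    {"id": "work-0002", "format": "format-B", "dependencies": []},
    {"id": "work-0003", "format": "format-C", "dependencies": ["plugin-X", "Windows XP"]},
]
watch_list = {"Windows XP", "format-C"}

print(high_risk_objects(inventory, watch_list))
```

Objects that come back flagged are the ones to line up for migration or emulation tests.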
Still awake? Questions?
Thank you!
Image credits
• Windows XP (slide 4): 2K Networking, Inc.
• NDSA levels (slide 9): NDSA
• Vault (slide 11): The Mark Consulting
• Panicked cat (slide 13): Run Salt Run
• Hard drive (slide 12): x2element
• Computer (slide 16): Johan Larsson via Compfight cc
• Wire tower (slide 16): tanakawho via Compfight cc
• Hard drive (slide 19): Tech-addict
• File not found (slide 21): Ragha’s Siebel Blog
• No change (slide 23): Return on Focus
• Delete (slide 25): The Ramblin Professor
• Awake cat (slide 28): Desktop Nexus
