101 crash course on the fundamentals of digitizing archival collections from start to finish
---
This introductory presentation covers the basics of digitizing collections from start to finish. The author reveals some secrets along with tips and tricks for making digital projects efficient and sustainable. All libraries have unique collections that deserve more publicity. This crash course is aimed at librarians eager to learn how to create efficient workflows, and it explains in detail every step of digitization: selection, preparation, digitization, object description (metadata), and online publishing.
1. DIGITIZATION REVEALED
101 crash course on the fundamentals of digitizing
archival collections from start to finish
MARINA GEORGIEVA
Nevada Library Association Annual Conference
October 14, 2018
Las Vegas, NV
2. ABOUT ME
Graduated from University of Wisconsin – Milwaukee
MLIS with concentration in Information technologies
Digital libraries
Graphic and Web design
Metadata
Information architecture and UX design
Visiting Digital Collections Librarian at University of
Nevada – Las Vegas
Professional passion
Project management
Digitization
Metadata design and management
Digitizing archival manuscripts on
PhaseOne camera (2018)
More about my work at
www.marina-expertise.com
3. TOPICS
Collection Inventory and Assessment
Funding
Digitization technology
Team
Workflow
Platform
Standards and sustainability
4. COLLECTION INVENTORY AND ASSESSMENT
WHY DIGITIZE?
• For access
• For preservation
• On demand
• For specific occasion/exhibit
ANALYZE THE COLLECTION
• Priority list
• Research value
• Collection size
• Collection level of processing
(Finding Aid)
• Reuse of existing Finding Aid
• Condition of materials
• Format of materials
• Has it been digitized as part of
another project?
ASSESSMENT
• Large scale or boutique
approach?
• Digitize all or selected materials?
• Who does selection?
• In-house capacities
• Outsourcing options
6. INTERNAL FUNDING
• Types of digitization projects
• Reducing archival backlog
• Preservation
• Access (esp. high demand collections)
• Scan on demand (for class or event)
• Staffing
• Existing staff
• Student assistants
• Interns
• Volunteers
GRANT FUNDING
• Types of digitization projects
• Specific project that fits the grant scope
• High priority collection that requires more resources and
is eligible for grant
• High research value collections
• Collections in desperate need of preservation
• Staffing
• Specially hired and trained project staff
• Specially trained student assistants
FUNDING 2|3
1
Funding types
7. FUNDING 3|3
2
Grant-funding opportunities
National Endowment for the Humanities
https://www.neh.gov/grants/listing?keywords=digitization
Library Services and Technology Act
https://nsla.libguides.com/2018LSTA
Institute of Museum and Library Services
https://www.imls.gov/grants/apply-grant/available-grants
National Historical Publications & Records Commission
https://www.archives.gov/nhprc/apply/eligibility.html
9. DIGITIZATION TECHNOLOGY 2|3
1
Types of digitization technologies
Flatbed scanner
Large format scanner
Transparencies scanner
Book scanner
Camera systems
Microfilm scanner
10. DIGITIZATION TECHNOLOGY 3|3
2
Selecting the best fit
CONSIDER THE TECHNOLOGY
• Equipment you already have
• Equipment you need to purchase
• Equipment you need to outsource
CONSIDER THE COLLECTION
• Format of archival materials
• Microfilms
• Transparencies
• Reflective materials
• Oversized materials
• Manuscripts
• Books
CONSIDER THE PROJECT
• Scale
• Scope
• Timeline
• Team
12. DECISION-MAKING
• Team size
• Roles (positions)
• Part-time vs. full-time
HIRING
• Internal staff vs. external hires
• Experienced vs. non-trained
• Professional staff vs. student
assistants
1
Assessment | Decision-making
PROJECT ASSESSMENT
• Grant-funded vs.
internally funded
• Project priority
• Project deadlines
• Project specifics
• Large-scale or small
• Scan on demand
• Target audience
TEAM 2|3
13. ASSIGNING ROLES
• Train narrow specialists for particular
tasks vs. train people universally
• Advantages and disadvantages
• Main roles in digitization:
• Workflow manager
• Staff manager
• Metadata manager
• Metadata creators
• Digitization specialists
TRAINING
• New hires training
• Refreshers
• On-going training
• Training documentation
• Self-training guidelines
2
Roles | Training
TEAM 3|3
15. WORKFLOW 2|10
STATUS CONTROL FILE
• Comprehensive list of all
collections in a project
• Provides vital information
for tracking progress
• Digital ID
• Collection name
• Workflow segments
• Initials and completion
dates
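A minimal sketch of how a Status Control File could be kept as a CSV, one row per collection. The segment names, column names, and helper functions are assumptions made for illustration; the slides do not prescribe a file layout.

```python
import csv
import io
from datetime import date

# Hypothetical workflow segments; adapt to the segments your project tracks.
SEGMENTS = ["prep", "digitization", "image_processing",
            "import", "metadata", "quality_review"]


def new_row(digital_id, collection_name):
    """Create one row per collection with an empty cell for every segment."""
    row = {"digital_id": digital_id, "collection_name": collection_name}
    row.update({segment: "" for segment in SEGMENTS})
    return row


def mark_done(row, segment, initials, when):
    """Record the initials and completion date for one workflow segment."""
    if segment not in SEGMENTS:
        raise ValueError(f"unknown workflow segment: {segment}")
    row[segment] = f"{initials} {when.isoformat()}"


def pending(row):
    """Return the segments that have no initials and date yet."""
    return [segment for segment in SEGMENTS if not row[segment]]


def to_csv(rows):
    """Serialize all rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["digital_id", "collection_name"] + SEGMENTS,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


row = new_row("col001", "Example Collection")
mark_done(row, "prep", "AB", date(2018, 10, 14))
print(pending(row))
print(to_csv([row]))
```

A spreadsheet serves the same purpose; the point is that every collection has one row and every segment has one cell for initials and a date.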
Some definitions
MASTERFILE
• Comprehensive list of all items
in a batch
• Fields replicate the collection
MAP (metadata application profile)
• Used for data import in DAMS
• Records initial descriptive and
technical metadata
• Digital IDs
• Descriptions
• Technology
• Formats
DIGITAL OBJECTS
• Compound objects
• Single objects
• Not necessarily replica of the structure
of the archival folder
ARCHIVAL OBJECTS
• A.k.a. physical objects
• Described in Finding Aids
• Box level
• Folder level
• Item level
17. WORKFLOW 4|10
1
Preparation of physical materials
OBTAINING COLLECTION
• Finding Aid
• Get collection
• Collection inventory
• Does Finding Aid
description correspond to
what’s in boxes?
• Is Finding Aid description
on folder level or on item
level?
PROCESSING COLLECTION
• Condition of materials
• Documenting objects
• Folder vs item level decision
• Grouping for efficient digitization
• Updating documentation
DOCUMENTATION
• Analyze Finding Aid
• Prepare MasterFile
• Object type
• single objects
• compound objects
• textual
• visual
• Object format
• negatives
• reflective materials
• 3D objects
18. WORKFLOW 5|10
DIGITIZATION
• Scan!
• Update documentation
simultaneously
• Keep track of items and
batches
• Transcribe all peculiarities of
the objects
DIGITIZATION PREP
• Technology selection
• Set up scanning sessions
• Group in batches if
materials are
heterogeneous
• Set up documentation
• MasterFile
• Status Control File
• Set up file naming
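One possible way to group heterogeneous materials into batches before scanning, sketched under assumptions: items are dictionaries with hypothetical `digital_id` and `format` keys, and the batch size is whatever a scanning session can handle.

```python
from collections import defaultdict


def group_into_batches(items, batch_size):
    """Group items by format, then split each group into batches
    of at most batch_size items."""
    by_format = defaultdict(list)
    for item in items:
        by_format[item["format"]].append(item["digital_id"])

    batches = []
    for fmt in sorted(by_format):
        ids = by_format[fmt]
        for start in range(0, len(ids), batch_size):
            batches.append((fmt, ids[start:start + batch_size]))
    return batches


items = [
    {"digital_id": "a1", "format": "negative"},
    {"digital_id": "a2", "format": "manuscript"},
    {"digital_id": "a3", "format": "negative"},
]
for fmt, ids in group_into_batches(items, batch_size=2):
    print(fmt, ids)
```

Keeping one format per batch means the equipment and settings stay the same for a whole scanning session.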
Digitization of archival materials
2
19. WORKFLOW 6|10
FILE NAMING
• Digital IDs are critical for object
retrieval and file preservation
• Double-check file names
• Follow the collection naming
conventions
• Organize compound objects
• Update documentation
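Double-checking file names can be automated. The sketch below assumes a hypothetical naming convention (three-letter collection code, six-digit object number, optional three-digit page suffix for compound objects, `.tif` extension); substitute your collection's actual convention.

```python
import re

# Hypothetical convention, e.g. "man000123.tif" or "man000123_002.tif".
NAME_PATTERN = re.compile(
    r"^(?P<collection>[a-z]{3})(?P<object>\d{6})(?:_(?P<page>\d{3}))?\.tif$"
)


def check_names(filenames):
    """Return (objects, invalid): valid files grouped by digital ID,
    and the names that break the convention."""
    objects = {}
    invalid = []
    for name in filenames:
        match = NAME_PATTERN.match(name)
        if match is None:
            invalid.append(name)
            continue
        digital_id = match["collection"] + match["object"]
        objects.setdefault(digital_id, []).append(name)
    # Sort pages so compound objects keep their page order.
    for pages in objects.values():
        pages.sort()
    return objects, invalid


objects, invalid = check_names(
    ["man000123_002.tif", "man000123_001.tif", "Man 125.tif"]
)
print(objects)
print(invalid)
```

Grouping by digital ID also organizes compound objects: every page of an object ends up under the same key, in page order.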
ACTIONS
• Straightening
• Cropping
• Color correction
• OCR (textual objects)
The sequence of actions and the software
used depend on the technology
used for scanning
3
Image processing
EXPORTING
• Create a collection
destination folder as a
temporary home for
archival images
• Export TIFFs and JPGs
• Organize compound
objects
20. WORKFLOW 7|10
IMPORT
• Compound objects w/
parent level metadata
• Compound objects w/
children level metadata
• Single objects
• OCR’ed objects w/ .txt files
• Troubleshoot if necessary
DOCUMENTATION
• Assign batch number
• Update MasterFile
• Update Status Control File
• Generate Tab-delimited .txt
file for the import
• Get the path to the archival
images
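A sketch of generating the tab-delimited import file from MasterFile rows. The column names, the path layout, and the function name are assumptions for illustration; the exact columns a DAMS expects vary by platform.

```python
import csv
import io

# Hypothetical MasterFile columns; a real list would mirror the collection MAP.
FIELDS = ["digital_id", "title", "format", "file_path"]


def build_import_file(items, batch_number, image_root):
    """Return tab-delimited text for one batch, one line per item,
    with the path to each archival image filled in."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["batch"] + FIELDS,
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    for item in items:
        row = {"batch": batch_number, **item}
        row["file_path"] = f"{image_root}/{item['digital_id']}.tif"
        writer.writerow(row)
    return buffer.getvalue()


text = build_import_file(
    [{"digital_id": "man000123", "title": "Letter", "format": "manuscript"}],
    batch_number=7,
    image_root="/archive/man",
)
print(text)
```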
ASSIGN
• Check the transcripts (for
OCR’ed objects!)
• Communicate the batch
number to the metadata
creators
4
Batch import
21. WORKFLOW 8|10
5
Metadata (object description)
METADATA ROLES
• Administrator
• Managers
• Creators
OVERVIEW
• Object description
• Promotes consistency
across collections
• Enhances easy object
retrieval
• Supports faceting
• Supports subject searching
| browsing
Most important digitization segment from user perspective!
METADATA GRANULARITY
• Rich vs. basic metadata
• Collection size – granularity
differs across collections
• Large-scale projects
• Boutique collections
• Digital exhibits
• Material format – some collections
need less granularity
• Newspaper collections
22. WORKFLOW 9|10
INDEXING GUIDELINES
• Cheat sheet for
metadata creators
• Clarifies MAP
• Disambiguates
• Rules and examples
METADATA
APPLICATION PROFILE
• Fields
• Encoding scheme
• Occurrence
• Obligation
• Collection-specific vs.
shared
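To make the MAP components concrete, here is a small sketch that models a profile as data and checks a record against it. The three fields and their rules are invented for illustration; a real profile defines its own fields, encoding schemes, occurrence, and obligation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldRule:
    name: str
    required: bool                          # obligation
    repeatable: bool                        # occurrence
    vocabulary: Optional[frozenset] = None  # encoding scheme; None = free text


# Hypothetical profile with three fields.
PROFILE = [
    FieldRule("title", required=True, repeatable=False),
    FieldRule("subject", required=False, repeatable=True),
    FieldRule("type", required=True, repeatable=False,
              vocabulary=frozenset({"text", "image"})),
]


def validate(record, profile=PROFILE):
    """Check a record (field name -> list of values) against the profile
    and return a list of problems; an empty list means the record passes."""
    problems = []
    known = {rule.name for rule in profile}
    for name in record:
        if name not in known:
            problems.append(f"{name}: not in profile")
    for rule in profile:
        values = record.get(rule.name, [])
        if rule.required and not values:
            problems.append(f"{rule.name}: required")
        if not rule.repeatable and len(values) > 1:
            problems.append(f"{rule.name}: not repeatable")
        if rule.vocabulary is not None:
            for value in values:
                if value not in rule.vocabulary:
                    problems.append(f"{rule.name}: '{value}' not in vocabulary")
    return problems


print(validate({"title": ["Letter"], "type": ["text"]}))
print(validate({"title": ["A", "B"], "type": ["map"]}))
```

A check like this can run before batch import so that problems reach metadata creators early.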
CONTROLLED VOCABULARIES
• Local CVs
• Authority files
• AAT | Art and Architecture Thesaurus
• FAST | Faceted Application of Subject Terminology
• LCSH | Library of Congress Subject Headings
• LCNAF | Library of Congress Name Authority File
• MeSH | Medical Subject Headings
• TGN | Getty Thesaurus of Geographic Names
• ULAN | Union List of Artist Names
• Collection specific vs shared
5
Metadata (object description)
Most important digitization segment from user perspective!
23. WORKFLOW 10|10
FEEDBACK AND REVISIONS
• Communicate problems to
metadata creators
• Revise and correct
• Keep list of typical mistakes for
training purposes
ACTIONS
• Image quality
• Metadata quality
• Transcripts
• OCR quality
6
Quality review
GOAL
High quality metadata
= consistent collections
= linked data readiness
= increased usability of digital collections
= happy researchers!
24. PLATFORM
BACK-END FEATURES
• Easy upload of data sets
• Easy export of data sets
• Easy to use by staff
• Intuitive administrator interface
• Supports multiple file formats
• Supports shared vocabularies
across collections
• Cloud-based vs. server-based
The DAMS interface is your presence on the web! Keep it neat to provide an outstanding user experience!
FRONT-END FEATURES
• Easy navigation
• User-friendly
• Customizable
• Responsive design
• Supports multiple collections
• Supports searching and
browsing
• Supports advanced search
• Supports faceting
POPULAR DAMS
25. STANDARDS AND SUSTAINABILITY
SUSTAINABILITY
• Iterative approach
• Reuse, adapt, improve existing workflows
• Cross-collection controlled vocabularies
• Adopt best practices
• Have internal procedures and guidelines in
place
STANDARDS
• Dublin Core Metadata Element Set (DC)
• Metadata Object Description Schema (MODS)
• Metadata Encoding and Transmission Standard
(METS)
• eXtensible Markup Language (XML)
• Encoded Archival Description (EAD)
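As an illustration of how two of these standards fit together, the sketch below serializes a simple Dublin Core description as XML using Python's standard library. The `record` wrapper element and the sample values are assumptions; the namespace URI and the element names (`title`, `creator`, `date`) come from the Dublin Core Metadata Element Set.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a <record> element (hypothetical wrapper) containing one
    dc:* child per value. `fields` maps element name -> list of values."""
    record = ET.Element("record")
    for element, values in fields.items():
        for value in values:
            child = ET.SubElement(record, f"{{{DC_NS}}}{element}")
            child.text = value
    return record


record = dublin_core_record({
    "title": ["Letter to the editor"],
    "creator": ["Unknown"],
    "date": ["1925"],
})
print(ET.tostring(record, encoding="unicode"))
```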
26. THANK YOU!
QUESTIONS? Visiting Digital Collections Librarian
University of Nevada - Las Vegas
University Libraries
4505 S Maryland Parkway
Box 457041
Las Vegas, NV 89154
Tel: 702-895-2310
marina.georgieva@unlv.edu
www.marina-expertise.com
Editor's notes
Today's agenda gives the big picture of what a digitization project looks like.
Format of archival materials – one type or a mix?
Project scale – is it large-scale or small-scale? Do speed and efficiency matter?
Project timeline – how tight are the final deadlines? Are they flexible?
Project team – do you have staff trained to use the piece of equipment you are planning to use? Can you provide training?
AAT: Art and Architecture Thesaurus
FAST: Faceted Application of Subject Terminology
LCSH: Library of Congress Subject Headings
MeSH: Medical Subject Headings
TGN: Getty Thesaurus of Geographic Names
ULAN: Union List of Artist Names