This document discusses new strategies for managing manuscript collections using a flexible data model and digital infrastructure. It proposes moving away from EAD-centric approaches and prioritizing discovery needs. The proposed model integrates finding aids, catalog records, digitized materials and other sources into a single Fedora repository with linked data. This allows describing collections at different levels and from different sources. The system has been implemented at the University of Virginia and supports ingest, discovery and ongoing management of archival materials.
17. What are we doing?
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="http://ead.lib.virginia.edu/vivaead/published/document.xsl" type="text/xsl"?>
<ead xmlns="urn:isbn:1-931666-22-9" id="viu01215">
<eadheader audience="internal" langencoding="iso639-2b" findaidstatus="edited-full-draft" scriptencoding="iso15924" dateencoding="iso8601"
countryencoding="iso3166-1" repositoryencoding="iso15511">
<eadid publicid="PUBLIC &#34;-//University of Virginia::Library::Special Collections Dept.//TEXT (US::ViU::viu01215::A Guide to the Papers
of John Dos Passos 1865-1998)//EN &#34;viu01215.xml&#34;" countrycode="US" mainagencycode="US-ViU">PUBLIC
"-//University of Virginia::Library::Special Collections
Dept.//TEXT (US::ViU::viu01215::A Guide to the Papers of
John Dos Passos 1865-1998)//EN "viu01215.xml"</eadid>
<filedesc>
<titlestmt>
<titleproper>A Guide to the Papers of John Dos Passos
<date era="ce" calendar="gregorian">1865-1998</date></titleproper>
<subtitle id="sort">Dos Passos, John, Papers
<num type="collectionnumber">5950</num></subtitle>
<author>Special Collections Staff</author>
</titlestmt>
<publicationstmt>
<publisher>Special Collections, University of Virginia Library
</publisher>
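As a rough illustration of what sits inside a header like the one above, the following Python sketch pulls the title, collection number and author out of an EAD 2002 header. The sample string is an abbreviated, well-formed stand-in for the slide's excerpt, and the function name `summarize_header` is hypothetical; nothing here is taken from the project's own tooling.

```python
import xml.etree.ElementTree as ET

# Namespace used by the <ead> element on the slide.
EAD_NS = {"ead": "urn:isbn:1-931666-22-9"}

# Abbreviated, well-formed stand-in for the header excerpt (assumption:
# the real file closes these elements in the usual way).
SAMPLE = """<ead xmlns="urn:isbn:1-931666-22-9" id="viu01215">
  <eadheader>
    <filedesc>
      <titlestmt>
        <titleproper>A Guide to the Papers of John Dos Passos
          <date era="ce" calendar="gregorian">1865-1998</date></titleproper>
        <subtitle id="sort">Dos Passos, John, Papers
          <num type="collectionnumber">5950</num></subtitle>
        <author>Special Collections Staff</author>
      </titlestmt>
    </filedesc>
  </eadheader>
</ead>"""


def summarize_header(xml_text):
    """Return a small dict of discovery-relevant fields from an EAD header."""
    root = ET.fromstring(xml_text)
    titlestmt = root.find("ead:eadheader/ead:filedesc/ead:titlestmt", EAD_NS)
    title_el = titlestmt.find("ead:titleproper", EAD_NS)
    # itertext() also visits the nested <date>; split/join collapses whitespace.
    title = " ".join("".join(title_el.itertext()).split())
    return {
        "id": root.get("id"),
        "title": title,
        "collection_number": titlestmt.findtext("ead:subtitle/ead:num", namespaces=EAD_NS),
        "author": titlestmt.findtext("ead:author", namespaces=EAD_NS),
    }


summary = summarize_header(SAMPLE)
```

Only a handful of the header's fields matter for discovery, which is part of the case for thinking in terms of finding aids rather than EAD files.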
18. How is this different?
Think about finding aids, not EADs
Let the archivists focus on what they do best
IP landscapes are flexible
Human and machine actionable
Avoid the digitization / description dilemma
No need to worry about variable levels of description
Optimized for digital surrogates and born-digital content
20. Challenges with integrating search results
Relational database could get quite large
Complex data storage model
Migration of legacy content
21. Looking ahead
Granular circulation
Risk Management
Create virtual collections
Alternate metadata options for description
Archival prioritization of search results
22. Data Model Constraints
Unknowns
Metadata
Format
Publication-ready?
Unique ids?
Workflows to support
Only “complete” collections ingested?
Editing after ingestion?
Does editing happen in the repository or outside it?
23. Data Model Goals
Allow for multiple hierarchies to describe the same resources
Allow for metadata in various formats
Support ingest of “finished” EADs but anticipate future edits, replacements and reorganizations
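One way to picture the first goal is a graph in which each resource is stored once and every descriptive source contributes its own set of parent links. The Python sketch below is a minimal illustration under that assumption; the class `DescriptionGraph`, the hierarchy names and the identifiers are all hypothetical and do not reflect the actual Fedora object model.

```python
from collections import defaultdict


class DescriptionGraph:
    """Resources stored once; each named hierarchy supplies its own parent links."""

    def __init__(self):
        self.labels = {}
        # hierarchy name -> {child id: parent id}
        self.parents = defaultdict(dict)

    def add(self, rid, label):
        self.labels[rid] = label

    def place(self, hierarchy, child, parent):
        self.parents[hierarchy][child] = parent

    def path(self, hierarchy, rid):
        """Labels from the top of the given hierarchy down to the resource."""
        chain = [rid]
        links = self.parents[hierarchy]
        while chain[-1] in links:
            chain.append(links[chain[-1]])
        return [self.labels[r] for r in reversed(chain)]


graph = DescriptionGraph()
for rid, label in [("coll", "Collection"), ("comp1", "Component"),
                   ("box1", "Container"), ("item1", "Item")]:
    graph.add(rid, label)

# The same item is reachable through two independent hierarchies.
graph.place("finding_aid", "comp1", "coll")
graph.place("finding_aid", "item1", "comp1")
graph.place("catalog", "box1", "coll")
graph.place("catalog", "item1", "box1")

finding_aid_path = graph.path("finding_aid", "item1")
catalog_path = graph.path("catalog", "item1")
```

Because the hierarchies are kept apart, replacing or reorganizing one of them (for example, re-ingesting an edited finding aid) leaves the resources and the other hierarchy untouched.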
24. Data Model
[Diagram: a Collection, from the finding aid, containing three Components, each of which contains Items.]
25. Data Model
[Diagram: a Collection, from the finding aid, containing Components that contain Items.]
26. Data Model
[Diagram: Collection, Components and Items from the finding aid, joined by MARC records and Containers from the catalog.]
27. Data Model
[Diagram: Collection, Components and Items from the finding aid, joined by MARC records and Containers from the catalog.]
28. Data Model
[Diagram: Collection, Components and Items from the finding aid, joined by MARC records and Containers from the catalog.]
29. Data Model
[Diagram: Collection, Components and Items from the finding aid, with MARC records and Containers from the catalog. Digitized Items and their Digitized Files, from the digitization and patron request workflow, are attached to an Item.]
30. Data Model
[Diagram: two Collections. The first combines Components and Items from the finding aid, a MARC record and Containers from the catalog, and Digitized Items with Digitized Files from the digitization and patron request workflow. The second Collection, from a finding aid or other collection description source, contains Items with their own Digitized Items and Digitized Files.]
31. Fedora Metadata Philosophy
“Catalog in the format that is most suited to your materials but disseminate in the format that’s most suited to your use”
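The idea can be sketched as a registry of conversion functions: each record stays in the format it was cataloged in, and a view is produced on request. The Python below is an illustration only. The registry, the decorator, the "brief" view and the flattened dict records are all hypothetical, and real Fedora disseminators are configured quite differently.

```python
# Hypothetical registry: (stored format, requested view) -> conversion function.
DISSEMINATORS = {}


def disseminator(stored_format, view):
    """Decorator that registers a conversion for one stored format and view."""
    def register(func):
        DISSEMINATORS[(stored_format, view)] = func
        return func
    return register


@disseminator("ead", "brief")
def ead_brief(record):
    # Assumes the EAD component was flattened to a dict beforehand.
    return f'{record["unittitle"]}, {record["unitdate"]}'


@disseminator("marc", "brief")
def marc_brief(record):
    # 245$a is the MARC title; 260$c is the date of publication.
    return f'{record["245a"]}, {record["260c"]}'


def disseminate(stored_format, record, view):
    """Render a record in the requested view without changing how it is stored."""
    try:
        convert = DISSEMINATORS[(stored_format, view)]
    except KeyError:
        raise ValueError(f"no {view!r} view for {stored_format!r} records") from None
    return convert(record)


from_ead = disseminate("ead", {"unittitle": "Letters", "unitdate": "1920-1930"}, "brief")
from_marc = disseminate("marc", {"245a": "Letters", "260c": "1925"}, "brief")
```

Adding a new output format then means registering one more function per stored format, with no change to the cataloged records themselves.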
32. Metadata Model
[Diagram: the data model from slide 30 with metadata format labels attached to its objects; labels recoverable from the slide text include MARC XML and EAD.]
34. Indexing
Philosophy
Based around discovery and presentation needs
Technical Implementation
XSLT-based Fedora Disseminator
Pulls data from the entire RDF graph to build index records
Reindexing would be triggered by editing or submission workflow
Solr Index serves to cache collection structure and metadata
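To make the indexing step concrete, here is a small Python sketch of the same idea: walk a resource's relationships in the graph and flatten the result into one index record that carries its place in the collection. The slide describes an XSLT-based disseminator, so this is a stand-in in a different language, and the predicate names (`title`, `isPartOf`) and field names (`ancestor_ids`, `ancestor_titles`) are assumptions made for the example.

```python
def build_index_record(resource_id, triples):
    """Flatten one resource and its ancestors into a Solr-style document (a dict)."""
    titles = {s: o for s, p, o in triples if p == "title"}
    parent_of = {s: o for s, p, o in triples if p == "isPartOf"}

    ancestors = []
    current = resource_id
    seen = {resource_id}
    while current in parent_of:
        current = parent_of[current]
        if current in seen:  # guard against cyclic relationships
            break
        seen.add(current)
        ancestors.append(current)
    ancestors.reverse()  # outermost (the collection) first

    return {
        "id": resource_id,
        "title": titles.get(resource_id, ""),
        "ancestor_ids": ancestors,
        "ancestor_titles": [titles.get(a, "") for a in ancestors],
    }


# Hypothetical graph: an item inside a component inside a collection.
triples = [
    ("coll", "title", "Papers of John Dos Passos"),
    ("comp", "title", "Correspondence"),
    ("comp", "isPartOf", "coll"),
    ("item", "title", "Letter"),
    ("item", "isPartOf", "comp"),
]

record = build_index_record("item", triples)
```

Storing the ancestor chain on every record is what lets the index act as a cache of collection structure: the interface can show an item in context without going back to the repository, at the cost of reindexing descendants whenever a parent is edited.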
35. Discovery Interface
VIRGO
Blacklight
Ruby on Rails
Solr
Custom integrations
Fedora
ILS (Sirsi)
PRIMO
36. Development Process
Centered on User Experience
Started with wireframes
Included major stakeholders from the beginning
Balanced competing needs
Archivist
Asserts the importance of context, the collection and archival descriptive practice.
Researcher
Wants to be able to find all relevant materials across traditional silos.
Web surfer
Cares less about where something came from and more about being
46. Development Status
“Complete”
Data model
EAD-to-Fedora processing/ingest
UI enhancements to VIRGO (Blacklight)
Short term goals
Include large volume of finding aids
Implement robust policy support
Refine the user interface as needed
Longer term goals
Place robust archival description tools on top of the Fedora Data Model