5. DPLA Archival Description Working
Group
• “explore solutions to support both item-
level and aggregate-level approaches to
digital object description and access.”
• Whitepaper outlining research and
recommendations, to be released in 2016
6. Working Group Members
• Gretchen Gueguen (Chair/DPLA)
• Jodi Allison-Bunnell (Orbis Cascade Alliance)
• Mark Custer (Yale University)
• Bradley Daigle (University of Virginia)
• Jacqueline Dean (University of North Carolina, Chapel Hill)
• Max Eckard (University of Michigan)
• Ben Goldman (Penn State University)
• Leigh Grinstead (LYRASIS)
• Kris Kiesling (University of Minnesota)
• Adrian Turner (California Digital Library)
• Shawn Averkamp (New York Public Library)
• Erin Hawkins (World Digital Library, Library of Congress)
• Sheila McAlister (Digital Library of Georgia)
• Sandra McIntyre (Mountain West Digital Library)
• Anne Van Camp (Smithsonian Institution)
8. Working Group Work Plan
• Research Phase (10/1/15 – 2/29/16)
• Design Phase (3/1/16 – 6/30/16)
• Writing Phase (7/1/16 – 9/15/16)
9. Findings
• “Collection” can mean many things.
• Collection data in DPLA is not consistent.
• Aggregate-level and item-level metadata are
both heavily used.
• Better contextual information (such as
collection descriptions) is a
reasonable way to improve access.
10. Description Level
Recommendations
• Recommend the use of aggregate-level
records when appropriate
• Recognize that minimal records, or
records that are highly redundant, are a
fact of life in archives.
• Collection and other context information in
those records will
reduce confusion
11. Context Recommendations
• Collection information that DPLA should
collect includes
– title
– description
– URL to a longer description
• There are several recommended methods
for gathering this information
12. J. Willis Sayre Collection of
Theatrical Photographs
The Sayre Collection consists of more than
24,000 photographs of theatrical and
vaudeville performers, musicians, and
entertainers who played in Seattle between
about 1900 and 1955 (some of the materials
date back to the 1870s). The collector was J.
Willis Sayre: drama critic, journalist, promoter.
The collection covers the little-studied period
when Seattle had a prominent place in the
development of vaudeville…
Learn more about the collection
See more items from the collection
13. Part of the WBAP-TV Archives Collection.
Collection
14. Search our Collections
Crossroads to Freedom
Digital Collection
David Rumsey Historical
Map Collection
Armed Forces History:
National Numismatic
Collection
Rhodes College. Crossroads to
Freedom
See all 116,001 items…
Learn more about the collection....
David Rumsey
See all 67,462 items…
Learn more about the collection...
National Museum of American History,
Kenneth E. Behring Center
See all 67,462 items…
Learn more about the collection...
15. Draft for Comment
• Deadline for a release of a draft for initial
comment is mid-September.
@dpla
facebook.com/
digitalpubliclibraryofamerica
http://dp.la
Hi, I’m Gretchen, Data Services Coordinator at the Digital Public Library of America. My role is to oversee metadata mapping and quality control. I work with our partner institutions all over the country to help them prepare their data for harvest by DPLA.
As many of you know, the Digital Public Library of America was created in order to increase access to our country’s online cultural heritage resources.
To accomplish this goal, the technological infrastructure developed for DPLA has been centered on the item-centric library model for description: one metadata record for each individual digital “object.”
While this method works very well for items catalogued singly, like books, it doesn’t translate to a lot of real-world descriptive models for cultural heritage materials. Archives and special collections have increasingly focused on aggregate-level description as a way to facilitate access to digital objects online through an archival model, typically the online finding aid. This has been heavily influenced by the overall trend in the profession towards “More Product, Less Process” (MPLP) methodologies.
So a year ago I stood here and promised that DPLA would deal with it.
So DPLA has formed an Archival Description Working Group to explore solutions to these issues.
We started last October and are currently writing a whitepaper.
The working group represents a variety of both DPLA partners and external members.
So our goal for the whitepaper is to make two types of recommendations.
First, recommendations for aggregated archival objects. An aggregate-level record would be a single record for an entire folder of materials, for example, while the item-level approach would create an individual record for every item in a folder, each with the same or remarkably similar description. Here we are asking:
What structure (item- or aggregate-level) is useful in what situations?
How should objects be displayed in context in an aggregation?
Second, recommendations on context for digital objects:
What contextual information is useful for items?
Where and how do we get contextual information for items?
We developed a work plan that began with a pretty solid research phase:
Research Phase (10/1/15 – 2/29/16)
Literature Review
Environmental Scan
Summary/synthesis
That was followed by a design phase where we talked about our actual recommendations (3/1/16 – 6/30/16)
User Scenarios
Metadata Analysis
UI Analysis
We are currently in the writing phase and are developing a whitepaper and some supporting documentation (7/1/16 – 9/15/16)
Through our research we actually had a number of findings, but I’ve boiled down the most relevant ones here:
“Collection” can mean many things. The working group made a decision early on to center our recommendations on the following types of collections: provenance-based archives and special collections, and digital collections and exhibits. Basically, we were cutting out collections created by convenience (all the content held by a particular institution, for example) and focusing on collections that were the result of some kind of collecting activity.
Collection data as it stands in DPLA is not consistently coherent or understandable. If this information is to be surfaced more prominently we need to work on remediation.
Aggregate-level and item-level approaches to metadata are still both very common in archival and library settings and there are benefits and drawbacks to each.
Better contextual information (such as collection descriptions) is a reasonable way to improve access.
As I said, there are others, and our whitepaper will go into some detail about all of the research.
The recommendations we are going to put forward:
Recommend the use of aggregate-level records when appropriate, particularly when it reduces the number of items with highly redundant metadata.
But we recognize that minimal records, or records that are highly redundant, are a fact of life in archives, owing to a number of resource-based and historical factors.
In light of that, we recommend including collection and other context information in those records to reduce confusion, and going forward we recommend moving toward multi-object records in these situations.
Recommend including information about the arrangement of an item within its context (i.e. a container list hierarchy) in a <relation> field of the item record, again when appropriate.
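As a rough illustration of the last recommendation, the sketch below flattens a container list hierarchy into a single string that could be carried in a relation field. The separator, the dictionary keys, and the sample series, box, and folder labels are all assumptions made for the example; the working group has not specified a serialization here.

```python
def container_relation(levels, separator=" > "):
    """Join container-list levels, outermost first, into one relation string."""
    # Drop empty or whitespace-only levels so gaps in the hierarchy
    # do not produce doubled separators.
    cleaned = [level.strip() for level in levels if level and level.strip()]
    return separator.join(cleaned)


# Hypothetical item record; only the "relation" value matters for this sketch.
item = {
    "title": "Letter, 1923",
    "relation": container_relation(["Series 1: Correspondence", "Box 3", "Folder 12"]),
}
print(item["relation"])
```

Keeping the hierarchy in one ordered string keeps the item record flat while still letting a display show where the item sits in the arrangement.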
We have kept the required collection metadata for DPLA very simple. I’m going to show some screenshots in a minute that will illustrate this, but we kept in mind the idea that DPLA is an aggregation of records that point back to their sources. It is not a database of extensive metadata or full text; it’s a pathfinder in a sense.
For that reason we wanted to keep the required metadata simple: just title, description, and a URL to a longer description, like a home page or a finding aid on the web. This also follows the DPLA model for item metadata, where every item record is required to have a link back to the item and record at its original source.
There are multiple ways that this collection data could be gathered. We are recommending the use of a separate feed of collection records and identifiers in item records that point to them rather than including these collection descriptors in every item record. But we will be addressing both of those scenarios in the whitepaper.
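To make the separate-feed approach concrete, here is a minimal sketch in Python. It assumes a collection record holds the three recommended fields plus an identifier, and that an item record carries a list of those identifiers. The class names, the `collection_ids` field, the sample item, and the example.org URL are hypothetical and are not part of any DPLA specification.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionRecord:
    # Title, description, and URL are the recommended fields;
    # the identifier is a hypothetical key used by item records.
    identifier: str
    title: str
    description: str
    url: str


@dataclass
class ItemRecord:
    title: str
    # Hypothetical field: identifiers pointing at collection records.
    collection_ids: list = field(default_factory=list)


def collections_for(item, feed):
    """Return the collection records an item points to, skipping unknown identifiers."""
    return [feed[cid] for cid in item.collection_ids if cid in feed]


# The feed is modeled as a dictionary keyed by collection identifier.
feed = {
    "sayre": CollectionRecord(
        identifier="sayre",
        title="J. Willis Sayre Collection of Theatrical Photographs",
        description="Photographs of theatrical and vaudeville performers who played in Seattle.",
        url="https://example.org/collections/sayre",  # placeholder URL
    )
}
item = ItemRecord(title="Portrait of a vaudeville performer", collection_ids=["sayre", "missing"])
titles = [c.title for c in collections_for(item, feed)]
print(titles)
```

With this shape, a collection description is stored once and can be corrected in one place, instead of being repeated in every item record.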
This is what a current item record looks like in the DPLA portal.
Our recommendation then adds a sidebar with the collection title and description. There are also links to the collection home page and to other items in the collection within DPLA.
If an item belongs to more than one collection, those would just be stacked here.
In the search results, we are recommending that collection name be a facetable metadata field and that collection names appear in the brief results.
Finally, we are recommending a page where collection names and descriptions can be browsed and searched. This is something we had a lot of conversation over. Currently there are more than 24,000 named collections in DPLA. As I noted earlier, looking further into these, it appears that many of them are the result of typos or mapping errors, and that many others reflect an organization of content that doesn’t seem like a useful collection (items organized by genre or institution, for example).
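A collection facet amounts to counting results per collection name. The sketch below shows one way that could be computed, assuming each result is a dictionary with an optional `collections` list; that key and the sample item titles are assumptions for illustration, not the portal's actual data model.

```python
from collections import Counter


def collection_facet(results):
    """Count search results per collection name, most common first."""
    counts = Counter()
    for record in results:
        # An item may belong to more than one collection, or to none.
        for name in record.get("collections", []):
            counts[name] += 1
    return counts.most_common()


results = [
    {"title": "Map of Texas", "collections": ["David Rumsey Historical Map Collection"]},
    {"title": "Oral history interview", "collections": ["Crossroads to Freedom Digital Collection"]},
    {"title": "Map of Ohio", "collections": ["David Rumsey Historical Map Collection"]},
    {"title": "Uncollected item"},
]
facet = collection_facet(results)
print(facet)
```

Items without collection information simply do not contribute to the facet, which fits the opt-in nature of the recommendation.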
In order for this recommendation to work, we are suggesting that the mappings be revised for collection and that institutions review and remediate their records. That effort will take some time. However, the change will essentially be opt-in. Institutions will be able to begin to provide this information for the collections they want to identify and if they don’t want to do that, they don’t have to.
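One possible first step in that kind of review is to group collection names that differ only in case, punctuation, or spacing. The sketch below does that with a simple normalization key. The function names and sample variants are made up for the example, and this approach only catches formatting variants; genuine misspellings would need fuzzy matching and human judgment.

```python
import re
from collections import defaultdict


def normalize(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    name = re.sub(r"[^\w\s]", " ", name.lower())
    return " ".join(name.split())


def likely_duplicates(names):
    """Group names sharing a normalized form; keep groups with more than one variant."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize(name)].add(name)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Illustrative variants of one collection name plus one unrelated name.
names = [
    "WBAP-TV Archives Collection",
    "WBAP-TV  Archives Collection.",
    "wbap-tv archives collection",
    "David Rumsey Historical Map Collection",
]
duplicates = likely_duplicates(names)
print(duplicates)
```

The output is a list of candidate groups for a person to review, not an automatic merge.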
Yes, this will be an extremely long list. We recognize that and are suggesting some ways to search within or refine the list. But we feel, first, that if the collection list is revisited the number could easily be reduced by half or more, and second, that this is just one of many modes of access to these collections, along with item search and retrieval.
We are currently working on the draft of the paper and hope to have something out for comment next month. Please stay tuned to DPLA’s blog and Twitter for details.