Harvesting Using the
Open Archives Initiative Protocol:
What Can Your OAI Stream Tell You?
Sandra McIntyre, MWDL Director
Anna Neatrour, MWDL Digital Metadata Librarian
The basics

WHY OAI?
Open Archives Initiative
http://openarchives.org
“Standards for Web Content
Interoperability”
• Facilitate the efficient dissemination of
content contained in archives/repositories
• Low-barrier framework and standards
Why is a protocol necessary?
OAI Harvester: “I want it.”
OAI Provider: “I have it.”
OAI Harvester: “Give me...”
OAI Provider: “Here is what you requested.”
OAI-PMH
Open Archives Initiative

Protocol for Metadata Harvesting
(OAI-PMH)
http://www.openarchives.org/pmh/
OAI Providers
OAI Harvesters

Mountain West Digital Library
http://mwdl.org

OAIster
http://oaister.worldcat.org
and included in WorldCat

Digital Public Library
of America
http://dp.la/

Institute of Museum & Library Services
Digital Collections and Content
http://imlsdcc.grainger.uiuc.edu
...and thousands more
Harvesting at MWDL

[Diagram: partner repositories feeding into the Mountain West Digital Library]
• Utah State Archives
• Utah State Library
• Univ of Nevada Las Vegas
• Univ of Nevada Reno
• Utah Dvsn Arts & Museums
• Salt Lake Comm. College
• Arizona Memory Project
• Snow College
• Northern Arizona Univ
• Weber State Univ
• Univ of Idaho
• Utah State Univ
• Family Search
• Utah Valley Univ
• LDS Church History
• Southern Utah Univ
• Montana Memory Project
• Stacks (Idaho)
• BYU
• Univ of Utah
• Idaho State Archives
• Boise State Univ.
Why understand OAI?
• Predict what will happen with your
metadata when it is harvested
• Do self-auditing and/or peer auditing of
metadata: See patterns and find errors
Other metadata
harvesting options
• Handing over a hard drive
• Uploading/downloading via file transfer
protocol (FTP)
• Other requests of XML (typically
application programming interfaces,
APIs):
– Web Services
– X-Services
Advantages of OAI
• Update at a distance, anytime
• Specify desired records
– By collection
– By date range of last change to record

• Packets, one at a time
• Works fast
• Repeatable
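
As a rough illustration of "specify desired records", the sketch below builds a selective harvest request. It assumes the standard OAI-PMH `set`, `from`, and `until` arguments, and it borrows the SUU base URL and the `hist_photos` set used elsewhere in these slides. The function name and the dates are made up for the example.

```python
from urllib.parse import urlencode


def selective_harvest_url(base_url, metadata_prefix, set_spec=None,
                          from_date=None, until_date=None):
    """Build a ListRecords URL limited by collection and/or date range."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec        # limit by collection
    if from_date:
        params["from"] = from_date      # records changed on or after this date
    if until_date:
        params["until"] = until_date    # records changed on or before this date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical date range, for illustration only.
url = selective_harvest_url(
    "http://contentdm.li.suu.edu/oai/oai.php",
    "oai_qdc",
    set_spec="hist_photos",
    from_date="2014-01-01",
    until_date="2014-06-30",
)
print(url)
```

Running the same request again later with a newer `from` date is what makes the harvest repeatable without re-fetching everything.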
Queries and responses

THE PROTOCOL
Queries and
Responses
[Diagram: the OAI Harvester sends an OAI query to the OAI Provider; the OAI Provider returns an OAI response.]
Testing an OAI Provider
http://re.cs.uct.ac.za/
Queries: OAI BaseURL

BaseURL = OAI provider root address
(Doesn’t work alone)

Examples:
• http://aura.abdn.ac.uk/dspace-oai/request
• http://absronline.org/journals/index.php/index/oai
• http://cyberleninka.ru/oai
• http://digitalcommons.usu.edu/cgi/oai2.cgi
• http://www.avhumboldt.net/oai/oai.php
Queries: 6 Verbs

Verb = type of request
Initial capitals; no spaces
Examples:
• Identify
• ListMetadataFormats
• ListSets
• ListIdentifiers
• ListRecords
• GetRecord
Queries: Parameters & Values

Parameters & values = details about request
Format: parameter=value
Examples:
• metadataPrefix=oai_dc
• metadataPrefix=qdc
• set=awhof
• identifier=oai:content.lib.utah.edu:etd3/482
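
The three pieces above (BaseURL, verb, and parameter=value pairs) combine into a single query URL. A minimal sketch of that assembly follows; `build_oai_query` is a hypothetical helper name, not part of any OAI library.

```python
from urllib.parse import urlencode


def build_oai_query(base_url, verb, **params):
    """Join a BaseURL, a verb, and optional parameter=value pairs."""
    query = {"verb": verb, **params}
    # safe=":/" leaves OAI identifiers such as oai:host:set/0 readable.
    return f"{base_url}?{urlencode(query, safe=':/')}"


base = "http://contentdm.li.suu.edu/oai/oai.php"
print(build_oai_query(base, "Identify"))
print(build_oai_query(base, "ListRecords", metadataPrefix="oai_qdc", set="hist_photos"))
```

The BaseURL on its own is not a valid request, which is why the helper always adds a `verb`.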
Queries you can use

EXAMPLES
Identify
OAI Harvester: “Who are you?”
http://contentdm.li.suu.edu/oai/oai.php?verb=Identify
OAI Provider: “I am the SUU CONTENTdm Server Repository.”
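
To show roughly what comes back from Identify, here is a sketch that pulls the repository name and administrator e-mail out of a response. The XML is a hand-written sample in the standard OAI-PMH 2.0 namespace, not a capture from the SUU server; the e-mail address and dates are placeholders.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample response; values are placeholders.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2014-01-01T00:00:00Z</responseDate>
  <request verb="Identify">http://contentdm.li.suu.edu/oai/oai.php</request>
  <Identify>
    <repositoryName>SUU CONTENTdm Repository</repositoryName>
    <baseURL>http://contentdm.li.suu.edu/oai/oai.php</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <adminEmail>admin@example.org</adminEmail>
    <earliestDatestamp>2000-01-01</earliestDatestamp>
    <deletedRecord>no</deletedRecord>
    <granularity>YYYY-MM-DD</granularity>
  </Identify>
</OAI-PMH>"""


def parse_identify(xml_text):
    """Return the repository name, base URL, and admin e-mail."""
    root = ET.fromstring(xml_text)
    ident = root.find("oai:Identify", OAI_NS)
    if ident is None:
        raise ValueError("not an Identify response")
    wanted = ("repositoryName", "baseURL", "adminEmail")
    return {name: ident.findtext(f"oai:{name}", namespaces=OAI_NS) for name in wanted}


info = parse_identify(SAMPLE_IDENTIFY)
print(info["repositoryName"], "/", info["adminEmail"])
```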
ListSets
OAI Harvester: “What sets do you have available?”
http://contentdm.li.suu.edu/oai/oai.php?verb=ListSets
OAI Provider: “Here is the list of sets.”
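
A ListSets response can double as a quick audit of which collections are exposed. The sketch below reads set specs from a hand-written sample response and compares them with two lists you would supply yourself. Only `hist_photos` comes from these slides; the other set names are hypothetical.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample; set names are placeholders.
SAMPLE_LISTSETS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>hist_photos</setSpec><setName>Historic Photographs</setName></set>
    <set><setSpec>campus_only</setSpec><setName>Restricted Example</setName></set>
  </ListSets>
</OAI-PMH>"""


def list_sets(xml_text):
    """Map each setSpec to its setName."""
    root = ET.fromstring(xml_text)
    return {
        s.findtext("oai:setSpec", namespaces=OAI_NS):
            s.findtext("oai:setName", namespaces=OAI_NS)
        for s in root.findall("oai:ListSets/oai:set", OAI_NS)
    }


def audit_sets(exposed, should_share, must_not_share):
    """Report sets that are missing from, or wrongly present in, the stream."""
    return {
        "missing": sorted(set(should_share) - set(exposed)),
        "leaked": sorted(set(must_not_share) & set(exposed)),
    }


sets = list_sets(SAMPLE_LISTSETS)
report = audit_sets(sets, ["hist_photos", "yearbooks"], ["campus_only"])
print(report)
```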
ListMetadataFormats
OAI Harvester: “What metadata formats are available?”
http://contentdm.li.suu.edu/oai/oai.php?verb=ListMetadataFormats
OAI Provider: “Here’s the list of metadata formats.”
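
Once the formats are known, a harvester has to pick one. The sketch below reads the prefixes from a trimmed, hand-written sample response and chooses qualified Dublin Core when it is offered. The preference order is an assumption for illustration, built from the prefixes that appear in these slides.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"

# Trimmed sample: real responses also carry schema and namespace elements.
SAMPLE_FORMATS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat><metadataPrefix>oai_dc</metadataPrefix></metadataFormat>
    <metadataFormat><metadataPrefix>oai_qdc</metadataPrefix></metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""


def available_prefixes(xml_text):
    """List every metadataPrefix the provider advertises."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(f"{OAI}metadataPrefix")]


def choose_prefix(prefixes, preferred=("oai_qdc", "qdc", "oai_dc")):
    """Return the first preferred prefix the provider supports, else None."""
    for prefix in preferred:
        if prefix in prefixes:
            return prefix
    return None


prefixes = available_prefixes(SAMPLE_FORMATS)
print(prefixes, "->", choose_prefix(prefixes))
```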
ListRecords
OAI Harvester: “Give me the metadata for all records in qualified Dublin Core.”
http://contentdm.li.suu.edu/oai/oai.php?verb=ListRecords&metadataPrefix=oai_qdc
OAI Provider: “Here are the records.”
ListRecords
• One set only:
http://contentdm.li.suu.edu/oai/oai.php?verb=ListRecords&metadataPrefix=oai_qdc&set=hist_photos
• If there is more than one page of records, use a resumption token to get the additional lists (200 at a time in this example):
http://contentdm.li.suu.edu/oai/oai.php?verb=ListRecords&resumptionToken=hist_photos:200:hist_photos:0000-0000:9999-99-99:oai_qdc
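
The resumption-token step is usually run in a loop. The sketch below is one way to write it, assuming the standard OAI-PMH behaviour that a follow-up request carries only `verb` and `resumptionToken`, and that an empty or absent token marks the last page. `fetch_url` and `harvest_identifiers` are hypothetical names; the `fetch` argument exists so the loop can be tried against canned pages instead of a live server.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def fetch_url(url):
    """Default fetcher: download one page of the OAI response."""
    with urlopen(url) as response:
        return response.read()


def harvest_identifiers(base_url, metadata_prefix, set_spec=None, fetch=fetch_url):
    """Yield record identifiers page by page, following resumption tokens."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    while True:
        root = ET.fromstring(fetch(f"{base_url}?{urlencode(params)}"))
        for header in root.iter(f"{OAI}header"):
            yield header.findtext(f"{OAI}identifier")
        token = root.findtext(f".//{OAI}resumptionToken")
        if not token:  # absent or empty token: no more pages
            break
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token}
```

Against a live provider this would be called as `list(harvest_identifiers("http://contentdm.li.suu.edu/oai/oai.php", "oai_qdc", set_spec="hist_photos"))`, which may take a while for large sets.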
GetRecord
• One record only:
http://contentdm.li.suu.edu/oai/oai.php?verb=GetRecord&metadataPrefix=oai_qdc&identifier=oai:contentdm.li.suu.edu:hist_photos/0
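
A single record is a convenient unit for checking which fields actually reach the stream. The sketch below collects Dublin Core elements from a GetRecord response and reports any that a caller-supplied list expects but the record lacks. The sample uses the simple `oai_dc` wrapper with invented values; the code also recognises `dcterms` elements, on the assumption that a qualified Dublin Core stream would use them. The required list in the example is hypothetical and is not the MWDL application profile.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC_NAMESPACES = {
    "http://purl.org/dc/elements/1.1/": "dc",
    "http://purl.org/dc/terms/": "dcterms",
}

# Hand-written sample record; field values are placeholders.
SAMPLE_GETRECORD = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord>
    <record>
      <header>
        <identifier>oai:contentdm.li.suu.edu:hist_photos/0</identifier>
        <datestamp>2014-01-01</datestamp>
        <setSpec>hist_photos</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example photograph</dc:title>
          <dc:description>First description</dc:description>
          <dc:description>Second description</dc:description>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>"""


def record_fields(xml_text):
    """Group the record's Dublin Core values by field name, e.g. 'dc:title'."""
    root = ET.fromstring(xml_text)
    metadata = root.find(f".//{OAI}metadata")
    fields = defaultdict(list)
    if metadata is None:
        return {}
    for el in metadata.iter():
        if not el.tag.startswith("{"):
            continue  # element without a namespace
        uri, _, local = el.tag[1:].partition("}")
        prefix = DC_NAMESPACES.get(uri)
        if prefix and el.text and el.text.strip():
            fields[f"{prefix}:{local}"].append(el.text.strip())
    return dict(fields)


def missing_fields(fields, required):
    """Return the required field names that the record does not contain."""
    return [name for name in required if name not in fields]


fields = record_fields(SAMPLE_GETRECORD)
print(missing_fields(fields, ["dc:title", "dc:rights"]))
```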
CONTENTdm’s
OAI Provider
• Turning on OAI: Administrative interface in the “Server” tab
• Choosing which collections to share
• Sharing compound object level metadata only

Image from CONTENTdm OAI guide: http://contentdm.org/help6/server-admin/oai.asp
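Record -> OAI
[Screenshots: local record with labels; the same record in OAI]
OAI -> MWDL
[Screenshots: the record in OAI; the same record in MWDL]
MWDL -> DPLA
[Screenshots: the record in MWDL; the same record in DPLA]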
Some Final Things
to Remember
• Check your own OAI stream and see what it looks like!
– Mapped to none – not in OAI stream
– Hidden set to yes – not in OAI stream
– CONTENTdm field properties template and guide available at: http://mwdl.org/getinvolved/getinvolved.php
– Log in to collection admin, click on tab, go to fields to check and edit properties
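
The two rules above (mapped to none, or hidden set to yes, means absent from the OAI stream) can be modelled as a simple filter. This is only a toy model of the rule as stated on the slide, not CONTENTdm code; the class, field labels, and mappings are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldProperty:
    label: str             # local field label
    dc_map: Optional[str]  # Dublin Core mapping; None means "mapped to none"
    hidden: bool           # True means "hide" is set to yes


def in_oai_stream(fields):
    """Keep only fields that are mapped to Dublin Core and not hidden."""
    return [f for f in fields if f.dc_map is not None and not f.hidden]


# Hypothetical field properties for illustration.
properties = [
    FieldProperty("Title", "title", hidden=False),
    FieldProperty("Internal notes", None, hidden=False),
    FieldProperty("Donor contact", "contributor", hidden=True),
    FieldProperty("Classification", "description", hidden=False),
]
print([f.label for f in in_oai_stream(properties)])
```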
Field Mappings in
CONTENTdm

Field Mapping
example from the
Western Soundscape
Archive
Try it yourself!
Resources available at
http://mwdl.org/getinvolved/getinvolved.php
We’re here to help!
• For additional questions about self-auditing your OAI, contact Anna Neatrour:
– anna.neatrour@utah.edu
– 801-587-8883

• Any Questions?

Editor's notes

  1. NEEDS GRAPHIC
  2. See this in slideshow view to see the animations!
  3. Registered at http://openarchives.org
  4. Open Access repositories of scholarly communications materials
  5. SANDRA
  6. NEED GRAPHICS
  7. NEED GRAPHICS
  8. NEEDS GRAPHIC
  9. MOVE UP. Find your base URL. Add to OAI page. Update OAI page with information about base URL for other platforms: Omeka, CONTENTdm, BePress. MWDL Harvesting Log: example to wind up and complete process; what Primo is doing with OAI. Normalization routines are run. Counter examples: mapping that is wrong; may not have set enabled for OAI. Metadata formats associated with OAI: Dublin Core among others. OAI provider may or may not be configured to provide qualified Dublin Core.
  10. awhof = Arizona Women’s Hall of Fame
  11. ANNA
  12. Use Identify to make sure that the OAI provider is set up and working. This is a great query to use if you are uncertain of the OAI provider URL for your digital asset management system and want to test it to be sure.
  13. This is the information that is returned from an Identify query. You will see here we have the repository name and also the contact information for the person who administers the server.
  14. ListSets asks what sets are available for harvest. This is a great thing to check yourself to make sure that all the collections are enabled for harvest that you want, or if you have a digital collection with some sort of restriction like on-campus access only, you can check to make sure that it isn’t available for harvest.
  15. The set spec or alias for each collection is listed. If you have a new collection that you want to be added to MWDL, the set spec is one of the pieces of information I’ll need in order to get the harvesting set up.
  16. What metadata formats are available?
  17. Here you will notice that both simple Dublin Core and qualified Dublin Core are available from the SUU server. MWDL prefers to harvest in qualified Dublin Core if possible.
  18. In real life, if you are playing around with OAI queries in your browser, you might not run this, because it gives you all of the records from the available collections in qualified Dublin Core. That’s a lot of records! This is typical of the type of request that MWDL would make to harvest records, in whatever type of batch the server is set up to share.
  19. Here we can see some records coming in from SUU. I can see the set spec hist_photos and go down and see the first record coming in, including all the descriptive information that is made available for that record.
  20. I like to check the OAI for one set at a time when I’m checking out metadata to make sure that it matches up with the MWDL Dublin Core Application profile. This is something you can do too, if you want to do a quick check to make sure that all of the required fields are showing up correctly. You can also look at one record at a time by using the identifier associated with that record.
  21. You can also look at one record at a time by using the identifier associated with that record.
  22. CONTENTdm’s OAI provider can be accessed from the server tab; then click on harvesting to see the controls. Here, we’d want “enable OAI” set to YES! This is also where I could look up the base URL for my repository if I wasn’t sure what it was. I could change the name of the repository to something more descriptive and include server admin e-mails. I would want to leave “enable compound object pages” set to “no”. If that’s enabled, all the individual pages would be harvested as single objects. MWDL would then end up with thousands of items called “page 1” or “page 2”. By default, if no collections are specified, everything is published. You might run into a situation where you want to expose some collections but not others for harvesting, in which case you would need to add the set spec or alias for each collection that should be harvested.
  23. Here we can take a look at what a record with local field labels looks like vs the same record’s information in OAI. Notice how the local field labels disappear, so the classification information from the Western Soundscape Archive is all mapped to dc:description.
  24. Repeated fields are merged into one in the MWDL. For example, the local record had multiple contributors listed; this information is now in one field. The source record also had separate rights statements for the creator of the sound recording of an animal and the creator of a photo of the animal. These statements are now in one field.
  25. Here we can see the same record with slightly different information displayed in MWDL and DPLA. DPLA has different normalizing routines; for example, the designation of the digital collection associated with the record as Western Soundscape Archive isn’t in the DPLA record, but people can still click through and view that information at the source record.
  26. You can do some self-auditing to make sure that everything in your local collection is displaying in the manner in which you would like it to be harvested. We have a CONTENTdm field properties template and guide that you can use to help make sure everything is set up correctly.
  27. Western Soundscape Archive field properties. See where things have been set to no for “hide” and mapped to Dublin Core. If some of these fields were unmapped or set to hidden, they would not appear as harvestable in the OAI for the collection.
  28. We have an OAI queries page with quick links to try for everything we went over during this presentation. This is also where you can find the CONTENTdm field properties template.
  29. Thanks for participating in the webinar today! If you have any follow-up questions please feel free to contact me!