Patricia Feeney
Metadata Quality Coordinator
Managing your
metadata quality
Agenda
I. Metadata quality audit
II. DOI registration
III. Conflicts overhaul (discussion)
IV. Metadata Quality tools
Best query ever -> bad metadata = no match
Mediocre query -> bad metadata = no match
Horrible query -> bad metadata = no match
Best query ever -> good metadata = match ✓+
Mediocre query -> good metadata = match (probably) ✓
Horrible query -> good metadata = match (maybe) ✓-
Metadata Quality Audit: Overview
Accurate and complete metadata is vital to querying and citation
linking.
If the metadata for a DOI is incorrect, incomplete, or messy, a
match can't be made, regardless of the quality of a query.
Current efforts include:
 Reports:
   resolution report (emailed monthly)
   depositor report (on website)
   crawler (on website)
   field report (on website)
   conflict report (on website, emailed monthly)
   schematron reports (emailed weekly)
   failed query report (on website)
   DOI error reports (emailed daily)
 Contact members individually (as issues arise)
 Documentation and communication
Metadata Quality Audit
A Metadata Quality Audit will:
 provide publishers with detailed feedback on the quality of their metadata by identifying problem areas
 identify members who need attention
 provide motivation and support to members with metadata issues
The intent of the audit is to provide information, but there may be consequences for extreme abusers.
Audit Scope
I. DOI resolution
II. Conflicts
III. Overall metadata quality
IV. Metadata maintenance
[Illustration: "Hello, I’d like to audit you." "Great, let’s get started! Hooray!"]
I. DOI Resolution
Level I: DOIs that have been distributed but not deposited and resolve to the Handle error page. *
Level II: DOIs resolving to an error page *
Level III: DOIs with response page blocked by access control
Level IV: DOIs that resolve to an inadequate response page.
* actionable transgressions
II. Conflicts
Conflicts occur when two (or more) DOIs are deposited with identical metadata.
Level I: conflicts created between members *
Level II: conflicts within a publisher prefix(es) *
Level III: conflicts created due to insufficient metadata +
Level IV: conflicts created due to item/content type +
* actionable transgressions
+ this may change, more later
Quality of deposited metadata
I. Missing metadata: is all available metadata deposited?
II. Accuracy: is metadata correct?
III. Unusual metadata: does metadata fit into the correct content type?
IV. Overall quality: is metadata messy?
Maintenance
I. Gaps in coverage - this usually indicates undeposited DOIs (very very bad)
II. Currency of deposits - are deposits made ahead of DOIs being distributed?
III. Title maintenance - less of a problem with recent title restrictions, but we still have problems, such as title abbreviations
IV. Reference linking compliance
Actionable Areas
DOI Resolution:
Level I (undeposited DOIs)
Level II (DOIs resolving to error page)
 If action is not taken within a reasonable time period (TBD), DOIs will be registered on behalf of the member (eventually for a fee)
 Continual distribution of unregistered DOIs may affect membership
Conflicts:
Level I (conflicts created between members)
Level II (conflicts within a publisher prefix)
 A $2 per DOI conflict penalty fee may be imposed for conflicts of this type if they are not resolved within a reasonable time period (TBD).
Metadata Maintenance:
Outbound linking compliance
 Members found not to be linking during the audit will be subject to non-linking penalties
Audit Process
1. Notification: auditees will be informed of pending audit; data collection begins (1-2 weeks for most members)
2. Data delivery: audit document will be emailed to member for review 2 weeks prior to audit (longer if necessary); audit scheduled
3. Audit: phone conference, follow-up scheduled (if necessary)
4. Response: member/CR reconvene to discuss progress on audit findings
5. Follow-up (if necessary)
Questions?
II. DOI Registration Pilot
DOIs should without exception be registered before they are released to the public.
Most DOIs resolve, but the ones that don’t are a big problem.
Solution: we’re going to register them*
*(ideal solution: publisher registers them)
DOI selection: At the moment, we will register DOIs reported by end users, using the DOI error report as a source.
DOI error report:
Implemented mid-2008
~4,000 DOI errors reported monthly
> 1,400 fixed monthly through publisher deposits
Some of the unfixed DOIs are not ‘real’ DOIs, but many are.
We will register DOIs that meet the following criteria:
 Have been distributed publicly by the publisher/prefix owner
 Have an identifiable response page
 Have been reported to the publisher’s technical and business contacts
DOI Registration Process
1. DOI reported: a user reports an unresolving DOI
using the DOI error form
2. Technical contact notified (DOI error report email)
3. CrossRef review: CR staff reviews reported DOIs and
expires DOIs that do not meet our registration criteria
4. Business contact notified: 2 weeks from the initial
report, business contact is notified of remaining valid
unregistered DOIs.
5. CR deposit: after 2 weeks have passed from business
contact notification, CrossRef will register any
undeposited DOIs.
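The timing in steps 4 and 5 can be sketched as simple date arithmetic. This is only an illustration, not CrossRef's actual tooling: the function name `registration_timeline` and the dictionary keys are hypothetical, and it assumes "2 weeks" means exactly 14 calendar days.

```python
from datetime import date, timedelta

TWO_WEEKS = timedelta(days=14)  # assumption: "2 weeks" = 14 calendar days


def registration_timeline(reported_on: date) -> dict:
    """Return the earliest date for each timed step after a DOI error report."""
    # Step 4: business contact is notified 2 weeks after the initial report.
    business_contact_notified = reported_on + TWO_WEEKS
    # Step 5: CrossRef deposits 2 weeks after the business contact notification.
    crossref_deposit = business_contact_notified + TWO_WEEKS
    return {
        "reported": reported_on,
        "business_contact_notified": business_contact_notified,
        "crossref_deposit": crossref_deposit,
    }


timeline = registration_timeline(date(2010, 11, 1))
for step, day in timeline.items():
    print(f"{step}: {day.isoformat()}")
```

Under these assumptions a publisher has roughly four weeks from the first report to register the DOI itself before CrossRef steps in.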
Questions?
Conflicts overhaul
Conflicts occur when two (or more) DOIs share the same metadata, suggesting two DOIs are assigned to a single item.
Why are conflicts bad?
 Only one DOI should be assigned per item
 Queries will return multiple DOIs, causing confusion
 Some queries (OpenURL) may not return a DOI if multiple results are present
 Conflicts between two DOIs often result in one of the DOIs being neglected***
We currently have ~200,000+ conflicts in our system. Not all of them are a problem:
 For some items, our schema only allows minimal metadata
 Some content types require matching metadata (for example, standards, and book chapters with minimal metadata such as dictionaries)
Legitimate conflicts
Conflict between 2 prefixes:
http://dx.doi.org/10.1639/0044-7447(2001)030[0037:IOPOFU]2.0.CO;2
http://dx.doi.org/10.1579/0044-7447-30.1.37
Sample query
Journal Title  Year  Vol  Issue  Page  Author  Article Title
AMBIO          2001  30   1      37    Köhlin  Impact of Plantations on Forest Use a...
Conflict within 1 prefix:
http://dx.doi.org/10.3724/SP.J.1006.2008.00070
http://dx.doi.org/10.3724/SP.J.1006.2008.00770
Journal Title           Year  Vol  Iss  Page  Author  Article Title
ACTA AGRONOMICA SINICA  2008  34   5    770   Zhang   Differential Gene Expression in Upper…
‘Bad’ conflicts
Conflicts with minimal metadata:
10.1002/ijc.11095
10.1002/ijc.11093
Journal Title                    Year  Vol  Issue  Page  Author  Article Title
International Journal of Cancer  2003  104  6      798           Errata
Conflict due to content type:
10.1520/C0506-10
10.1520/C0506-10A
10.1520/C0506-10B
Book Title                                Year  Edition  Page  Author         Title
Specification for Reinforced Concrete...  2010  2010           C13 Committee
Elements considered during conflict generation:
 Content type
 Journal, book and/or series title
 Article title / content_item title (book chapters)
 Publication year
 Volume
 Issue
 First page
 Author
 Edition
If there is a match between all
deposited elements, a conflict is
generated.
2 Items with matching journal
title, volume, issue, and article
title will cause a conflict.
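The matching rule above can be sketched as grouping DOIs by a key built from the listed elements. This is an illustration only, not CrossRef's conflict generator: the field names, the `conflict_key` and `find_conflicts` helpers, and the example DOIs are hypothetical, and it assumes that an element deposited for one DOI but missing from another counts as a difference.

```python
from collections import defaultdict

# Elements listed above as considered during conflict generation
# (the field names themselves are hypothetical).
CONFLICT_FIELDS = (
    "content_type", "container_title", "item_title", "year",
    "volume", "issue", "first_page", "author", "edition",
)


def conflict_key(record: dict) -> tuple:
    """Build a comparison key; elements that were not deposited become None."""
    return tuple(
        (record.get(field) or "").strip().lower() or None
        for field in CONFLICT_FIELDS
    )


def find_conflicts(records: dict) -> list:
    """Group DOIs whose keys are identical; groups of two or more conflict."""
    groups = defaultdict(list)
    for doi, record in records.items():
        groups[conflict_key(record)].append(doi)
    return [sorted(dois) for dois in groups.values() if len(dois) > 1]


minimal = {
    "content_type": "journal_article",
    "container_title": "Journal of Examples",
    "volume": "12",
    "issue": "3",
    "item_title": "Errata",
}
records = {
    "10.5555/example-1": dict(minimal),
    "10.5555/example-2": dict(minimal),
    "10.5555/example-3": dict(minimal, first_page="798"),
}
print(find_conflicts(records))
# [['10.5555/example-1', '10.5555/example-2']]
```

In this sketch the third record escapes the conflict only because it carries a first page, which illustrates why depositing fuller metadata reduces conflicts.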
Ideas?
What should our minimum set of metadata be?
How should conflicts be monitored/reported?
Managing your
metadata quality
Sample #1: incorrect metadata
Q: My link resolver is retrieving the wrong metadata for DOI 10.1002/rra.1288, causing our links to break - here is my query*:
http://www.crossref.org/openurl?pid=pfeeney@crossref.org&aulast=Null&title=River Research and Applications&volume=26&issue=6&page=663&year=2010
*query metadata matches the response page metadata
A: Two problems with deposited metadata (DOI query):
#1 <year media_type="print">2009</year>
#2 <pages>
<first_page>n/a</first_page>
<last_page>n/a</last_page>
</pages>
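The diagnosis above amounts to comparing the query values with the deposited values field by field. The sketch below builds the same OpenURL query with percent-encoding and lists the disagreements. The endpoint and parameter names are taken from the sample query; the `pid` value, the helper names, and the mapping of the query's `page` to the deposited `first_page` are assumptions for illustration.

```python
from urllib.parse import urlencode

OPENURL_ENDPOINT = "http://www.crossref.org/openurl"  # from the sample query

query = {
    "pid": "user@example.org",  # hypothetical account email
    "aulast": "Null",
    "title": "River Research and Applications",
    "volume": "26",
    "issue": "6",
    "page": "663",
    "year": "2010",
}

# Values deposited for DOI 10.1002/rra.1288, per the two problems above
# (first_page is stored here under the query's "page" name).
deposited = {"year": "2009", "page": "n/a"}


def build_query_url(params: dict) -> str:
    """Percent-encode the parameters so spaces in the title are safe."""
    return f"{OPENURL_ENDPOINT}?{urlencode(params)}"


def mismatches(params: dict, record: dict) -> dict:
    """Return {field: (queried, deposited)} for fields that disagree."""
    return {
        field: (params[field], value)
        for field, value in record.items()
        if field in params and params[field] != value
    }


print(build_query_url(query))
print(mismatches(query, deposited))
# {'year': ('2010', '2009'), 'page': ('663', 'n/a')}
```

Both mismatches come from the deposit, not the query, so the fix is a corrected deposit rather than a different query.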
Sample #2: messy metadata
Q: I know DOI 10.1068/p6742 exists, why doesn’t my query work?
A: Let’s check the guest query form
Metadata for article:
Newport R, Preston C, 2010, "Pulling the finger off disrupts agency, embodiment and peripersonal space" Perception 39(9) 1296 – 1298
Problem is: author surname is deposited as:
<person_name sequence="first" contributor_role="author">
<given_name>Roger</given_name></given_name>
<surname><surname>Newport</surname></surname>
</person_name>
Sample #3: duplicate authors
Q: Why does DOI 10.2307/1382491 have multiple versions of the same author?
A: It is an attempt to improve query matching: the same author is deposited under two spellings of the surname (Sæther and Saether).
<contributors>
<person_name sequence="first" contributor_role="author">
<given_name>Erling Johan</given_name>
<surname>Solberg</surname>
</person_name>
<person_name sequence="additional" contributor_role="author">
<given_name>Bernt-Erik</given_name>
<surname>Sæther</surname>
</person_name>
<person_name sequence="additional" contributor_role="author">
<given_name>Bernt-Erik</given_name>
<surname>Saether</surname>
</person_name>
</contributors>
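When reviewing a record like this one, spelling variants of the same author can be recognised by folding names to plain ASCII before comparing them. The sketch below is one possible approach, not something CrossRef is known to run: the helper names are hypothetical, and the transliteration table is a hand-written assumption that would need extending for other letters.

```python
import unicodedata

# Assumption: a small hand-written table for letters that Unicode
# normalization does not decompose (NFKD leaves "æ" unchanged).
TRANSLITERATIONS = {"æ": "ae", "Æ": "Ae"}


def ascii_fold(name: str) -> str:
    """Return a lower-case approximation of a name without diacritics."""
    for char, replacement in TRANSLITERATIONS.items():
        name = name.replace(char, replacement)
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.lower()


def author_variants(authors: list) -> list:
    """Return pairs of (given, surname) entries that fold to the same name."""
    seen = {}
    pairs = []
    for given, surname in authors:
        key = (ascii_fold(given), ascii_fold(surname))
        if key in seen:
            pairs.append((seen[key], (given, surname)))
        else:
            seen[key] = (given, surname)
    return pairs


authors = [
    ("Erling Johan", "Solberg"),
    ("Bernt-Erik", "Sæther"),
    ("Bernt-Erik", "Saether"),
]
print(author_variants(authors))
# [(('Bernt-Erik', 'Sæther'), ('Bernt-Erik', 'Saether'))]
```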
New(ish) tools for managing
metadata and deposit problems
Schema documentation: http://www.crossref.org/schema/documentation/ or linked from help doc
Reporting problems / asking for help:
 Help documentation (http://www.crossref.org/help/)
 Support portal and forums (http://support.crossref.org)
 Contact support@crossref.org
Schematron update
Schematron reports notify depositors of non-fatal deposit issues
 35-40 emails sent out weekly
 Alerts are generated for < 1% of deposits
 Tend to identify ‘messy’ deposits
 Rules updated periodically
Schematron Warnings
page number contains underscore: 2%
first page contains dash: 4%
last page contains dash: 7%
‘Jr.’ in surname: 61%
punctuation in surname: 26%
Jr. in surname:
Araújo Jr
Prata Jr.
Szezech Jr.
Punctuation in surname:
(Earven) Tribble
Frederick (Frikkie) J.
Arch Marin march@ub.edu
Plauchu********
Other rules:
 ‘ed’ ‘iss’ ‘vol’ in edition,
issue, volume elements
 Publication year exceeds
current year by >2
 Surname / title all upper
case
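The warnings and rules listed above could be approximated outside Schematron with a few plain checks. The sketch below is written in Python rather than Schematron and is not the actual rule set: the record field names, the `lint_record` helper, and the exact definition of "punctuation" are assumptions made for illustration.

```python
import re

JR_PATTERN = re.compile(r"\bJr\b\.?", re.IGNORECASE)
# Assumption: "punctuation" means anything other than letters, spaces,
# hyphens and apostrophes; digits and underscores are flagged as well.
PUNCTUATION_PATTERN = re.compile(r"[^\w\s'’-]|[\d_]")
# Label that should not appear inside each element's value.
LABELS = {"edition": "ed", "issue": "iss", "volume": "vol"}


def lint_record(record: dict, current_year: int) -> list:
    """Return warning strings for one deposited record (non-fatal issues)."""
    warnings = []
    surname = record.get("surname", "")
    if JR_PATTERN.search(surname):
        warnings.append("Jr. in surname")
    # Remove the suffix first so "Jr." is not reported twice.
    if PUNCTUATION_PATTERN.search(JR_PATTERN.sub("", surname)):
        warnings.append("punctuation in surname")
    if "-" in record.get("first_page", ""):
        warnings.append("first page contains dash")
    if "-" in record.get("last_page", ""):
        warnings.append("last page contains dash")
    if any("_" in record.get(f, "") for f in ("first_page", "last_page")):
        warnings.append("page number contains underscore")
    for field, label in LABELS.items():
        # Flag the label only where it starts a word, e.g. "vol 39", "2nd ed."
        if re.search(rf"(?<![a-z]){label}", record.get(field, ""), re.IGNORECASE):
            warnings.append(f"'{label}' in {field}")
    year = record.get("year")
    if year is not None and int(year) > current_year + 2:
        warnings.append("publication year exceeds current year by more than 2")
    for field in ("surname", "title"):
        if record.get(field, "").isupper():
            warnings.append(f"{field} all upper case")
    return warnings


record = {
    "surname": "Prata Jr.",
    "first_page": "1296-1298",
    "volume": "vol 39",
    "year": "2010",
    "title": "PERCEPTION",
}
print(lint_record(record, current_year=2010))
```

Like the Schematron reports, these checks only produce warnings; a record that triggers them would still be accepted.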
Questions?
support@crossref.org
pfeeney@crossref.org

 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 

Managing Your Metadata Quality 2010 CrossRef Workshops

  • 1. Patricia Feeney Metadata Quality Coordinator Managing your metadata quality
  • 2. Agenda I. Metadata quality audit II. DOI registration III. Conflicts overhaul (discussion) IV. Metadata Quality tools
  • 3. Best query ever -> bad metadata = match Mediocre query -> bad metadata = match Horrible query -> bad metadata = match Best query ever -> good metadata = match ✓+ Mediocre query -> good metadata = match (probably) ✓ Horrible query -> good metadata = match (maybe) ✓- Metadata Quality Audit: Overview Accurate and complete metadata is vital to querying and citation linking. If the metadata for a DOI is incorrect, incomplete, or messy, a match can't be made, regardless of the quality of a query.
  • 4. Current efforts include: (1) Reports: resolution report (emailed monthly), depositor report (on website), crawler (on website), field report (on website), conflict report (on website, emailed monthly), schematron reports (emailed weekly), failed query report (on website), DOI error reports (emailed daily). (2) Contacting members individually (as issues arise). (3) Documentation and communication.
  • 5. Metadata Quality Audit. A Metadata Quality Audit will: provide publishers with detailed feedback on the quality of their metadata by identifying problem areas; identify members who need attention; provide motivation and support to members with metadata issues. The intent of the audit is to provide information, but there may be consequences for extreme abusers.
  • 6. Audit Scope I. DOI resolution II. Conflicts III. Overall metadata quality IV. Metadata maintenance (Cartoon captions: “Hello, I’d like to audit you” / “Great, let’s get started!” / “Hooray!”)
  • 7. Level I: DOIs that have been distributed but not deposited and resolve to the Handle error page. * Level II: DOIs resolving to an error page * Level III: DOIs with response page blocked by access control Level IV: DOIs that resolve to an inadequate response page. I. DOI Resolution * actionable transgressions
  • 8. II. Conflicts Conflicts occur when two (or more) DOIs are deposited with identical metadata. Level I: conflicts created between members * Level II: conflicts within a publisher prefix(es) * Level III: conflicts created due to insufficient metadata + Level IV: conflicts created due to item/content type + * actionable transgressions + this may change, more later
  • 9. Quality of deposited metadata I. Missing metadata: is all available metadata deposited? II. Accuracy: is metadata correct? III. Unusual metadata: does metadata fit into the correct content type? IV. Overall quality: is metadata messy?
  • 10. Maintenance I. Gaps in coverage - this usually indicates undeposited DOIs (very very bad) II. Currency of deposits - are deposits made ahead of DOIs being distributed? III. Title maintenance - less of a problem with recent title restrictions, but we still have problems, title abbreviations IV. Reference linking compliance
  • 11. Actionable Areas. DOI Resolution: Level I (undeposited DOIs) and Level II (DOIs resolving to error page). If action is not taken within a reasonable time period (TBD), DOIs will be registered on behalf of the member (eventually for a fee). Continual distribution of unregistered DOIs may affect membership. Conflicts: Level I (conflicts created between members) and Level II (conflicts within a publisher prefix). A $2 per DOI conflict penalty fee may be imposed for conflicts of this type if they are not resolved within a reasonable time period (TBD). Metadata Maintenance: outbound linking compliance. Members found to not be linking during the audit will be subject to non-linking penalties.
  • 12. Audit Process 1. Notification: auditees will be informed of pending audit; data collection begins (1-2 weeks for most members) 2. Data delivery: audit document will be emailed to member for review 2 weeks prior to audit (longer if necessary); audit scheduled 3. Audit: phone conference; follow-up scheduled (if necessary) 4. Response: member/CR reconvene to discuss progress on audit findings 5. Follow-up (if necessary)
  • 14. II. DOI Registration Pilot DOIs should without exception be registered before they are released to the public. Most DOIs resolve, but the ones that don’t are a big problem. Solution: we’re going to register them* *(ideal solution: publisher registers them)
  • 15. DOI selection: At the moment, we will register DOIs reported by end users, using the DOI error report as a source.
  • 16.
  • 17. DOI error report: Implemented mid-2008 ~4,000 DOI errors reported monthly > 1,400 fixed monthly through publisher deposits Some of the unfixed DOIs are not ‘real’ DOIs, but many are.
  • 18. We will register DOIs that meet all of the following criteria: (1) have been distributed publicly by the publisher/prefix owner; (2) have an identifiable response page; (3) have been reported to the publisher’s technical and business contacts.
  • 19. DOI Registration Process 1. DOI reported: a user reports an unresolving DOI using the DOI error form 2. Technical contact notified (DOI error report email) 3. CrossRef review: CR staff reviews reported DOIs and expires DOIs that do not meet our registration criteria 4. Business contact notified: 2 weeks from the initial report, business contact is notified of remaining valid unregistered DOIs. 5. CR deposit: after 2 weeks have passed from business contact notification, CrossRef will register any undeposited DOIs.
  • 21. Conflicts overhaul Conflicts occur when two (or more) DOIs share the same metadata, suggesting two DOIs are assigned to a single item.
  • 22. Why are conflicts bad? (1) Only one DOI should be assigned per item. (2) Queries will return multiple DOIs, causing confusion. (3) Some queries (OpenURL) may not return a DOI if multiple results are present. (4) Conflicts between two DOIs often result in one of the DOIs being neglected***
  • 23. We currently have ~200,000+ conflicts in our system. Not all of them are a problem: for some items, our schema only allows minimal metadata; some content types require matching metadata (for example, standards, and book chapters with minimal metadata such as dictionaries).
  • 24. Legitimate conflicts. Conflict between 2 prefixes: http://dx.doi.org/10.1639/0044-7447(2001)030[0037:IOPOFU]2.0.CO;2 and http://dx.doi.org/10.1579/0044-7447-30.1.37 (Journal Title: AMBIO; Year: 2001; Vol: 30; Issue: 1; Page: 37; Author: Köhlin; Article Title: Impact of Plantations on Forest Use a...). Conflict within 1 prefix: http://dx.doi.org/10.3724/SP.J.1006.2008.00070 and http://dx.doi.org/10.3724/SP.J.1006.2008.00770 (Journal Title: ACTA AGRONOMICA SINICA; Year: 2008; Vol: 34; Issue: 5; Page: 770; Author: Zhang; Article Title: Differential Gene Expression in Upper…).
  • 25. ‘Bad’ conflicts. Conflicts with minimal metadata: 10.1002/ijc.11095 and 10.1002/ijc.11093 (Journal Title: International Journal of Cancer; Year: 2003; Vol: 104; Issue: 6; Page: 798; Author: none; Article Title: Errata). Conflict due to content type: 10.1520/C0506-10, 10.1520/C0506-10A, and 10.1520/C0506-10B (Book Title: Specification for Reinforced Concrete...; Year: 2010; Edition: 2010; Author: C13 Committee).
  • 26. Elements considered during conflict generation: content type; journal, book and/or series title; article title / content_item title (book chapters); publication year; volume; issue; first page; author; edition. If there is a match between all deposited elements, a conflict is generated. For example, two items with matching journal title, volume, issue, and article title will cause a conflict.
  • 27. Ideas? What should our minimum set of metadata be? How should conflicts be monitored/reported?
  • 29. Sample #1: incorrect metadata Q: My link resolver is retrieving the wrong metadata for DOI 10.1002/rra.1288, causing our links to break - here is my query*: http://www.crossref.org/openurl?pid=pfeeney@crossref.org&aulast=Null& title=River Research and Applications&volume=26&issue=6&page=663&year=2010 *query metadata matches the response page metadata A: Two problems with deposited metadata (DOI query): #1 <year media_type="print">2009</year> #2 <pages> <first_page>n/a</first_page> <last_page>n/a</last_page> </pages>
  • 30. Sample #2: messy metadata Q: I know DOI 10.1068/p6742 exists, why doesn’t my query work? A: Let’s check the guest query form Metadata for article: Newport R, Preston C, 2010, "Pulling the finger off disrupts agency, embodiment and peripersonal space" Perception 39(9) 1296 – 1298 Problem is: author surname is deposited as: <person_name sequence="first" contributor_role="author"> <given_name>Roger</given_name></given_name> <surname><surname>Newport</surname></surname> </person_name>
  • 31. Sample #3: duplicate authors Q: Why does DOI 10.2307/1382491 have multiple versions of the same author? A: attempt to improve query matching <contributors> <person_name sequence="first" contributor_role="author"> <given_name>Erling Johan</given_name> <surname>Solberg</surname> </person_name> <person_name sequence="additional" contributor_role="author"> <given_name>Bernt-Erik</given_name> <surname>Sæther</surname> </person_name> <person_name sequence="additional" contributor_role="author"> <given_name>Bernt-Erik</given_name> <surname>Saether</surname> </person_name> </contributors>
  • 32. New(ish) tools for managing metadata and deposit problems. Schema documentation: http://www.crossref.org/schema/documentation/ or linked from the help documentation. Reporting problems / asking for help: help documentation (http://www.crossref.org/help/); support portal and forums (http://support.crossref.org); contact support@crossref.org
  • 33. Schematron update. Schematron reports notify depositors of non-fatal deposit issues: 35-40 emails are sent out weekly; alerts are generated for < 1% of deposits; they tend to identify ‘messy’ deposits; rules are updated periodically.
  • 34. Schematron Warnings (share of warnings): page number contains underscore 2%; first page contains dash 4%; last page contains dash 7%; ‘Jr.’ in surname 61%; punctuation in surname 26%. Examples of Jr. in surname: Araújo Jr; Prata Jr.; Szezech Jr. Examples of punctuation in surname: (Earven) Tribble; Frederick (Frikkie) J.; Arch Marin march@ub.edu; Plauchu********. Other rules: ‘ed’, ‘iss’, ‘vol’ in edition, issue, volume elements; publication year exceeds current year by >2; surname / title all upper case.
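
The warning rules listed on slide 34 can be approximated in a few lines of Python. This is a minimal sketch, not CrossRef's Schematron rule set: the flat dictionary record layout, the function name `check_record`, the warning strings, and the exact regular expressions are all assumptions made for illustration.

```python
import re
from datetime import date

# Assumed patterns; the actual Schematron rules are not shown in the slides.
JR_SUFFIX = re.compile(r"\bJr\.?$", re.IGNORECASE)
# Anything other than letters, spaces, apostrophes and hyphens counts as odd.
ODD_SURNAME_CHAR = re.compile(r"[^\w\s'\-]|[\d_]")
UNIT_LABEL = re.compile(r"\b(ed|iss|vol)\b", re.IGNORECASE)


def check_record(record, current_year=None):
    """Return a list of non-fatal warnings for one deposited record (a dict of strings)."""
    if current_year is None:
        current_year = date.today().year
    warnings = []

    first_page = record.get("first_page", "")
    last_page = record.get("last_page", "")
    if "_" in first_page or "_" in last_page:
        warnings.append("page number contains underscore")
    if "-" in first_page:
        warnings.append("first page contains dash")
    if "-" in last_page:
        warnings.append("last page contains dash")

    surname = record.get("surname", "").strip()
    if JR_SUFFIX.search(surname):
        warnings.append("Jr. in surname")
    elif ODD_SURNAME_CHAR.search(surname):
        warnings.append("punctuation in surname")

    # 'ed', 'iss', 'vol' should not appear inside the element values themselves.
    for field in ("edition", "issue", "volume"):
        if UNIT_LABEL.search(record.get(field, "")):
            warnings.append(f"label text in {field}")

    year = record.get("year", "")
    if year.isdigit() and int(year) > current_year + 2:
        warnings.append("publication year exceeds current year by more than 2")

    for field in ("surname", "title"):
        text = record.get(field, "")
        if len(text) > 1 and text.isupper():
            warnings.append(f"{field} all upper case")

    return warnings


print(check_record({"surname": "Prata Jr."}, current_year=2010))
# → ['Jr. in surname']
```

Running a check like this before depositing would catch the ‘messy’ cases above locally; a clean record returns an empty list.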

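The conflict-generation rule on slide 26 ("if there is a match between all deposited elements, a conflict is generated") can be sketched as grouping records by a key built from the considered elements. This is only an illustration under stated assumptions: the field names, the `find_conflicts` function, the case and whitespace normalization, and the choice to treat a missing element as an empty value are hypothetical and are not taken from the CrossRef system.

```python
from collections import defaultdict

# Elements considered during conflict generation (slide 26); key names are assumed.
ELEMENTS = (
    "content_type", "title", "item_title", "year",
    "volume", "issue", "first_page", "author", "edition",
)


def normalize(value):
    """Lower-case and collapse whitespace; a missing element becomes an empty string."""
    return " ".join(str(value).lower().split()) if value else ""


def find_conflicts(records):
    """Return lists of DOIs whose considered elements all match."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(normalize(record.get(name)) for name in ELEMENTS)
        groups[key].append(record["doi"])
    return [dois for dois in groups.values() if len(dois) > 1]


# The two errata records are the 'minimal metadata' example from slide 25.
records = [
    {"doi": "10.1002/ijc.11095", "content_type": "journal_article",
     "title": "International Journal of Cancer", "item_title": "Errata",
     "year": "2003", "volume": "104", "issue": "6", "first_page": "798"},
    {"doi": "10.1002/ijc.11093", "content_type": "journal_article",
     "title": "International Journal of Cancer", "item_title": "Errata",
     "year": "2003", "volume": "104", "issue": "6", "first_page": "798"},
    {"doi": "10.1579/0044-7447-30.1.37", "content_type": "journal_article",
     "title": "AMBIO", "year": "2001", "volume": "30", "issue": "1",
     "first_page": "37", "author": "Köhlin"},
]

print(find_conflicts(records))
# → [['10.1002/ijc.11095', '10.1002/ijc.11093']]
```

The sketch also shows why minimal metadata causes trouble: two distinct errata on the same page share every considered element, so they are indistinguishable to a rule of this kind.
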
Editor's notes

  1. …so we’re going to begin auditing publishers, audit in the sense of providing a detailed, personalized, hopefully helpful review of a publisher’s metadata quality. We’re also welcoming the opportunity to improve our own processes, whether by refining reports or creating new tools to help with deposits.
  2. Conflicts are identified upon deposit in the submission logs and in the monthly conflict report. (We’ll discuss conflicts at length later.) We have four levels of conflicts as well. Level I: conflicts created between members *. These typically happen when a title is acquired by a new publisher and the publisher creates new DOIs for already-published content without first checking whether DOIs have already been created. In these cases the originally deposited DOIs are usually left untended and aren’t updated, leading to a ton of Level II DOIs (DOIs resolving to an error page). Level II: conflicts within a publisher prefix(es) *. Some members create conflicts with themselves: either they change their DOI suffix conventions and decide to apply the change retroactively, or through negligence, or ??? I’m not sure of other reasons, but it happens fairly often so there must be some. The following types of conflicts are most likely going away with the new system (more later): Level III: conflicts created due to insufficient metadata +. Level IV: conflicts created due to item/content type +.
  3. We’re also going to review overall deposited metadata quality, which many members will find most useful. I. Missing metadata: is all available metadata deposited? One of the most common questions I get from new publishers is ‘I’ve deposited a few issues but can’t retrieve my DOIs, what am I doing wrong?’ And the answer almost always is ‘deposit more metadata’. II. Accuracy: is metadata correct? This isn’t a big problem across our membership, but when it’s a problem, it’s a problem. Again, it’s one of those things that’s hard to identify; we rely on reports from end users, usually librarians. Big accuracy problems include: 1. incorrect ISSNs for a title and different title variants (these will be managed differently in the new system, hooray); 2. depositing an item online, then not updating it to include print metadata; and 3. author name spelling / order. This isn’t a rampant problem, but we do hear from a lot of authors when their names are incorrect. III. Unusual metadata: again, not a huge problem, but some non-standard publications get shoehorned into our available content types. Sometimes it’s unavoidable, but we might have a better approach. IV. Overall quality: is metadata messy? This means malformed special characters, non-essential markup in titles, dashes in page ranges, surnames in the given name element, organization names in surname instead of <organization>, and including ‘Jr.’ in the surname.
  4. The final category is maintenance. Deposited metadata usually requires very little maintenance if it is sound to begin with, beyond updating URLs of course. There are a few things to cover, however. Gaps in coverage: many of you have gaps in coverage. You submit an issue for deposit, the deposit fails, you don’t notice… usually the end result is long-term undeposited DOIs. Currency of deposits: again, are deposits made ahead of DOIs being released into the wild? These are usually short-term undeposited DOIs and are eventually deposited, but they are damaging as well. Title maintenance: this process will change with the new system, but we’ll still need to do some curating of titles. One thing we’ll focus on in the audit is abbreviations attached to your titles, as they help with query matching. Reference linking compliance.
  5. Our hope is that once systemic problems are pointed out to publishers these measures won’t be necessary, but we will take action if we have to.
  6. (broken record) As mentioned previously, undeposited DOIs are increasingly becoming a problem. DOIs should without exception be registered before they are released to the public. An unregistered DOI is more than a string of characters: it’s a source of frustration for researchers, librarians, authors, anyone trying to link to your content. DOIs should be registered before they’re released to the public. Most are, but most isn’t good enough, so we’re starting a pilot project to register DOIs that meet certain criteria.
  7. End users find undeposited DOIs very frustrating.
  8. All attempts to resolve unregistered DOIs resolve to this page. We worked with CNRI to create this page a few years ago – prior to that, the error page had CNRI’s email address – CNRI would forward DOI errors to us as they were reported, and we’d send them along to publishers. This form is a big improvement. Instead of emailing CNRI, users can just submit a form so instead of a few hundred complaints a week we get over a thousand. A user can submit just the DOI, but the form also collects the referring page (admittedly not always useful, depending on the page), any notes, and allows the user to enter their email address. We send them an alert when/if the DOI is registered.
  9. The DOI error report is a big improvement over the past email-based method, and allowed us to compile data which made the unregistered DOI problem more obvious. Currently the DOI error report is the best source for data on unregistered DOIs. An average of 4,000 DOI errors are reported monthly. Fewer than 1,400 are fixed monthly through publisher deposits. Some of the unfixed DOIs aren’t legitimate DOIs, but many are. We don’t yet have specific data on what percentage of legitimate DOIs remain unregistered at the end of a month, but we should once our project has been under way for a month or so.
  10. We’ll only register DOIs that have been distributed publicly by the publisher/prefix owner. If for some strange reason someone is generating DOIs with your prefix and then not depositing them (it happens, usually inadvertently: someone reverses digits in a prefix, for example), you won’t be liable. We likewise won’t register DOIs that have been misrepresented in areas beyond your control. Have an identifiable response page: we will verify that the DOI has been distributed; we won’t register something just because a user says it exists. Have been reported to the publisher’s technical and business contacts: we notify you well in advance of registration.
  11. Things to note: This is a pilot project, so we’re still refining the process; most of the refining should be on our end and not affect members. The initial email that went out to business contacts didn’t have the ‘final’ list of DOIs to be registered, but any emails going out now do. There will ultimately be a charge for this process, and it will be significant enough that you won’t want to use the DOI error form instead of the web deposit form to create DOIs. It’s not the most effective, efficient, or accurate way to register DOIs.
  12. ***this is a big problem
  13. Despite our efforts, the number of conflicts continues to grow. Conflicts as they are work well to identify conflicts between journal article DOIs and most book DOIs, but there are shortcomings in the process: a.) journal items such as letters to the editor, book reviews, errata etc. often generate conflicts; b.) book chapters, standards, and reports can generate bogus conflicts in some situations. Overall, this makes the conflict process less useful than it should be.
  14. Sample #1: the publisher has deposited all relevant metadata and the DOIs are assigned to two distinct items, but a conflict was created. Sample #2: DOIs for standards can generate conflicts, since they often have very similar metadata despite being distinct items.
  15. Note: For many years we offered the option to include a flag in your metadata that would allow you to bypass conflict creation. This feature was revoked because some very large publishers were including the flag in *all* deposits, creating conflicts that couldn’t (and still can’t) be identified. Possible questions: Minimum metadata set? (Do we require a certain amount of metadata to be present before generating a conflict?) How should these be monitored (in the submission log, or by maintaining current conflict reports)? Are ‘conflicts’ a valid concept? Should we reject deposits with conflicts (once we have refined our conflict identification methods, obviously)? Other ideas?
  16. Note: the metadata for these DOIs will (hopefully) be corrected soon – once the metadata is corrected the queries used in these examples will resolve.
  17. In this example, multiple versions of authors have been deposited to improve query matching. This creates problems when metadata is retrieved for display. Solution: CR will work on improving things on our end; our system does do some character matching already.
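
One way to picture the character matching mentioned in note 17 is to fold names to a plain-ASCII comparison form at query time, so that a single deposited ‘Sæther’ also matches a query for ‘Saether’. The sketch below is an assumption about how such folding could work, not a description of CrossRef's matching; the function name `fold_name` and the small `EXTRA_FOLDS` table are hypothetical.

```python
import unicodedata

# Letters that Unicode decomposition does not split into base letter + accent.
# This table is an illustrative assumption and is far from complete.
EXTRA_FOLDS = str.maketrans({"æ": "ae", "Æ": "AE", "ø": "o", "Ø": "O", "ß": "ss"})


def fold_name(name):
    """Return a lower-cased, accent-free form of a name for comparison only."""
    name = name.translate(EXTRA_FOLDS)
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold()


print(fold_name("Sæther") == fold_name("Saether"))
# → True
```

With folding applied on the matching side, the deposit can carry only the correct spelling, which avoids duplicate authors appearing when the metadata is retrieved for display.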