NISO Webinar; "Enabling Discovery, Part Two: Publishers and Libraries Talk Metadata & Monographs"; January 16, 2019
https://www.niso.org/events/2019/01/enabling-discovery-part-two-publishers-and-libraries-talk-metadata-monographs
3. In the not so distant past…
There were two main options when searching for ebooks:
1. Search each individual vendor’s website/database
2. Load MARC records (one record for each title) into the
catalog for each vendor
4. In the not so distant past…
Problems with this approach:
Loading records is a LOT of work and requires regular
maintenance
Massaging/editing/enhancing metadata; loading;
updates; replacements; deletes
Number of records/titles to load
Lack of records available for loading
Records come from numerous places and each vendor
requires a different procedure to download files
Tracking titles in multiple places (duplicate work)
5. Now: more options…
1. Search each individual vendor’s website/database
2. Load MARC records (one record for each title) into the
catalog for each vendor
3. Integration of various vendors’ metadata into
discovery layers via APIs and linked data rather than
importing records into the catalog
4. Federated search tools that index multiple databases
(e.g. unified index search tools)
…but are more options better?
6. The good and the bad
GOOD:
fewer places to search (possibly even only one)
most public libraries, while they have other ebook
databases, will have a single integrated discovery layer
BAD:
MORE places to search
BUT discovery is still a challenge no matter which search
option you choose, and those challenges are centered
around:
METADATA
16. Differences defined
Differences in description
Current vs past rules and guidelines;
RDA provider neutral vs individual vendor records
Differences between vendors for same title
Differences in how data is entered/presented
Record proliferation
Related to metadata differences: records cannot be
“collapsed” because the discovery layer doesn’t recognize
them as the same
Different vocabularies and identity databases
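The record proliferation problem described above can be sketched in a few lines. This is only an illustration, not how any particular discovery layer works: it assumes records are plain dictionaries with hypothetical `title`, `author`, and `vendor` fields, and it groups them on a normalized title/author key. The sample records are made up for the example.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def match_key(record):
    """Build a key from the normalized title plus the sorted author tokens.

    Sorting the tokens makes "Obama, Michelle" and "Michelle Obama" compare
    equal, but any extra data in the heading (such as dates) still splits
    the records.
    """
    title = normalize(record["title"])
    author_tokens = sorted(normalize(record["author"]).split())
    return (title, " ".join(author_tokens))


def collapse(records):
    """Group records that share a match key."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return dict(groups)


# Illustrative sample records (hypothetical vendors and field names).
records = [
    {"title": "Becoming", "author": "Obama, Michelle", "vendor": "Vendor A"},
    {"title": "Becoming.", "author": "Michelle Obama", "vendor": "Vendor B"},
    {"title": "Becoming", "author": "Obama, Michelle, 1964-", "vendor": "Print catalog"},
]
groups = collapse(records)
for key, members in groups.items():
    print(key, [m["vendor"] for m in members])
```

The two vendor records collapse into one group because they differ only in punctuation and name order. The third stays separate because its author heading carries dates, which is the kind of small difference in description that keeps records from being recognized as the same title.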
18. Why do these differences matter?
How people search
Keyword - forces dependency on keyword indexes
Follow links - if you click on the subject search for
Obama, Michelle, search results include only print books
(no ebooks)
Limits/facets - dependent on metadata, both visible
and invisible (coded)
Missing metadata
Discovery layer exposes ALL the metadata (good, bad,
missing)
All of this means items get “hidden” because they’re not
findable.
19. How do we fix it?
CONSISTENCY
use of controlled vocabularies and existing authority
databases (name matching, subjects, etc.)
Use existing metadata sources
Follow standards and recommended/best practices
Communication
Data points
complete
consistency across vendors
Usually differences are a GOOD thing, providing diversity; but not in this case
Caveat: speaking mainly from a public library perspective, although most of the issues public libraries have are also present in academic environments; the differences are in resource types and in the focus on currency/popularity of materials (the collection is more ephemeral than permanent)
BUT my background is serials and nonprint format cataloging – been dealing with managing metadata/cataloging for ejournals and ebooks for almost 2 decades now
My philosophy: job of cataloging/metadata is to make stuff findable, which includes unique identification of resources
I don’t believe in the “perfect” record
If it’s not wrong, leave it alone (don’t delete data, just exclude it from indexes…you may want it in the future)
When editing:
Fix errors or delete if wrong
Add access points
Enhance content/description (add value)
Make it pretty
Number of vendors increased: more complexity, more time
Each vendor: different procedure for downloading; different edits (some need proxy added, some don’t); files may be in various formats and require conversion to MARC
Tools to help streamline (MarcEdit – TASK LISTS saving the edits for each vendor are a savior)
BUT still very time consuming
Multiple places: ERM and the Catalog and possibly the vendor website – have to keep in sync
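The per-vendor edits described above (some vendors need a proxy added, some don’t) can be sketched as a small lookup of saved settings per vendor. This is not MarcEdit’s task list format or API. The vendor names, the `url` and `note` fields, and the proxy address are all hypothetical, and records are shown as plain dictionaries rather than MARC.

```python
PROXY_PREFIX = "https://proxy.example.org/login?url="  # hypothetical proxy address

# Hypothetical per-vendor settings, loosely analogous to a saved task list.
VENDOR_TASKS = {
    "vendor_a": {"add_proxy": True, "collection_note": "Vendor A ebooks"},
    "vendor_b": {"add_proxy": False, "collection_note": "Vendor B ebooks"},
}


def apply_tasks(record, vendor):
    """Return an edited copy of a record using that vendor's saved settings."""
    tasks = VENDOR_TASKS[vendor]
    edited = dict(record)  # copy so the incoming record is left untouched
    url = edited["url"]
    # Add the proxy prefix only when the vendor needs it and it is not there yet.
    if tasks["add_proxy"] and not url.startswith(PROXY_PREFIX):
        edited["url"] = PROXY_PREFIX + url
    edited["note"] = tasks["collection_note"]
    return edited


record = {"title": "Sample title", "url": "https://ebooks.example.com/item/123"}
edited_a = apply_tasks(record, "vendor_a")
edited_b = apply_tasks(record, "vendor_b")
print(edited_a["url"])
print(edited_b["url"])
```

Keeping the edits in one table per vendor means the same steps run on every load, which is the appeal of saved task lists. The time cost comes from maintaining a separate entry for every vendor.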
Looking at a single search option for ebooks and print books, where an API is used to search both ebook vendor and the catalog in one search
So let’s look at examples: the examples are currently popular titles or authors
Who’s watching the show on Netflix?
ISBN: this is often a key match point for OpenURL resolvers or other API/linked data tools
Title: ebook version is incomplete
Author: translator is missing, an issue when looking for a specific translation or if searching by translator name
Date format – indexing issue – how does your system handle dates?
ISBN: this is often a key match point for OpenURL resolvers or other API/linked data tools
Title: ebook version is incomplete
Author: indexing issues; identity management/authority control issues
Date format – indexing issue – how does your system handle dates?
ISBN: this is often a key match point for OpenURL resolvers or other API/linked data tools
Subject: where’s DC??
Title: ebook version is different
Author: indexing issues; identity management/authority control issues
Date format – indexing issue – how does your system handle dates?
Do you see a trend yet?
Description:
AACR2 vs RDA – fundamental change in how you approach describing a resource
Provider neutral – one record for ALL online versions of a title (formats, platforms, etc.) – just have multiple links/URLs to the various options; hard to do that with APIs/linked data tools
Date format, author format (last, first or first last?)
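The date and author format differences noted above can be made concrete with a rough sketch. Both helper functions are hypothetical and deliberately naive: one pulls a four-digit year out of a free-text date statement, the other turns a “First Last” name into “Last, First” so the two forms index together. Real authority control is far more involved than this.

```python
import re


def extract_year(date_text):
    """Return the first four-digit year (1500-2099) in a date statement, or None."""
    match = re.search(r"(1[5-9]\d{2}|20\d{2})", date_text)
    return int(match.group(1)) if match else None


def to_inverted_name(name):
    """Convert "First Last" to "Last, First"; leave already inverted names alone.

    Naive on purpose: it treats the final word as the surname, which is wrong
    for many names (compound surnames, suffixes, corporate names).
    """
    name = " ".join(name.split())
    if "," in name or " " not in name:
        return name
    first, last = name.rsplit(" ", 1)
    return f"{last}, {first}"


for raw in ["2018", "c2018", "[2018]", "2018-11-13"]:
    print(repr(raw), "->", extract_year(raw))

for raw in ["Michelle Obama", "Obama, Michelle"]:
    print(repr(raw), "->", to_inverted_name(raw))
```

Even this small amount of normalization shows why a system that indexes the raw strings will split one author or one publication year across several index entries.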
Proliferation: more vendors = more records
We get patron complaints about ebook display all the time
Different vocabularies and identity databases – name formats, subjects, locations, etc. Creates indexing and filing issues; split indexes
Missing: sometimes records just don’t appear – API/linked data tool errors, delays
Data changes: records get “out of sync” – print book may be complete but ebook is still minimal/prepublication
Branding: can’t add custom text to create collections, or other data to ebook records; limits to control over display and what data is included – stuck with what the vendor sends/makes available
Forcing dependency on keyword indexing or indexing of the WHOLE record – specific indexes (author, etc.) are no longer useful
How people search: Subjects/identities – FORM matters
“see also”
Collections
Links – find something they want/like, follow links to “similar” or “like” items using subjects, authors, etc. (internet rabbit hole…)
Limits/facets – such as format, publication date, location, etc.
Missing metadata – subjects, ISBN, names, locations, etc.; lose match points; may result in records not appearing – search ISBN and the ebooks don’t show up
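Since the ISBN is such a common match point, one small step a matching tool might take is to bring every ISBN into the same form before comparing. The sketch below is an assumption about how that could be done, not a description of any resolver or discovery product. The function names are hypothetical. It uses the standard ISBN-13 check digit calculation and does not validate the check digit of the incoming number.

```python
import re


def clean_isbn(raw):
    """Take the leading run of digits, hyphens, spaces and X; drop separators.

    Stopping at the first other character discards trailing qualifiers
    such as "(pbk.)".
    """
    match = re.match(r"[0-9Xx\- ]+", raw.strip())
    if not match:
        return ""
    return re.sub(r"[\- ]", "", match.group(0)).upper()


def to_isbn13(raw):
    """Return the ISBN-13 form of an ISBN-10 or ISBN-13, or None if unusable."""
    isbn = clean_isbn(raw)
    if len(isbn) == 13 and isbn.isdigit():
        return isbn
    if len(isbn) != 10:
        return None
    core = "978" + isbn[:9]  # drop the old check character, add the 978 prefix
    if not core.isdigit():
        return None
    # ISBN-13 check digit: weights alternate 1, 3 across the first 12 digits.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)


print(to_isbn13("0-306-40615-2 (pbk.)"))
print(to_isbn13("978-0-306-40615-7"))
print(to_isbn13("not an isbn"))
```

Normalizing the form only helps when the ISBN is present at all. A record with no ISBN still has nothing to match on, which is the missing metadata problem described above.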
Discovery layers – good at exposing EVERYTHING (great way to identify database cleanup projects…)
Communication – between libraries and vendors
Data points – more is better, even if they don’t display