Controlled Vocabularies & Cataloging

Making the Findable Findable: Controlled Vocabularies
| Robin Fay | @georgiawebgurl | 2019

Week
1
 Robin Fay, Instructor & Consultant
 Resources
 slideshare.net/robinfay
 robinfay.net

Metadata experts curate and
assign topics (subjects) to items
 Subject indexing is used in information
retrieval especially to create bibliographic
indexes to retrieve documents on a
particular subject…Index terms were
mostly assigned by experts but author
keywords are also common. – Wikipedia
 So that users can find what they need…

The role of metadata experts
 Metadata
experts
(catalogers)
carefully
review
resources
 They
carefully
describe the
items in the
collection

 They assign
subjects from
controlled
vocabularies
 They create
metadata in
the form of
records, that
meet specific
standards

 Metadata
workers and
catalogers
primarily focus
on descriptive
metadata,
which describes
the item itself.
 This description
makes the item
findable!

Other types of metadata include
administrative (rights and system created
information often fall here) and structural (parts
to the whole, etc.) Catalogers may work with
any and all metadata!

Book example with a specific metadata
schema (set of rules)

Recording information, providing context
What does a
cataloger look
for when
describing an
item?

Metadata experts
 Title – from the item itself
transcribed by the cataloger,
created by the cataloger (e.g.,
Photographs which have no
title), or harvested via OCR – this
may be the title page (the first
page in the book that serves as
the official record of it) or for
movies, what appears on the
title screen

Metadata experts
 Creators: Who created it
(authors, editors, etc.) and who
participated in it (illustrators,
translators, contributors, for
media: performers, directors,
etc.)
 Creators are linked to an
authorized (official) form of their
name

Metadata experts
 Descriptions about the format
and extent (what the file format
is – is it a text PDF, an image in
JPG format, etc.)
 How many pages, how big the
file is, whether it has illustrations
or not

Metadata experts
 What type of item it is. Is it a
movie, book,
magazine/journal, database,
photograph (among many
others)?

Metadata experts
 A summary of what the work
is about
 If multiple items are
represented in one work, the
metadata may include a
table of contents or a list of
the parts
 Notes about the item itself

Metadata experts
 Subjects, not just keywords
 Catalogers and other
metadata workers often
select appropriate subjects
off a list. This list is called a
controlled vocabulary.
 We’ll explore controlled
vocabularies later.

Metadata experts
 Specific coding to assist
library catalogs and systems
in finding materials
 End goal is for users to find
what they are looking for!

Determining the subject
Catalogers create this sort
of record using templates,
forms, & tools.
They create bibliographic
data.
Putting all of that
together…
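
As a rough sketch of "putting all of that together", the descriptive elements from the preceding slides can be pictured as one record. The class name BibRecord and its field names are hypothetical, chosen only for illustration; they are not a real metadata schema. The title and subject headings come from the Examples slide later in this deck, and the creator string is an illustrative form, not a verified authorized name.

```python
from dataclasses import dataclass, field


@dataclass
class BibRecord:
    """Hypothetical container for the descriptive elements a cataloger records."""
    title: str
    creators: list[str] = field(default_factory=list)  # ideally authorized name forms
    item_type: str = ""       # book, movie, photograph, ...
    format_extent: str = ""   # file format, page count, illustrations
    summary: str = ""
    notes: list[str] = field(default_factory=list)
    subjects: list[str] = field(default_factory=list)  # most relevant heading first


record = BibRecord(
    title="The silent subject : reflections on the unborn in American culture",
    creators=["Stetson, Brad (editor)"],  # illustrative, not a verified authority heading
    item_type="book",
    subjects=[
        "Abortion -- Moral and ethical aspects -- United States",
        "Abortion -- Social aspects -- United States",
        "Fetus -- Moral and ethical aspects",
    ],
)
print(record.subjects[0])
```

A real record would follow a specific schema (for example MARC) rather than ad hoc field names like these.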

What is bibliographic data?
As we have seen, it is data that describes library resources (biblio
= book; graphic = of or for writing or a painting; loosely, information
about books)

MARC display
Which is actually a coded
record that is not typically
seen except by other
metadata or system
workers, or the system
itself.
This is one presentation of
bibliographic data. Some
bibliographic data looks
more like web markup
languages.

Subject indexing is used
in information
retrieval especially to
create bibliographic
indexes to retrieve
documents on a particular
subject…Index terms were
mostly assigned by experts
but author keywords are
also common. – Wikipedia
So that our users can find
what they need…
Subject indexing (analysis) –
the Process

Catalogers and cataloging
(metadata description)
 Create human readable description and contextual
data (e.g., title, descriptions) - (we’ve already talked
about the description aspect of cataloging)
AND
 Create machine specific data and code all data in a
specific way to help the system understand the data –
(we briefly explored the coding aspect)
AND
 Select appropriate controlled vocabulary terms
(subjects) from a list to guide users to resources
matching their terms (or in the case they use an
alternate term, facilitating finding resources)

Metadata experts
 After creating the description
of the item itself, the cataloger
will need to determine the
best subject headings and/or
call number.
 All of that will need to be
coded into a record.

Determining the subject
Subject Analysis typically involves a cataloger carefully
selecting subjects from a defined list. Metadata experts
curate and assign topics (subjects) to items.

Subject Analysis – the process
 The process of indexing begins with an analysis of
the subject of the document.
 The indexer must then identify terms which
appropriately identify the subject either by
extracting words directly from the document (OCR,
metadata harvest of user submitted content;
relevance of words; an automated process) OR
assigning words from a controlled
vocabulary.[1] The terms are then presented in a
systematic order.

Subject Analysis – the process
 In library metadata, the first
subject heading (a controlled
vocabulary term) is the most
relevant and often tied to a call
number (if one exists)
 Call numbers are used for
shelving and to group like items
together, typically by subject.

Controlled vocabularies
• Controlled vocabularies are
a defined list of terms.
• They identify the preferred
term for a specific concept
• The most common
controlled vocabularies
used in libraries have a
hierarchy, which guides
users to broader or
more specific terms.
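
A minimal sketch of these three ideas (a defined list, a preferred term per concept, and a hierarchy) might look like the following. The Term class, the VOCABULARY dictionary, and every entry in it are made up for illustration and are not taken from any published vocabulary.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    preferred: str
    variants: list[str] = field(default_factory=list)   # alternate terms users might try
    broader: list[str] = field(default_factory=list)
    narrower: list[str] = field(default_factory=list)


# Toy entries, assumed for this example only.
VOCABULARY = {
    "Cats": Term("Cats", variants=["Felines", "House cats"],
                 broader=["Pets"], narrower=["Kittens"]),
    "Pets": Term("Pets", variants=["Companion animals"], narrower=["Cats"]),
    "Kittens": Term("Kittens", broader=["Cats"]),
}


def preferred_term(query: str) -> Optional[str]:
    """Return the preferred term for a query, matching preferred or variant forms."""
    wanted = query.strip().casefold()
    for term in VOCABULARY.values():
        names = [term.preferred, *term.variants]
        if any(wanted == name.casefold() for name in names):
            return term.preferred
    return None


print(preferred_term("felines"))        # variant leads to the preferred term
print(VOCABULARY["Cats"].broader)       # step up the hierarchy
```

The lookup is what lets a user who types an alternate term still land on the one heading that gathers all the resources.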

Controlled terms/Subjects
Controlled terms selected from a
controlled vocabulary. Hierarchy is
hard to see in this list.

MARC display
Selected subjects
from a controlled
vocabulary
Coding tells the
system which
controlled
vocabulary and
what type of
heading.
Subdivisions provide more granularity (more
refinements) = more precision
These follow specific rules for the LC SH
(Subject Headings) controlled vocabulary
Coding (e.g., $x) provides the system with
additional information
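
To make the subfield coding concrete, here is a small sketch that splits a coded subject string into its parts and rebuilds the display form with " -- " between subdivisions. The function names are hypothetical, and the coded string is an assumed encoding of a heading from the Examples slide in this deck, not a copy of an actual catalog record. The subfield meanings follow common MARC 21 usage for topical subject fields.

```python
# Common subfield codes in MARC 21 topical subject fields.
SUBFIELD_LABELS = {
    "a": "topical term",
    "v": "form subdivision",
    "x": "general subdivision",
    "y": "chronological subdivision",
    "z": "geographic subdivision",
}


def parse_subject_field(field_text: str) -> list[tuple[str, str]]:
    """Split '$a Term $x Subdivision' into (code, value) pairs."""
    pairs = []
    for chunk in field_text.split("$")[1:]:
        if not chunk.strip():
            continue  # skip empty pieces such as a stray '$'
        pairs.append((chunk[0], chunk[1:].strip()))
    return pairs


def display_heading(field_text: str) -> str:
    """Join the subfield values the way headings are usually displayed."""
    return " -- ".join(value for _, value in parse_subject_field(field_text))


example = "$a Abortion $x Moral and ethical aspects $z United States"
for code, value in parse_subject_field(example):
    print(code, SUBFIELD_LABELS.get(code, "other"), value)
print(display_heading(example))
```

Real systems read MARC through dedicated libraries and binary or XML encodings; the string form here only mirrors what a cataloger sees on screen.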

Why have experts
create metadata?
Can’t AI do all the
work?

While catalogers
(metadata experts) have
been doing this work for a
LONG time (before
electronic records even
existed!), AI cannot yet
make human decisions
and provide context.
Metadata experts curate and assign
topics (subjects) to items

Metadata experts curate and assign
topics (subjects) to items
 Catalogers analyze resources. They can make
determinations such as
 These 2 authors are actually the same person
although there are 2 different names (name
change, pseudonym, etc.)
 The keywords or subject terms used by the
author match to a specific term in the controlled
vocabulary (even when they are not the same
as what appears in the controlled vocabulary)
 What is the most important topic in a work.

 Catalogers standardize metadata and provide a
consistency of display.
 Catalogers provide consistency in the display of
information (especially true when the original
metadata has been created from a digital object)
 Proper names, acronyms, etc. – Catalogers interpret
and standardize data as needed.
 Account for variations in titles (The adventures of
Romeo & Juliet vs. Romeo and Juliet: a tragedy vs
Romeo & Juliet)
 Account for typos by the author or publisher

Catalogers provide context.
Provide contextual information
(e.g., author is not listed on the
item but is known)
Describe items more fully than AI
often can; not all information
that is included in a record
actually appears in or on an item.
Highlight information that is more
relevant.

Although people sometimes call subject headings “keywords”
there is a difference.
Keywords are free text, meaning that any term or phrase
can be entered. Keywords are also used in social media
#hashtag that…
Subject Headings vs Keywords

Keywords can be a form of a
natural language query (e.g.,
asking Google a question as you
would ask a person). Natural
language is an uncontrolled
group of words put together. AI
and search algorithms can
sometimes infer context or
compare the query to others, to
determine the best resources.

Catalogers do use some
natural language in
records – in notes they
write or other free text
fields.

Because they are from a defined list, Subject
Headings can be linked to a central record
(authority record) to pull together all content with
that specific term, enabling more precise
searching.
Keywords are varied with no central record tying all
records to a specific term.
Some systems do link all records with the same
keyword together without creating a central record.

Examples:
keyword searching challenges
Above all, don’t flush! : adventures in
valorous living. (a biography of a single
parent)
 Let’s rejoin the human race! (a work on
retirement)
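
The challenge with these two titles can be sketched in a few lines. The subject headings attached to each record below are hypothetical, assigned only for this illustration, and the toy keyword search looks at titles alone (real catalogs usually index more fields).

```python
RECORDS = [
    {
        "title": "Above all, don't flush! : adventures in valorous living",
        "subjects": ["Single parents -- Biography"],  # hypothetical heading
    },
    {
        "title": "Let's rejoin the human race!",
        "subjects": ["Retirement"],                   # hypothetical heading
    },
]


def keyword_search(term: str) -> list[str]:
    """Free-text match against the title only."""
    wanted = term.casefold()
    return [r["title"] for r in RECORDS if wanted in r["title"].casefold()]


def subject_search(term: str) -> list[str]:
    """Match against the headings a cataloger assigned."""
    wanted = term.casefold()
    return [
        r["title"]
        for r in RECORDS
        if any(wanted in s.casefold() for s in r["subjects"])
    ]


print(keyword_search("retirement"))   # the title never uses the word
print(subject_search("retirement"))   # the assigned heading does
```

Neither title mentions its own topic, so only the assigned heading connects the searcher's term to the book.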

Rules to guide precision of searching
 Assign headings only for topics that
comprise at least 20% of the work.
 Assign at least 1 subject heading, but typically
no more than 6. 10 is the hard limit in
LC’s guidelines for assigning terms
from the LC Subject Headings
controlled vocabulary. (Libraries do
sometimes deviate from this practice.)
 The first heading is the most relevant.
 There are guidelines!
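
As a rough illustration of how these rules interact, the sketch below takes a cataloger's estimate of how much of a work each topic covers, keeps the topics at or above 20%, orders them so the most relevant comes first, and caps the list at 6. The function name, parameters, and sample figures are all assumptions for this example; in practice the judgment is made by a person, not computed.

```python
def select_headings(topic_shares: dict[str, float],
                    min_share: float = 0.20,
                    typical_max: int = 6) -> list[str]:
    """Return qualifying topics, most prominent first, capped at typical_max."""
    ranked = sorted(topic_shares.items(), key=lambda item: item[1], reverse=True)
    return [topic for topic, share in ranked if share >= min_share][:typical_max]


# Made-up estimates for a work on retirement.
estimates = {"Personal finance": 0.25, "Retirement": 0.60, "Travel": 0.10}
print(select_headings(estimates))
```

Here "Travel" falls below the 20% threshold and is dropped, and "Retirement" is listed first because it covers the largest share of the work.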

 So, subject headings
selected by a
cataloger
Guide a searcher to
standardized
terminology
Guide a searcher to
more resources in
the same topic
(whether or not they
include the topic as
a keyword)

Carefully curated
headings mean that
the most relevant
resources as well as
related resources are
presented.
With a hierarchical
controlled vocabulary,
searchers can easily
navigate between
related resources at
more broad or narrow
topics.

Where do
catalogers
look for clues?
Where do
catalogers
look for
clues to a
resource’s
content?

Determining the subject content
Title
Table of contents
Introduction or
preface
Author’s purpose or
foreword
Abstract
Summary
Index
Illustrations,
diagrams
Containers

What gets considered?
 Topics (Subjects)
 Names of:
 Persons
 Corporate bodies
 Geographic areas
 Time periods
 Titles of works
 Form of the item
(Type of item, Style,
etc)
 Genre (works with a
theme, e.g., comedy
films)

Important factors: Objectivity
Catalogers must give an accurate, unbiased
indication of the contents of an item (neutral)
Assess the topic objectively, remain open-minded
Consider the author’s intent and the audience
Avoid personal value judgments
Give equal attention to works, including:
Topics you might consider frivolous
Works with which you don’t agree

Examples
The big lie : the Pentagon plane crash that never happened / Thierry
Meyssan.
 American Airlines Flight 77 Hijacking Incident, 2001
 Terrorism -- Government policy -- United States
The silent subject : reflections on the unborn in American culture /
edited by Brad Stetson.
 Abortion -- Moral and ethical aspects -- United States
 Abortion -- Social aspects -- United States
 Fetus -- Moral and ethical aspects

The problem with bias
 Bias is persistent in many
controlled vocabularies
 It reflects the community of
use (the AA of AACR2 =
Anglo American although
the cataloging rules were
used internationally!) and
cultural norms at the time a
heading is formed
Library data is also impacted by
the card environment and the
challenge of typing/updating
subject headings on cards

Bias
 Like any system based on
language, LCSH reflects the
cultural context of its users,
with the intent of being
unbiased.
 In situations in which
longstanding headings
reflect cultural biases of the
past, or where common
usage changes significantly,
headings are being revised
based on perceived need
and available resources.
However, this does not
always happen.

Searching
Keywords
Controlled
vocabularies

OPAC example
Controlled
vocabularies

Information
anywhere,
any way,
any time
So about searchers & library users…

How can catalogers address different user
needs? How can catalogers facilitate
finding, identifying, selecting, and obtaining
resources, so that searches succeed and the
User Tasks are completed?
Library searchers & users...

How do catalogers facilitate
users finding what they
need? (the FRBR User
Tasks):
Briefly, they are find, identify,
select, obtain, and explore.
Find: resources corresponding
to user’s search criteria
Identify: confirm resource
described corresponds to
that sought, or distinguish
between more than one
resource with similar
characteristics
Select: resource appropriate
to user’s needs
Obtain: to acquire or access
resource
Explore: Find related resources
(serendipity)
FRBR User Tasks

Data has given us more options
Mobile, social, personalized through data, connected:
how do catalogers fit into this world of automated data?

So how do catalogers fit into this new world?
Aligning with modern data practices, working with automation,
and providing additional context and analysis to meet the
needs of users (FRBR User Tasks)
 Updated cataloging rules that are flexible, agnostic of
metadata schemas, and address digital objects better
 Updated views of authorities (think: identity management)
and subjects
 Different format for library data that is more semantic web
friendly (i.e., not MARC) – in progress
 Capitalize on other communities’ data and reuse/share data
more than data entry – in progress
