2. Questions for Today
Why did we decide to do this?
How did we do it?
What is this, really?
What does this look like?
Where do we go from here?
ALA Dallas 1/20/12
3. Why Did We Do This?
We're very interested in the problems
of mapping in the Semantic Web
◦ See paper:
http://dcevents.dublincore.org/index.php/IntConf/dc-2011/paper/view/52
We wanted to explore these issues in
the context of MARC21
Others doing maps had cherry-picked
available properties—we wanted to do
them all
4. How Did We Do It?
The OMR lacked an upload capability;
one was built for this purpose by Jon
Phipps, OMR developer
Gordon Dunsire built all the
spreadsheets in Google
Docs, including the URI patterns
I did most of the testing and some of
the error handling
Not rocket science, but a solid few
weeks of work
5. What Is It Good For?
It's the beginning, not the ending …
There's more work to be done, but we
wanted people to be able to use it for
experimentation right away
We included *almost* everything—
over 10,600 properties and vocabulary
concepts in all
It can be downloaded in pieces; we
haven't yet set up an 'everything'
download capability (we will)
6. MARC21 in the Open Metadata
Registry
Separate RDF property available
◦ For every attribute in the fixed-data fields
Based on character position
◦ For every combination of indicators and
subfield codes within a tag
Including non-filing indicators
Some tags excluded (for now)
RDF/SKOS vocabularies (with
definitions!) available for all code lists
for the fixed-data fields
7. MARC21 in the Open Metadata
Registry
There's a consistent pattern used to
coin marc21rdf.info URIs
Designed to enable a knowledgeable
cataloger to 'guess' the correct URI…
8. MARC21 in the Open Metadata
Registry
URIs for properties based on the
MARC21 encoding
◦ “M” + tag + indicator 1 + indicator 2 +
subfield code
Underscore substituted for blank
E.g. M24011d
◦ Date of treaty signing in Uniform Title, with
1 nonfiling character and intended to be
printed or displayed
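The pattern on this slide can be sketched in a few lines of Python. This is only an illustration of the stated rule ("M" + tag + indicator 1 + indicator 2 + subfield code, with an underscore for a blank); the function name and the input checks are assumptions, and the namespace that precedes the local name is not shown on the slide, so it is left out here.

```python
def marc21_property_name(tag: str, ind1: str, ind2: str, subfield: str) -> str:
    """Build the local part of a property URI from MARC21 encoding.

    Hypothetical helper: it follows the pattern stated on the slide and
    makes no claim about how the registry itself generates names.
    """
    if len(tag) != 3 or not tag.isdigit():
        raise ValueError("tag must be three digits, e.g. '240'")
    if len(subfield) != 1:
        raise ValueError("subfield code must be a single character")

    def blank_to_underscore(indicator: str) -> str:
        # A blank indicator (empty string or a space) becomes "_".
        return "_" if indicator.strip() == "" else indicator

    return "M" + tag + blank_to_underscore(ind1) + blank_to_underscore(ind2) + subfield


# The slide's own example: tag 240, indicators 1 and 1, subfield d.
print(marc21_property_name("240", "1", "1", "d"))
# A case with a blank second indicator (tag 100, subfield a).
print(marc21_property_name("100", "1", " ", "a"))
```

Keeping the name purely mechanical is what lets a cataloger who knows the tag, indicators, and subfield code work out the property name without looking it up.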
9. MARC21 in the Open Metadata
Registry
URIs for concepts based on the
MARC21 codes
◦ E.g. a (code for playback speed of a
sound recording)
◦ 16 rpm
Each vocabulary and concept has its
own URI
◦ E.g. soundrecordingspd#
◦ E.g. soundrecordingspd#a (for 16 rpm)
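As a rough sketch of the same idea in Python: a concept reference is the vocabulary name, a "#", and the MARC21 code. The helper names are hypothetical, only the vocabulary name and the single code/label pair shown on the slide are used, and the full namespace in front of the vocabulary name is not given here, so it is omitted.

```python
def concept_ref(vocabulary: str, code: str) -> str:
    """Join a vocabulary name and a MARC21 code as 'vocabulary#code'."""
    return vocabulary + "#" + code


def split_concept_ref(ref: str) -> tuple[str, str]:
    """Split 'vocabulary#code' back into its two parts."""
    vocabulary, _, code = ref.partition("#")
    return vocabulary, code


# Only the pair given on the slide; other codes are deliberately not listed.
PLAYBACK_SPEED_LABELS = {"a": "16 rpm"}

ref = concept_ref("soundrecordingspd", "a")
print(ref)
vocabulary, code = split_concept_ref(ref)
print(vocabulary, code, PLAYBACK_SPEED_LABELS[code])
```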
10. MARC21 in the Open Metadata
Registry
Data from MARC21 legacy records can
be mechanically mapped to the
corresponding RDF properties at “Level
0”
Level 0 properties can be mapped to a
series of more generic properties (Level
1 and above)
◦ To clarify and separate semantics from
syntax
◦ Relatively simple for fixed-data fields, but
significant issues to be tackled for variable
tags.
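The layering described on this slide can be sketched as a chain of sub-property links that is followed upward from a Level 0 property. Everything named in the code is hypothetical (the property names, the record identifier, and the single-parent simplification); it only illustrates how one Level 0 statement implies statements at the more generic levels.

```python
# Hypothetical sub-property links: each property points to one more generic
# property. Real mappings may give a property more than one parent.
SUB_PROPERTY_OF = {
    "L0:books_illustrations_1": "L1:books_illustrations",
    "L0:books_illustrations_2": "L1:books_illustrations",
    "L1:books_illustrations": "L2:illustrations",
}


def generalise(prop: str) -> list[str]:
    """Return the more generic properties above `prop`, nearest first."""
    chain = []
    while prop in SUB_PROPERTY_OF:
        prop = SUB_PROPERTY_OF[prop]
        chain.append(prop)
    return chain


def entailed_statements(statement: tuple[str, str, str]) -> list[tuple[str, str, str]]:
    """Restate a (subject, property, value) triple with each generic property."""
    subject, prop, value = statement
    return [(subject, generic, value) for generic in generalise(prop)]


level0 = ("record:1", "L0:books_illustrations_1", "a")
for triple in entailed_statements(level0):
    print(triple)
```

The Level 0 statement itself is kept unchanged, so the syntax-bound original and the more generic restatements can sit side by side.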
11. MARC21 in the Open Metadata
Registry
Generic properties can be used to
map from MARC21 to other metadata
schemas/formats
◦ Example goal: from MARC21 title (245
$a) to Dublin Core title, ISBD title
proper, RDA manifestation title, etc.
MARC21 data can be retained in RDF
without loss of semantics or
values, and made interoperable with
data from other sources via semantic
relationships
12. RDF graph of 00X mapping pattern for an attribute with more than one value
and no significance in the order of values
13. Partial RDF graph of 00X mapping pattern for two or more attributes with
more than one value and significance in the order of values
14. RDF graph of 00X data from National Library of Scotland record for
Legacy by Robert Buchanan
15. Partial RDF graph of 00X data from OCLC record for Abbey Road by the Beat
19. What's Left To Do?
Build on the 'Level 0' representations
to better reflect the reality and
complexity of MARC21
Continue to explore the usefulness of
the process we used, and how it might
be used for other element sets
Write it up for the benefit of others
New DCMI Vocabulary Management
Community seems a good place for
this activity
20. Questions?
Contact for Diane:
metadata.maven@gmail.com
Contact for Gordon:
gordon@gordondunsire.com
◦ (the middle section of slides was his)
Please let us know what you think!
Editor's notes
This is an example of one of the mapping patterns identified for MARC21 00X attributes. In this case, up to four codes for illustrations can be used for specific categories of resource, and there is no significance in the order of the codes. This is an RDF graph of the mapping. The four Level 0 properties corresponding to the attributes are mapped as sub-properties of a generic attribute for the category of material, which in turn is mapped as a sub-property of a generic attribute for all categories of material.
This is another example of a mapping pattern for MARC21 00X attributes where up to four codes can be used for information about the relief of a cartographic resource. The order of the codes and the attributes used is significant, with the most prominent form of relief recorded in the first attribute, and the fourth most prominent, if any, recorded in the last attribute. This is a partial RDF graph of the mapping, showing only the first Level 0 attribute in each category of resource. Each is mapped as a sub-property of a generic attribute for the category of material and also as a sub-property of a generic attribute for the first, most prominent attribute. Each generic attribute in turn is mapped as a sub-property of a generic attribute for all categories of material. The graph omits the mappings from the second, third, and fourth Level 0 attributes.
This is an example of using the Level 0 properties for MARC21 00X attributes for a real record. The record contains only a 008 field, and this is a complete RDF graph of corresponding triples. Note the graph omits the Type of record attribute from the Leader field, which corresponds to the Form of material attribute from field 006. This treatment of the Leader field is still under investigation. The graph uses controlled vocabularies from the Library of Congress Authorities at id.loc.gov as well as those specific to the 00X attributes available from the Open Metadata Registry. Note the inclusion of the data for Conference publication, Festschrift, and Index. Many mappings would simply exclude these when they have a non-positive value. But the Semantic Web has an “Open World Assumption”: the absence of a particular statement just means, in principle, that the statement has not been made explicitly yet. Publishing the explicit statements that, in this case, the resource is not a conference publication or festschrift, and has no index, avoids a user of the data having to make the assumption.
This is another example of using the Level 0 properties for MARC21 00X attributes for a real record. The record contains one 006, two 007 and one 008 fields, and this is a partial RDF graph of corresponding triples. Note the inclusion of the “not applicable” data for Form of composition of Music. Again, it is useful to publish this triple to avoid invoking the Open World Assumption.
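The point about explicit negative statements can be illustrated with a small Python sketch. It assumes the books configuration of field 008, where character positions 29, 30 and 31 hold the Conference publication, Festschrift and Index codes and "0" is the negative value; the property names and the sample field are hypothetical, not taken from the records shown in the slides.

```python
# Hypothetical property names for three 008 positions (books configuration).
BOOK_FLAGS = {
    29: "conferencePublication",
    30: "festschrift",
    31: "index",
}


def flag_statements(field_008: str, drop_negative: bool = False) -> list[tuple[str, str]]:
    """Return (property, code) pairs for the three flag positions.

    With drop_negative=True the "0" codes are skipped, which is what a
    mapping does when it excludes non-positive values.
    """
    statements = []
    for position, name in BOOK_FLAGS.items():
        code = field_008[position]
        if drop_negative and code == "0":
            continue
        statements.append((name, code))
    return statements


# Made-up 40-character field: "|" is the MARC21 fill character, and the three
# flag positions all carry the negative code "0".
sample_008 = "|" * 29 + "000" + "|" * 8

print(flag_statements(sample_008))                      # explicit negatives kept
print(flag_statements(sample_008, drop_negative=True))  # nothing is stated
```

In the second call a consumer of the data cannot tell "not a festschrift" apart from "nobody recorded whether it is a festschrift", which is the gap the explicit statements close.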