Some background and thoughts on Metadata Mapping and Metadata Crosswalks. A collection of online sources and related projects. Comments are more than welcome, as is reuse!
2. What are crosswalks?
• Crosswalks show people where to put the data
from one scheme into a different scheme. They
are often used by libraries, archives, museums,
and other cultural institutions to translate data
to or from MARC, Dublin Core, TEI, and
other metadata schemes.
source
3. One-way only
The process of translating from one schema to another is called
metadata mapping or field mapping [source]
• Crosswalk from MARC to DC
• Crosswalk from DC to MARC
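One way to see why a crosswalk only works in one direction is to write it as a lookup table. The sketch below is a minimal illustration: the MARC tags and Dublin Core element names are simplified picks rather than an authoritative crosswalk, and the helper name `invert` is hypothetical. Because several MARC fields collapse into one DC element, inverting the table gives several candidates per DC element, so the reverse direction needs a crosswalk of its own.

```python
# Illustrative MARC -> Dublin Core table (a simplification, not an official crosswalk).
MARC_TO_DC = {
    "245": "title",
    "100": "creator",
    "110": "creator",
    "600": "subject",
    "650": "subject",
    "520": "description",
}

def invert(crosswalk):
    """Group source fields by target element to expose many-to-one mappings."""
    reverse = {}
    for source, target in crosswalk.items():
        reverse.setdefault(target, []).append(source)
    return reverse

dc_to_marc_candidates = invert(MARC_TO_DC)

# DC elements that cannot be mapped back to a single MARC field automatically.
ambiguous = {el: tags for el, tags in dc_to_marc_candidates.items() if len(tags) > 1}
print(ambiguous)
```

Any element that shows up in `ambiguous` needs a human decision (or extra rules) before a DC to MARC crosswalk can be written.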
4. Mapping Problems
• Element A in Scheme A contains X values that
need to be split up into Element 1 and
Element 2 of Scheme B
• Element A in Scheme A can take more than
one value (multiplicity of n), whereas the
equivalent Element 2 in Scheme B takes all
these values in a single field
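Both problems can be sketched in a few lines. The example below is only an illustration under stated assumptions: the element names (`creator`, `subject`, `family_name`, `given_name`, `keywords`), the "Family, Given" name convention, and the "; " separator are all hypothetical choices, not part of any particular scheme.

```python
def split_personal_name(value):
    """Split 'Family, Given' from one source element into two target elements."""
    family, _, given = value.partition(",")
    return {"family_name": family.strip(), "given_name": given.strip()}

def merge_repeated(values, separator="; "):
    """Collapse a repeatable source element into one single-valued target field."""
    return separator.join(v.strip() for v in values if v.strip())

source_record = {
    "creator": "Doe, Jane",                          # one element, two pieces of data
    "subject": ["Metadata", "Crosswalks", "MARC"],   # repeatable element
}

target_record = {
    **split_personal_name(source_record["creator"]),
    "keywords": merge_repeated(source_record["subject"]),
}
print(target_record)
```

Note that the merge is lossy if a value may itself contain the separator, which is one reason such rules usually have to be agreed per collection.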
5. Mapping Problems
• Different data formats across schemas (use of
names, other conventions, etc.)
• Element A in Scheme A is indexed but the equivalent
element in the other scheme is not
• Scheme A uses a different controlled vocabulary for
the same Element than Scheme B
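The format and vocabulary problems above usually end up as small normalization rules next to the element mapping. The following sketch assumes a hypothetical term table (the target values borrow DCMI Type names purely as an example) and assumes that the source writes dates as day/month/year. Terms without a match are collected for manual review instead of being guessed.

```python
from datetime import datetime

# Hypothetical term mapping between two controlled vocabularies.
TYPE_VOCAB_A_TO_B = {
    "photograph": "StillImage",
    "map": "StillImage",
    "audio recording": "Sound",
}

def map_term(term, table, unmapped):
    """Translate a term; record it in `unmapped` when no equivalent exists."""
    key = term.strip().lower()
    if key in table:
        return table[key]
    unmapped.append(term)
    return None

def normalize_date(value, source_format="%d/%m/%Y"):
    """Rewrite a date from the assumed source convention to ISO 8601."""
    return datetime.strptime(value, source_format).date().isoformat()

unmapped_terms = []
mapped = [map_term(t, TYPE_VOCAB_A_TO_B, unmapped_terms)
          for t in ["Photograph", "Sculpture"]]
iso_date = normalize_date("05/11/2004")
print(mapped, unmapped_terms, iso_date)
```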
6. “The more metadata experience we have, the more
it becomes clear that metadata perfection is not
attainable, and anyone who attempts it will be
sorely disappointed.
When metadata is crosswalked between two or
more unrelated sources, there will be data elements
that cannot be reconciled in an ideal manner. The
key to a successful metadata crosswalk is intelligent
flexibility. It is essential to focus on the important
goals and be willing to compromise in order to
reach a practical conclusion…”
"Metadata in Practice" Diane I. Hillmann and Elaine L. Westbrooks, eds.,
American Library Association, Chicago, 2004, p. 91.
7. Automated?
• Metadata crosswalks can be automated, but
due to the complexity of metadata standards
and the extent of customization taking place,
only a few general-purpose automated
processes exist for crosswalks
8. Mapping between formats
• Excellent resource by Michael Day of UKOLN
– http://www.ukoln.ac.uk/metadata/interoperability/
Source
9. Metadata Element Set
• Two key components
– Semantics: Definitions of the meanings of the
elements
– Content: Declarations or instructions (or rules) of
what and how values should be assigned to
elements
10. Why map metadata?
• “Interoperability is the ability of multiple
systems with different hardware and software
platforms, data structures, and interfaces to
exchange data with minimal loss of content
and functionality”
NISO (National Information Standards Organization). (2004). Understanding metadata. Bethesda, MD: NISO
Press. Available: <http://www.niso.org/standards/resources/UnderstandingMetadata.pdf>.
11. Interoperability
…on a schema level
focusing on the elements of the schemas, being independent of
any applications. Derived element sets, encoded schemas,
crosswalks, application profiles, and element registries
…on a record level
focusing on integrating metadata records through the mapping
of the elements according to the semantic meanings of these
elements. Converted records and new records resulting from
combining values of existing records
12. Interoperability
…on a repository level
focusing on mapping value strings associated with particular
elements (terms associated with subject or format elements).
The results enable cross-collection searching
Source: http://www.dlib.org/dlib/june06/chan/06chan.html
13. Interoperability on the schema level
• This is achieved through:
– Derivation
• Using elements from existing schemas or standards, as
they are
– Application Profiling
• Localizing and optimizing schemata for specific contexts
– Metadata Crosswalks
• mapping elements, semantics, and syntax from one
metadata scheme to those of another
14. Interoperability on the schema level
• This is achieved through:
– Switching Across
• When crosswalking among several schemas, it is easier to use
one central schema as a switch and crosswalk all the others
to it
– Metadata Framework
• Either developing it based on existing schemas, or
establishing it before the development of schemas and
application profiles
– Metadata Registry
• Offering a centralized access point to existing schemas, to
facilitate the development of new ones and “foster”
interoperability
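The appeal of the "Switching Across" approach can be made concrete with a small count and a toy conversion. The sketch below assumes that the switch is one of the n schemas and that every crosswalk is one-directional. The element names and the two tables are hypothetical.

```python
def direct_crosswalks(n):
    """One-way crosswalks needed to connect n schemas pairwise."""
    return n * (n - 1)

def hub_crosswalks(n):
    """One-way crosswalks needed when one of the n schemas acts as the switch."""
    return 2 * (n - 1)

# Hypothetical crosswalks: Scheme A -> switch schema, switch schema -> Scheme B.
A_TO_HUB = {"titel": "title", "urheber": "creator"}
HUB_TO_B = {"title": "dc:title", "creator": "dc:creator"}

def via_hub(record, to_hub, from_hub):
    """Convert a record in two hops; elements without a mapping are dropped."""
    hub_record = {to_hub[k]: v for k, v in record.items() if k in to_hub}
    return {from_hub[k]: v for k, v in hub_record.items() if k in from_hub}

print(direct_crosswalks(5), hub_crosswalks(5))
converted = via_hub({"titel": "Faust", "urheber": "Goethe"}, A_TO_HUB, HUB_TO_B)
print(converted)
```

The trade-off is that every conversion now passes through the switch schema, so anything the switch cannot express is lost even if source and target could both hold it.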
15. Crosswalking Approaches
• Absolute crosswalking
– You only match the elements that are 100%
equivalent and you ignore the rest
• Useful when mapping from a simpler to a more
complex schema
• Relative crosswalking
– You map all elements in a source schema to at
least one element of a target schema
• Useful when mapping from a complex to a simpler
schema
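The difference between the two approaches can be sketched with a single table that flags which matches are exact. Everything in this example is an assumption for illustration: the element names, the `exact` flag, and the `mode` parameter are hypothetical and not taken from any standard.

```python
# Hypothetical table: source element -> (target element, is it a 100% equivalent?)
CROSSWALK = {
    "title": ("title", True),
    "alternative_title": ("title", False),   # closest available match only
    "illustrator": ("contributor", False),   # closest available match only
}

def crosswalk(record, table, mode="absolute"):
    """Apply the table; 'absolute' keeps exact matches only, 'relative' keeps all."""
    target_record = {}
    for element, value in record.items():
        target, exact = table[element]
        if mode == "absolute" and not exact:
            continue  # no exact equivalent: the element is ignored
        target_record.setdefault(target, []).append(value)
    return target_record

record = {
    "title": "Faust",
    "alternative_title": "Faust: A Tragedy",
    "illustrator": "E. Delacroix",
}
absolute_result = crosswalk(record, CROSSWALK, mode="absolute")
relative_result = crosswalk(record, CROSSWALK, mode="relative")
print(absolute_result)
print(relative_result)
```

Absolute crosswalking loses data but keeps the target clean. Relative crosswalking keeps everything but blurs distinctions such as title versus alternative title.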
16. Three Meanings of Interoperability
• Semantic
– Semantic mapping is the process of analyzing the
definitions of the elements or fields to determine
whether they have the same or similar meanings
• Structural
– presence of data models or wrappers that specify the
semantic schema being used
• Syntactic (technical)
– the ability to communicate, transport, store, and
represent metadata and other types of information
between and among different systems and schemas
Source
19. [Flowchart: Europeana data contribution workflow]
• Fill Partner Request Form
• Process Partner Request Form and decide on viable aggregation route
• Send Data Exchange Agreement (DEA), or inform aggregator and liaise with potential data provider
• Sign DEA and send to Europeana (data providers or aggregators have to sign with aggregator)
• Send Data Contribution Form
• Fill Data Contribution Form and send to Europeana
• Process Data Contribution Form to enable first delivery of data
• Delivery of data via OAI-PMH or FTP: sample or full datasets (new data providers)
• Feedback on metadata structure, mandatory elements, rights statements
• Delivery of ingest ready data: full datasets (all data providers)
• Check data; feedback on metadata structure, mandatory elements, rights statements; feedback taken into account
• Ingestion of datasets fully compliant to publication policy
• Publication of the submitted datasets in Europeana
Legend: action for data provider or aggregator; action for Europeana
Timeline labels: before 5th of a month; before 15th of a month; before 21st of a month; between 21st and 30th of a month; between 10th and 20th of following month
Source: Europeana_Sounds
20. Metadata Operations
• Metadata Harvesting
– The process of collecting metadata descriptions of records
in an archive so that services can be built using metadata
from many archives [source]
• Metadata Validation
– The process of checking the structure of a metadata record
to determine whether or not the record complies with a
predefined set of criteria
• Metadata Ingestion
– The process of bringing metadata records (and/or
content), into your system [source]
– For example, you ingest metadata through harvesting [source]
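The three operations above can be sketched together. The request parameters `verb=ListRecords` and `metadataPrefix=oai_dc` are standard OAI-PMH. Everything else is an assumption made for the example: the base URL, the list of mandatory elements, and the function names. No network call is made here, only the harvesting URL is built.

```python
from urllib.parse import urlencode

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for harvesting."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"

# Assumed validation criteria: elements that must be present and non-empty.
MANDATORY = ("title", "identifier", "rights")

def validate(record, mandatory=MANDATORY):
    """Return the mandatory elements that are missing or empty."""
    return [element for element in mandatory if not record.get(element)]

def ingest(records, store):
    """Add valid records to the store; return rejected ones with their problems."""
    rejected = []
    for record in records:
        problems = validate(record)
        if problems:
            rejected.append((record, problems))
        else:
            store.append(record)
    return rejected

url = list_records_url("https://repository.example.org/oai")  # hypothetical endpoint
harvested = [
    {"title": "Faust", "identifier": "rec-1", "rights": "CC0"},
    {"title": "Untitled", "identifier": "rec-2"},              # no rights statement
]
store = []
rejected = ingest(harvested, store)
print(url, len(store), rejected)
```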
21. Metadata Operations
• Metadata Transformation
– Converting a set of metadata values from the format of a source
system into the format of a destination system [source]
• Metadata Enrichment
– The process of adding metadata to an existing metadata record,
thus creating a new record, with added-value operations
• Metadata Publishing
– The process of making metadata data elements available to
external users, both people and machines, using a formal review
process and a commitment to change control processes [source]
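Transformation and enrichment can be sketched as two small functions. The field names and the language lookup below are hypothetical. The point of the example is only that transformation renames and reshapes what is already there, while enrichment returns a new record with added values and leaves the input untouched.

```python
import copy

# Hypothetical source-to-destination field names and a small lookup for the demo.
FIELD_MAP = {"ti": "title", "au": "creator", "lang": "language"}
LANGUAGE_LABELS = {"en": "English", "de": "German"}

def transform(record, field_map):
    """Convert a record from the source format into the destination format."""
    return {field_map[k]: v for k, v in record.items() if k in field_map}

def enrich(record, labels):
    """Return a new record with an added, human-readable language label."""
    enriched = copy.deepcopy(record)
    code = enriched.get("language")
    if code in labels:
        enriched["language_label"] = labels[code]
    return enriched

source_record = {"ti": "Faust", "au": "Goethe", "lang": "de"}
transformed = transform(source_record, FIELD_MAP)
enriched = enrich(transformed, LANGUAGE_LABELS)
print(transformed)
print(enriched)
```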
28. Step 7… … …Step 1.223.124
Harvesting
Ingestion
Mapping
Validation
Transformation
Enrichment
Publishing
Metadata are published on the
target repository and are offered
also through an OAI-PMH target
And round it goes!