Metadata: An Overview for Digital Collections


Overview of the role of metadata in resource description, resource discovery and website faceting. The presentation discusses metadata consistency, granularity and types (descriptive, administrative and structural), with emphasis on technical and preservation metadata. It introduces the Dublin Core element set as well as other popular metadata schemas and their applications. It also outlines the benefits of metadata reuse and the significant role of the metadata application profile in structuring, normalizing and disambiguating metadata and in making it consistent and interoperable. Additionally, it points out the significance of controlled vocabularies and their role in disambiguating words, synonym control and consistency across collections. Finally, it introduces types of controlled vocabularies and their applications, followed by examples of issues related to inconsistency and redundancy when applying metadata using the large-scale digitization approach.


Metadata: An overview for digital collections
Marina Georgieva | Visiting Digital Collections Librarian
Nevada Statewide Large-scale Digitization Workshop
May 10, 2019

Overview
● Object description
● Provides intellectual access to object
● Enhances easy object retrieval
● Supports subject searching
● Supports faceting
● Promotes consistency
Significant from user perspective - users rely on metadata to discover and interact with content!

Granularity
● Rich vs. basic
● Collection size affects granularity
○ Large-scale projects
○ Boutique collections
○ Digital exhibits
● Material format affects granularity
○ Newspapers
○ Manuscripts
○ Photographs
Metadata record: newspaper issue
Metadata record: manuscript

Types
● Descriptive
● Administrative
○ Technical
○ Preservation
○ Rights
● Structural
Helpful resources
Caplan, Priscilla. "Understanding PREMIS." Washington, DC, USA: Library of Congress, 2009.
Miller, Steven J. Metadata for Digital Collections: A How-To-Do-It Manual. New York, NY: Neal-Schuman Publishers, 2011.

Reusing metadata
● Best practice for large-scale digitization
● Saves time and resources
● Consistency among systems

Dublin Core
The Simple DC element set
1. Title
2. Creator
3. Subject
4. Description
5. Publisher
6. Contributor
7. Date
8. Type
9. Format
10. Identifier
11. Source
12. Language
13. Relation
14. Coverage
15. Rights
http://www.dublincore.org/specifications/dublin-core/dces/

Controlled vocabularies
Benefits
● Improve resource discovery
● Allow grouping of similar objects
● Support search functionality
● Support browse functionality
● Support faceting
● Promote consistency
Role
● Disambiguate concepts
● Provide synonym control

Types
● Lists
● Synonym rings
● Authority files
● Thesauri
● Subject heading lists
Issues
● Inconsistency
● Redundancy
Examples
● AAT: Art and Architecture Thesaurus
● FAST: Faceted Application of Subject Terminology
● TGM: Thesaurus for Graphic Materials
● LCSH: Library of Congress Subject Headings
● MeSH: Medical Subject Headings
● TGN: Getty Thesaurus of Geographic Names
● ULAN: Union List of Artist Names


Metadata application profile
Overview
● Outlines metadata elements
● Sets encoding scheme
● Defines element occurrence
● Defines element obligation
● Collection specific vs. global
Benefits
● Promotes consistency
● Enforces same description rules for all digital objects
● Provides indexing guidelines
● Provides examples
● Supports interoperability


Editor's notes

[MARINA] Overview So what is metadata? We hear this term in the context of digital collections, but what does it really mean? Metadata is the description we give to any object in the repository. Metadata for digital objects is like MARC records for the books in the library catalog. It describes the resource and supports easy resource discovery when users search by subject or by any other searchable field (for example, personal or corporate names). Moreover, metadata enables a very important website feature known as faceting: facets are the filtering options presented as a side menu on the website. Faceting improves search precision because users interact directly with the metadata when they refine their search results. Finally, we should briefly mention consistency: metadata can be very powerful in promoting consistency and supporting interoperability across collections and systems. We achieve consistency by having a robust application profile, normalizing certain fields and using controlled vocabularies to describe the resources. We’ll get into the details of controlled vocabularies in a moment.
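As a rough illustration of how facets can be derived from metadata, the sketch below counts the values of one field across a few records and then filters on a selected value. The records, the field names and the helpers `facet_counts` and `refine` are invented for this example; they do not come from any particular repository system.

```python
from collections import Counter

# Hypothetical records; titles, fields and values are invented for illustration.
records = [
    {"title": "Newspaper issue", "type": "newspaper", "subject": ["Railroads"]},
    {"title": "Letter to the land agent", "type": "manuscript",
     "subject": ["Railroads", "Land use"]},
    {"title": "Neon sign at night", "type": "photograph", "subject": ["Neon signs"]},
]


def values_of(record, field):
    """Return the values of a field as a list, whether stored as str or list."""
    values = record.get(field, [])
    return [values] if isinstance(values, str) else list(values)


def facet_counts(records, field):
    """Count how many records carry each value of a field (the facet menu)."""
    counts = Counter()
    for record in records:
        # set() so a value repeated inside one record is counted once.
        counts.update(set(values_of(record, field)))
    return counts


def refine(records, field, value):
    """Keep only the records whose field contains the selected facet value."""
    return [r for r in records if value in values_of(r, field)]


subject_facet = facet_counts(records, "subject")
railroad_records = refine(records, "subject", "Railroads")
```

The counts only make a useful side menu when the values are entered consistently, which is why the notes tie faceting to normalized fields and controlled vocabularies.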
[MARINA] Granularity Before beginning any metadata work, we need to decide on the level of resource description in the collection. This is a very important decision point. Metadata can be very basic and concise, or it can be as granular as we want. The questions are what resources we have (staff, legacy metadata), what the timeline is, what type of project it is and what the format of the materials is. For example, a large manuscript project such as the Union Pacific or the Entertainment collection may be a good candidate for the large-scale approach, while a photographic digital exhibit with high research value may benefit from more granularity and an item-level approach. The material format also affects granularity. For example, newspapers need very basic metadata (usually publication related: date, volume, issue number, title), whereas graphic materials usually call for richer descriptions: subjects, names of people, corporations, places, time period. Manuscripts fall in the middle: they get richer metadata than newspapers but less description than photographs. The OCR’ed text of manuscripts adds more value to the metadata and enhances keyword searching in the documents.
[MARINA] Metadata comes in several flavors (see slide). We already briefly mentioned that descriptive metadata provides intellectual access to the content of a digital collection. It is a set of elements used to describe, catalog or index digital resources. A good resource for learning more about metadata in general, and how to do it the right way, is the practical manual by Steven Miller. It gives a comprehensive overview and many practical examples, and will help you get started. Administrative metadata is a set of elements used to administer and manage digital objects and collections. This includes the name of the institution creating the digital objects, the date digitized, the digitization equipment, the master file name, etc. Administrative metadata has two subtypes: technical and preservation metadata, and rights metadata. Technical and preservation metadata collects the information needed for the long-term preservation of the digital object and for migration to other digital formats as software and hardware change over time. The PREMIS working group defines preservation metadata as “the information a repository uses to support the digital preservation process”. After completing every digitization initiative, we should think strategically about organizing and preserving the digital master files for the long term. Digital preservation is important because digital files are fragile (they can get damaged!), technology dependent, and quickly become obsolete as technologies change rapidly (floppy disks!). Digital content requires active management to ensure ongoing accessibility. To learn more about preservation metadata and standards, you can refer to the PREMIS Data Dictionary for Preservation Metadata. Our best practice is to capture important technical metadata during the digitization process.
Our metadata profile has several technical metadata fields that record the digitization date of the object, its format, type and size, and the conversion specifications (for internal use only): the camera information and digitization specifications such as ppi and bit depth of the file. Currently, we are working on implementing some mandatory PREMIS fields for preservation purposes. Rights metadata deals with ownership, copyright, and restrictions on use and reproduction. Structural metadata is used to internally structure complex multi-page digital objects. It labels each image and relates the images to one another as parts of the same digital object, allowing users to navigate through them. We refer to these complex digital objects as parents and children.
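To make the parent and children idea concrete, here is a minimal sketch of structural metadata for a two-page object. The class names `CompoundObject` and `PageImage`, the file names and the `order` field are assumptions made for this example, not the structure of any specific system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PageImage:
    """A child: one image file belonging to a compound object."""
    label: str      # what the user sees, e.g. "Page 1"
    filename: str   # hypothetical master file name
    order: int      # position within the parent


@dataclass
class CompoundObject:
    """A parent: carries the descriptive metadata and its children."""
    title: str
    children: List[PageImage] = field(default_factory=list)

    def pages_in_order(self) -> List[PageImage]:
        """Return the children sorted by their position in the object."""
        return sorted(self.children, key=lambda page: page.order)

    def next_page(self, current: PageImage) -> Optional[PageImage]:
        """Return the page after `current`, or None at the end."""
        pages = self.pages_in_order()
        position = pages.index(current)
        return pages[position + 1] if position + 1 < len(pages) else None


letter = CompoundObject(
    title="Letter, 2 pages",
    children=[
        PageImage("Page 2", "letter_0002.tif", 2),
        PageImage("Page 1", "letter_0001.tif", 1),
    ],
)
first = letter.pages_in_order()[0]
second = letter.next_page(first)
```

The labels and the order are what let a viewer offer page-by-page navigation, even when the files were loaded out of sequence.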
[MARINA] Reusing metadata Occasionally, we need to create metadata from scratch for unprocessed collections. Usually, though, the best practice is to reuse metadata if it already exists in a finding aid. This is the preferred large-scale approach because it saves time and resources. For large-scale projects we reuse metadata from technical services and normalize it to conform to the metadata standards. The large-scale digitization process works at the folder level and mirrors the structure of the archival collection, and so does the reused metadata: the complex digital objects get parent-level metadata only, drawn from a prioritized list of terms. Sometimes, when the goal is more granularity and the project calls for richer metadata (such as digitization of photographs), we reuse metadata and build upon it by adding more subject terms and conducting more research on name authorities and on relationships among people and institutions at the item level. The graphic above demonstrates how reusing metadata works: in a nutshell, it is just reorganizing and normalizing it. The bottom graphic shows how reused metadata can have added value: more subject terms and more fields (genre and location).
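The reorganizing and normalizing step can be sketched in a few lines. Everything here is hypothetical: the layout of the finding-aid entry, the `PRIORITIZED_TERMS` list, the identifier pattern and the `reuse_entry` helper are assumptions chosen only to show the idea of turning a folder-level entry into a parent-level record.

```python
# Hypothetical folder-level entry as it might be exported from a finding aid.
finding_aid_entry = {
    "Folder Title": "  Correspondence, 1905-1907 ",
    "Box": "12",
    "Folder": "3",
    "Terms": "railroads; Land Use ;railroads",
}

# Hypothetical prioritized list: lowercased form -> preferred form.
PRIORITIZED_TERMS = {"railroads": "Railroads", "land use": "Land use"}


def reuse_entry(entry):
    """Reorganize and normalize one finding-aid entry into a parent-level record."""
    subjects = []
    for raw in entry.get("Terms", "").split(";"):
        preferred = PRIORITIZED_TERMS.get(raw.strip().lower())
        # Keep only prioritized terms, in preferred form, without repeats.
        if preferred and preferred not in subjects:
            subjects.append(preferred)
    box, folder = int(entry["Box"]), int(entry["Folder"])
    return {
        "title": entry["Folder Title"].strip(),
        # Made-up identifier pattern that mirrors the archival arrangement.
        "identifier": f"box{box:03d}_folder{folder:03d}",
        "subject": subjects,
    }


parent_record = reuse_entry(finding_aid_entry)
```

No new description is written in this step; the existing values are trimmed, de-duplicated and moved into the fields the metadata standard expects.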
[MARINA] So how do we make metadata consistent across all digital objects and collections? We use metadata application profiles! Overview The metadata application profile is documentation designed before digitization begins. It structures, normalizes, defines and disambiguates the rules for metadata creation and promotes high-quality metadata: consistent, useful and interoperable. Application profiles outline the metadata element set and provide a set of rules for each field: if the field calls for controlled terms, what the encoding scheme is; whether the field is repeatable; and whether the field is mandatory, recommended or optional. Metadata profiles can be global, governing all digital collections in an institution, or collection-specific. Sometimes, certain collections need a more specialized set of elements based on their format or content. In addition to the set of rules for each field and the encoding scheme, metadata profiles may also offer instructions for metadata creators on how to apply these rules, and provide examples as well. Sometimes institutions have two separate documents: the metadata profile with the element set, controlled vocabularies and rules, and a separate document that provides the indexing guidelines for metadata creators with specific instructions and examples. Yet another example is a project-specific cheat sheet that conforms to the profile but focuses on certain fields, offering more detail and clarification (Gayle will explain). Benefits The metadata application profile is important documentation that brings consistency across collections by enforcing the same description rules in all of them. Metadata application profiles support interoperability and are built on the metadata schema adopted by the institution. Interoperability is the ability to exchange metadata among different systems without special processing or loss of meaning.
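The per-field rules described above (obligation, repeatability, controlled terms) can also be written down in a machine-checkable form. The following is only a sketch under assumptions: the `FieldRule` structure, the three-field `PROFILE` and the allowed terms are invented for illustration and are not an actual institutional profile.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class FieldRule:
    obligation: str                         # "mandatory", "recommended" or "optional"
    repeatable: bool
    vocabulary: Optional[Set[str]] = None   # allowed terms if the field is controlled


# Hypothetical collection-specific profile; fields and terms are examples only.
PROFILE: Dict[str, FieldRule] = {
    "title": FieldRule("mandatory", repeatable=False),
    "subject": FieldRule("recommended", repeatable=True),
    "type": FieldRule("mandatory", repeatable=False,
                      vocabulary={"Text", "Still image"}),
}


def validate(record: Dict[str, List[str]]) -> List[str]:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for name, rule in PROFILE.items():
        values = record.get(name, [])
        if rule.obligation == "mandatory" and not values:
            problems.append(f"{name}: mandatory field is missing")
        if not rule.repeatable and len(values) > 1:
            problems.append(f"{name}: field is not repeatable")
        if rule.vocabulary is not None:
            for value in values:
                if value not in rule.vocabulary:
                    problems.append(f"{name}: '{value}' is not a controlled term")
    for name in record:
        if name not in PROFILE:
            problems.append(f"{name}: field is not defined in the profile")
    return problems


good = {"title": ["Letter to the land agent"], "type": ["Text"]}
bad = {"title": ["A", "B"], "type": ["photo"], "mood": ["sunny"]}
good_problems = validate(good)
bad_problems = validate(bad)
```

Here `good` passes even without a subject, because that field is only recommended, while `bad` is flagged for a repeated title, an uncontrolled type value and a field the profile does not define.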
The metadata schema must be suitable for the content of the collections and must fit the existing infrastructure. You can’t adopt MODS if your system supports DC only. Institutions can also build a custom metadata schema with a local element set, but this practice is not recommended because it does not support interoperability. For example, if a local metadata field is mapped to a standard schema such as Dublin Core, in the aggregated environment the field label will not be as explicit and the metadata may not be clear, or may not be mapped at all. If local elements are not mapped, they will not be displayed in the aggregated repository. If they are mapped from very specific local fields, the data may be unclear to users. For example, the local field “Neon sign type” can be mapped to the Dublin Core subject field, but the label in the aggregated repository will be ‘subject’, and the values will not be as explicit, especially if they come from a local controlled vocabulary. Or, as you see in the example on the slide, three different local fields map to DC spatial coverage, and the values in these fields may be unclear.
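A small crosswalk sketch shows both effects described here: several local fields collapsing into one Dublin Core element, and unmapped fields disappearing. Apart from “Neon sign type”, the local field names, the sample values and the `to_dublin_core` helper are assumptions made for this example.

```python
# Hypothetical crosswalk from local field labels to Simple Dublin Core elements.
CROSSWALK = {
    "Neon sign type": "subject",
    "City": "coverage",
    "Neighborhood": "coverage",
    "Street address": "coverage",
    # "Internal notes" is deliberately left unmapped.
}


def to_dublin_core(local_record):
    """Map a local record to DC; return the mapped record and the dropped fields."""
    mapped, dropped = {}, []
    for label, value in local_record.items():
        element = CROSSWALK.get(label)
        if element is None:
            dropped.append(label)   # unmapped fields never reach the aggregator
            continue
        mapped.setdefault(element, []).append(value)
    return mapped, dropped


# Invented sample values.
local_record = {
    "Neon sign type": "Pylon",
    "City": "Las Vegas",
    "Street address": "Fremont Street",
    "Internal notes": "Rescan needed",
}
dc_record, dropped_fields = to_dublin_core(local_record)
```

After mapping, the aggregator sees only `subject` and `coverage`: the value "Pylon" no longer says that it is a sign type, and the city and the street sit side by side under one label.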
[MARINA] Dublin Core I mentioned Dublin Core as one of the metadata standards. It is a small element set used to describe digital resources. Simple DC has 15 elements. Each of them is optional and repeatable. There is no prescribed order in which to present or use the elements. Institutions typically take some or all of the elements and sometimes combine them with local descriptive elements and technical metadata fields. This is how the metadata application profile is created. In an aggregated repository where multiple institutions share their data, all significant fields have to be mapped to DC so the values can be harvested and displayed.
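For readers who want to see what a Simple DC record can look like in practice, here is a minimal sketch that serializes a record as XML with Python's standard library. The 15 element names and the namespace URI are those of the Dublin Core element set; the wrapper element `record`, the `to_xml` helper and the sample values are assumptions for illustration, since real harvesting formats define their own wrappers.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def to_xml(record):
    """Serialize a dict of element name -> list of values as DC-namespaced XML."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # hypothetical wrapper element
    for name, values in record.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"{name!r} is not a Simple Dublin Core element")
        for value in values:     # every element may repeat
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Only two of the 15 elements are used, since all of them are optional.
xml_text = to_xml({
    "title": ["Letter to the land agent"],
    "subject": ["Railroads", "Land use"],
})
```

The record is valid with any subset of the elements and with repeated elements, which is exactly the flexibility an application profile then narrows down with its own rules.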
[MARINA] Controlled vocabularies help users retrieve sets of meaningfully related resources rather than single random resources on a topic. Controlled vocabularies allow grouping of digital objects that share significant characteristics. They are fundamental to the search and browse functionality. These functions work well only if metadata creators enter consistent, standardized values in the metadata fields. Role One of the main goals of controlled vocabularies is to disambiguate the meaning of a word, because some words refer to different concepts, places or people. For example, ‘bank’ can be a financial institution or a container; ‘mercury’ can be the metal or the planet. Controlled vocabularies help people distinguish between the different meanings. The other important role is synonym control. Many natural-language words refer to the same thing or concept. Controlled vocabularies identify the preferred terms used for metadata and explicitly link them to their synonyms, which are used for cross-reference.
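Both roles can be sketched with a tiny lookup table. The vocabulary below is a made-up local list, not an excerpt from any published vocabulary, and the qualifier style "Mercury (Planet)" and the helper names are assumptions for this example.

```python
# Hypothetical local vocabulary: each preferred term lists its cross-references.
VOCABULARY = {
    "Automobiles": ["cars", "motorcars", "autos"],
    "Mercury (Planet)": [],
    "Mercury (Metal)": ["quicksilver"],
}

# Build a lookup from any known form (lowercased) to its preferred term.
LOOKUP = {}
for preferred, synonyms in VOCABULARY.items():
    for form in [preferred, *synonyms]:
        LOOKUP[form.lower()] = preferred


def preferred_term(entered):
    """Synonym control: return the preferred term for a value, or None if unknown."""
    return LOOKUP.get(entered.strip().lower())


def candidates(entered):
    """Disambiguation: list the preferred terms that start with the entered word."""
    prefix = entered.strip().lower()
    return sorted(term for term in VOCABULARY if term.lower().startswith(prefix))


car_term = preferred_term("Cars")
mercury_exact = preferred_term("mercury")
mercury_options = candidates("mercury")
```

An entry such as "Cars" resolves to the single preferred term, while the ambiguous word "mercury" has no exact match and instead returns both qualified terms for the metadata creator to choose from.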
[MARINA] Types Controlled vocabularies come in several types, but here we’ll focus on the types most commonly used in digital collections: thesauri, authority files and subject heading lists. For the sake of consistency and interoperability, the use of established controlled vocabularies is recommended, although in some cases the content can be very specific and require a local vocabulary. Here we’ve listed some commonly used vocabularies. Some of them are linked-data ready and provide persistent identifiers for each term. Issues Some of the issues that occur with controlled vocabularies are inconsistency and redundancy. Inconsistency arises during metadata creation: the people who describe the objects (especially graphic ones) are human and may be subjective when selecting terms. Even though the terms come from a controlled authority file, this subjectivity leads to inconsistencies. Redundancy also arises during metadata creation: metadata creators select from a list of controlled terms and sometimes pick synonymous terms to describe the same object, which is unnecessary and may lead to retrieval problems. For example, the TGM vocabulary has multiple terms for photos: “photographs”, “photographic prints”, “color photographs”, “Black & white photographs”. Best practice is to decide which term is best for a specific resource type and stick to it rather than using multiple synonymous terms.
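A simple check can catch this kind of redundancy before records are published. The sketch below flags a field that uses more than one term from a group of overlapping terms. The grouping itself is an assumption built from the four terms quoted above, and `redundant_terms` is a hypothetical helper.

```python
# Groups of overlapping terms; this grouping is an assumption for illustration,
# built from the four terms quoted in the notes.
OVERLAPPING_GROUPS = [
    {"photographs", "photographic prints", "color photographs",
     "Black & white photographs"},
]


def redundant_terms(values):
    """Return each set of overlapping terms that appear together in one field."""
    found = []
    for group in OVERLAPPING_GROUPS:
        used = sorted(group.intersection(values))
        if len(used) > 1:
            found.append(used)
    return found


flagged = redundant_terms(["photographs", "color photographs", "Neon signs"])
clean = redundant_terms(["photographs", "Neon signs"])
```

A report like this does not decide which term is right; it only points the metadata creator back to the project decision about the one term to use for that resource type.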
