ALL TEH METADATAS
   Re-revisited
  2013 code{4}lib Meeting
    February 13, 2013
         Esmé Cowles
      Matthew Critchlow
      Bradley Westbrook
Overview
• Needs assessment and proposed solution (Needs Assessment)
• Data modeling (Data Model Process)
• Tool implementation (Implementation)
Needs Assessment
   Brad Westbrook
Need One: More consistent data
Need Two: Maintain syntax of hierarchical subjects
Need Three: Improve support for complex objects
Need Three: Improve support for complex objects (continued)
Need Four: Align more strongly with DL community

• Make sure UCSD RDF is public facing
   – Use vocabularies in the public
   – Make UCSD vocabularies public


• Develop technology stack
   – Utilize contributions from non-UCSD sources
   – Contribute to non-UCSD endeavors
Data Model Process
   Matt Critchlow
Project Overview

Research Data Curation Pilot Deadline: June, 2013

Timeline: July 16, 2012 – Oct 29, 2012

Deliverables
• Abstract Data Model
• OWL/RDF Ontology
• Data Model Extension Guidelines

Team
Metadata Analyst: Arwen Hutt, Bradley Westbrook
IT: Esmé Cowles, Matt Critchlow, Longshou Situ
User Stories


As an administrative unit manager, I want to indicate any
external versions or descriptions of an object that may be of
probable importance to a user

As a user, I want to know what collection(s) an object belongs to

As a DAMS manager, I want to know what administrative unit an
object belongs to
Abstract Model – High Level

[Diagram: high-level entities Unit, Collection, Related Resource, Object, Component, and File. Collection, Component, Object, Related Resource, and File each appear more than once in the diagram.]
Abstract Model

[Figure: DAMS Data Model, Entity-Relationship Diagram, 2013-02-07. Legend: class linking, class inheritance. Classes shown: DAMS Event, DAMS Resource, Related Resource, Collection, Provenance Collection, Provenance Collection Part, Object, Component, File, Unit, Title, Subject, Name, Relationship, Role, Date, Note, Language, Cartographics, Vocabulary, Vocabulary Entry, Source Capture, Rights, Copyright, License, Statute, Other Rights, Rights Action, Restriction, Permission.]
Data Dictionary


       Title (title 1-m)

   Copyright (copyright 1)

  Language (language 1-m)

 Administrative Unit (unit 1)

Relationship (relationship 0-m)
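
The cardinalities in the data dictionary above can be checked mechanically. The following is a minimal sketch, assuming that "1" means exactly one value, "1-m" one or more, and "0-m" zero or more. The function names and the record layout (a dict of property name to list of values) are hypothetical and are not taken from the DAMS code.

```python
import re
from typing import Dict, List, Tuple

# Property names and cardinalities copied from the slide.
# How they are interpreted below is an assumption.
DATA_DICTIONARY: Dict[str, str] = {
    "title": "1-m",
    "copyright": "1",
    "language": "1-m",
    "unit": "1",
    "relationship": "0-m",
}


def parse_cardinality(spec: str) -> Tuple[int, float]:
    """Turn '1', '1-m', or '0-m' into a (minimum, maximum) pair."""
    match = re.fullmatch(r"(\d+)(?:-(\d+|m))?", spec)
    if match is None:
        raise ValueError(f"unrecognized cardinality: {spec!r}")
    low = int(match.group(1))
    high_text = match.group(2)
    if high_text is None:
        return low, low  # a bare number means exactly that many
    high = float("inf") if high_text == "m" else int(high_text)
    return low, high


def check_record(record: Dict[str, List[str]]) -> List[str]:
    """Return one message per property whose value count is out of range."""
    problems = []
    for prop, spec in DATA_DICTIONARY.items():
        low, high = parse_cardinality(spec)
        count = len(record.get(prop, []))
        if count < low:
            problems.append(f"{prop}: expected at least {low}, found {count}")
        elif count > high:
            problems.append(f"{prop}: expected at most {high}, found {count}")
    return problems
```

A record with no title and two copyright statements would produce two messages; a record that satisfies every row returns an empty list.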
Ontology
Thing 1, Thing 2

[Diagram: Shared Entities. Two records, <dams:Object> bb1809708v and <dams:Object> bb16050824, are drawn on either side of a column of shared entities. Arcs are labeled dams:title, dams:date, dams:subject, and dams:relationship. Each object has a <dams:Relationship> node, and arcs labeled dams:name and dams:role run from each <dams:Relationship> toward the shared <mads:PersonalName> and <dams:Role> nodes.]

Nodes shown:
• <dams:Title> "Adm. Nimitz Joins Red Tape Cutters"
• <dams:Date> <dams:type> "creation" "1940"
• <mads:Topic> "Midway, Battle of, 1942" <owl:sameAs> <lcsh:sh85085053>
• <mads:Topic> "Commercial art--United States" <owl:sameAs> <lcsh:sh85028914>
• <mads:PersonalName> "Geisel, Theodor Seuss, 1904-1991" <owl:sameAs> <lcnaf:n91084846>
• <dams:Role> "creator" <owl:sameAs> <marcrel:cre>
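
The shared-entity idea on this slide can be illustrated with plain tuples. This is a sketch only, assuming that each object reaches the name through its own relationship node. The node identifiers (`rel:1`, `name:geisel`, and so on) and the helper function are hypothetical; the object identifiers, property names, and the lcnaf and marcrel links are taken from the slide.

```python
# Hypothetical identifiers for the shared nodes.
GEISEL = "name:geisel"
CREATOR = "role:creator"

triples = [
    # First object and its relationship node.
    ("bb1809708v", "dams:relationship", "rel:1"),
    ("rel:1", "dams:name", GEISEL),
    ("rel:1", "dams:role", CREATOR),
    # Second object points at the same name and role nodes.
    ("bb16050824", "dams:relationship", "rel:2"),
    ("rel:2", "dams:name", GEISEL),
    ("rel:2", "dams:role", CREATOR),
    # Links out to external authorities, as on the slide.
    (GEISEL, "owl:sameAs", "lcnaf:n91084846"),
    (CREATOR, "owl:sameAs", "marcrel:cre"),
]


def objects_linked_to(name_node, graph):
    """Find every object whose relationship node points at name_node."""
    relationships = {s for s, p, o in graph if p == "dams:name" and o == name_node}
    return sorted(
        s for s, p, o in graph
        if p == "dams:relationship" and o in relationships
    )


print(objects_linked_to(GEISEL, triples))
```

Because the name is stored once and referenced twice, a correction to the name node or its authority link would apply to both objects at the same time.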
Implementation
  Esmé Cowles
DAMS Repository


• New version of our lightweight repository
  – Metadata in triplestore
  – Files on disk or cloud storage
• Explicit structural metadata
• Native REST API
• Fedora REST API (partial)
DAMS Manager


•   Separate Java webapp
•   Ingest, batch operations
•   Uses DAMS Repository REST API
•   Functionality moved into the repository
    – Characterization (JHove)
    – Fixity checking
    – Derivatives (ImageMagick)
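
The slide lists fixity checking without giving details. As a rough illustration of the general technique, and not of the DAMS Repository's own code, a fixity check recomputes a file's checksum and compares it with the value recorded at ingest. The choice of SHA-256 and both function names are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute a SHA-256 hex digest, reading the file in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded_checksum: str) -> bool:
    """Return True when the file still matches its recorded checksum."""
    return sha256_of(path) == recorded_checksum.lower()
```

Reading in chunks keeps memory use flat for large content files such as audio or video.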
DAMS Public Access System


• Old frontend is unsustainable
• New frontend in Hydra
  – Backed by DAMS Repo, not Fedora
• Hydra platform and community
Timeline


•   Started 2 months ago
•   Code sprint in January with cbeer and jcoyne
•   March: Beta release with research data
•   Spring: Migrating existing content
•   Summer: Production release
One More Thing


• We’ve talked about DAMS for years...
• Now we have code to share


 http://github.com/ucsdlib/
        @escowles @mattcritchlow
         bdwestbrook@ucsd.edu

Editor's Notes

  1. Hi. I am Brad Westbrook. This is Matt Critchlow, and this is Esmé Cowles. We would like to thank you all for selecting us to present this morning. We are going to tell you part of a continuing story. Declan Fleming told an earlier part of this story last year, and that part concerned how UCSD had implemented RDF for its Digital Asset Management System beginning in 2004 and how it hopes to move its DAMS into the linked data environment. In our part of the story, which is still more toward the beginning than the end, we are going to describe some of the barriers we found standing between the UCSD DAMS and the linked data terminus envisioned in Declan’s previous talk, and what measures we are taking to get past those barriers.
  2. I am going to briefly describe four key areas that we decided needed to be addressed to improve the functionality and sustainability of the UCSD DAMS.
  3. The most basic need we have is to improve the consistency of the metadata in the DAMS.
[ANIMATION ONE] We acquire highly variable content files and metadata. **** Metadata may come in the form of comma-separated values, Excel spreadsheets, MODS exports from the Archivists’ Toolkit, and MARCXML generated from our ILS. Some of the metadata we acquire was created according to established content and format standards. Other metadata has no standard behind it. Some metadata is very thorough, including titles, statements of responsibility, dates of creation or issuance, notes, and even subject and name headings. Other metadata can be very scanty and has to be supplemented as well as possible during the ingest process.
[ANIMATION TWO] **** We normalize content files to supported file formats through the file ingest process, and we normalize acquired descriptive data through what we call the assembly plan process. As Declan described in his presentation last year, UCSD’s assembly plan process is a specification for stamping out object records for a class of objects, the class usually being defined by provenance. The assembly plan is engineered to ensure that all objects in the DAMS have the metadata elements and formats necessary to support rudimentary object interoperability within the DAMS and, moreover, that a class of objects is described similarly. The assembly plan as it is handed to ETL staff is expressed in XML and references the MODS, PREMIS, MIX, and METS schemas. ETL staff, of which we have four, then use XSLT to transform the XML into RDF.
[ANIMATION THREE] Since the start of our work in 2005, the transformation from XML to RDF has been highly artisanal. **** Each ETL staff member has created his or her own stylesheet for producing the transformations. Unfortunately, during this time there was no explicit data model to help control the stylesheets for uniformity.
Consequently, the uniformity among objects established in the assembly plan process is erased in the RDF transformations. These differences impact object interoperability, UI display, and, of course, user experience.
  4. The second need we identified was to discover a way to maintain the syntax of hierarchical name and subject headings inherited from MARC records. MARC records are the source for approximately 70% of the DAMS object records. As this illustration indicates, the syntax of subject headings from a MARC record is corrupted in the RDF transformations. In this particular case, only one subject heading of the eight in the source MARC record is correctly transformed to RDF. All the rest are scrambled, with the primary term being demoted one or more levels in the hierarchy. One subject heading, in fact, acquires information not present in the source record. Needless to say, this problem also impacts searching and interpretation of the digital assets in the DAMS. It also reflects poorly on the Library’s metadata staff.
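To make the heading-syntax problem concrete, here is a minimal sketch of one way to keep a hierarchical heading intact: treat the MARC subfields as an ordered list of components and never reorder them. This is an illustration only, not the UCSD transformation code; the function names and the sample heading are assumptions.

```python
# Hypothetical sketch: preserve the order of MARC subject subfields so the
# primary term stays first. Names and sample data are illustrative only.

def heading_components(subfields):
    """Return the heading parts in record order, skipping empty values."""
    return [value.strip() for _code, value in subfields if value.strip()]


def heading_label(subfields):
    """Join the ordered parts with the conventional double-dash separator."""
    return "--".join(heading_components(subfields))


# (subfield code, value) pairs in the order they appear in the source record
example_650 = [
    ("a", "World War, 1939-1945"),
    ("v", "Caricatures and cartoons"),
]

components = heading_components(example_650)
label = heading_label(example_650)
```

Because the components travel as an ordered list rather than as unordered properties, the primary term cannot be demoted below its subdivisions during transformation.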
  5. A third area we found needing improvement is our presentation of the many complex objects in the DAMS. A complex object is a multi-part object with components and metadata at component levels. The components may be few or many. The complex object record includes two structure maps. One is a physical map that simply represents the sequence of the files comprising the object. The other is a logical map that correlates components with any descriptive or rights metadata they might have. The current DAMS digital object viewer provides only the physical file map. In this example it is a scrollable list of 24 components. The audio file reflected in the viewing pane is the file for the first of the 24 components. But there is no way within the digital object viewer for a user to know that the audio file contains a discussion by three musicians about another musician’s technique.
  6. But that metadata is in the object record, as this XML representation of the second component indicates. The descriptive metadata includes a title and names of presenters, as well as an indication of the exact day of the multi-day conference on which the presentation occurred. This is all important information for navigating the 24 parts of the object. Because of this limitation, patrons and reference librarians have found the digital object viewer very difficult, in fact nearly impossible, to use for navigating multi-part objects. They can move through the nodes, but they have no idea what a node contains until they activate the file. To help patrons understand and navigate complex objects, our public service librarians have often had to resort to using the Archivists’ Toolkit, which is only deployed at UCSD as a staff-side authoring tool for complex digital object metadata. This extra step is very inconvenient for patrons and staff and leads to considerable frustration.
  7. The fourth area of need is to align our DAMS more strongly with the digital library community. We want to do that on a micro level by making sure our RDF is outward facing, so we can utilize public vocabularies and also make our own vocabularies available for use by others. And we want to do it on a macro level by making sure we are developing a technology stack that can be shared with others and that others can contribute to, should they like. Our past methods have been tightly insular and, were they continued, would force us to make every vocabulary and technical advance by ourselves. We see strengthening our alignment with the community as the most important pathway toward a cost-effective, general-purpose, and sustainable digital asset management service for UC San Diego and for others. These needs, especially the desire for stronger community alignment, triggered the data modeling and implementation work over the last 10 months that Matt and Esmé will now describe.
  8. In addition to the needs Brad just described, we found ourselves facing a very real, hard deadline for our Research Data Curation Pilot Program. This deadline is what ultimately crystallized the need to prioritize this project. Our Digital Library steering committee gave us a timeline of three months and asked us to provide an abstract data model, an OWL/RDF ontology, and documentation for extending and modifying the data model over time. The team is a balance of metadata analysts and IT development staff, which addresses the previous gaps in understanding and communication that Brad described, so that nothing is lost in translation in this project or going forward. It is also a cross-section of RCI and the DLP, which is critical: the scope of our DAMS has grown and is no longer limited to internal library collections.
  9. So with the goal of representing the needs of both the Library and the research community on campus, our team decided to generate a user stories document for the data model project. Focusing here on the Objects, you can see various roles called out, like administrative unit manager, DAMS manager, and end user. After an initial pass at these user stories, we moved on to creating the requested deliverables themselves.
  10. The first phase of the project involved the creation of an abstract data model as our domain definition. We began with a high-level entity relationship diagram, calling out the core base classes (Collection, Object, Component, and File) and the various relationships between them. If we had stopped here, we probably would have finished in a few weeks. But it would also have been a pretty useless data model.
  11. The next step was to create a fleshed-out abstract model. The diagram has the same core classes from the previous slide, but we have added modeling for rights, events, and descriptive metadata, along with their relationships and inheritance structure. Our current data model, as Declan talked about last year, is very flexible in the definition, re-use, and modification of predicates for our RDF triples, which gives us the ability to make global changes with relative ease, and this capability will be carried forward into the new system. But the RDF triples are almost all stored inline with each object, and that’s something we wanted to change. Take Names for example, particularly the Name/Role relationship. The main UCSD Library building, shown on our first slide, is called the Geisel Library. It’s named after Theodore Geisel, who we all know and love as Dr. Seuss. As a result of his generous donations, we have a number of Dr. Seuss collections in our DAMS. Many of these objects each have their own set of RDF triples for the name Theodore Geisel and the role, creator. We wanted to move towards a linked data approach where the Geisel name could stand on its own as an object with an ARK, our permanent UID, that can also include external authority references like LOC SH and NAF. That name object could then be referenced by collections, objects, and components as needed. So we created our abstract model with this goal in mind. The next step was mapping this ER diagram into a data dictionary.
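The difference between inline name triples and a shared name object can be sketched in a few lines. This is only an illustration of the linked approach described above, not the DAMS implementation; the class names, the registry, and the placeholder ARK strings are all assumptions.

```python
# Hypothetical sketch of "name as its own object": objects hold a reference
# (an ARK) to a shared name record instead of an inline copy of the name.
from dataclasses import dataclass, field


@dataclass
class NameRecord:
    ark: str                      # placeholder identifier, not a real ARK
    label: str
    authority_refs: list = field(default_factory=list)  # external authority links


@dataclass
class DamsObject:
    ark: str
    title: str
    relationships: list = field(default_factory=list)   # (name ARK, role) pairs


NAMES = {}  # shared registry of name records, keyed by ARK


def register_name(name):
    NAMES[name.ark] = name
    return name.ark


def creators_of(obj):
    """Resolve creator labels through the shared registry."""
    return [NAMES[ark].label for ark, role in obj.relationships if role == "creator"]


geisel = register_name(NameRecord("ark:/00000/name-example", "Theodore Geisel"))
war_cartoon = DamsObject("ark:/00000/object-1", "Example cartoon", [(geisel, "creator")])
advert = DamsObject("ark:/00000/object-2", "Example advertisement", [(geisel, "creator")])

# One edit to the shared record is seen by every object that references it.
NAMES[geisel].label = "Dr. Seuss"
```

With inline triples, the same correction would have to be repeated in every object record that mentions the name.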
  12. The data dictionary is represented as an enormous table of class hierarchy, properties, controlled vocabulary references and lists, constraints, and notes. Just a couple of comments about it. First, while our objects can certainly be very richly described, as you’ve seen on previous slides already, we wanted to allow for a wide range of description. So in our new data model, an object only needs to have a title, a copyright statement, and a language, and be associated with an Administrative Unit. Note that an Object doesn’t have to have a file. This was a relatively recent requirement that surfaced in our research data projects, and one we wanted to call out and support. Traditionally, we have always required an object to have a file as part of its definition. Second, this document was the transition point between the abstract modeling and creating an ontology. As a result, this is where we spent a lot of our time in the project. It was at this point that we began looking closely at standards we were familiar with, such as MODS, PREMIS, MIX, and Dublin Core, and then began looking for ontologies built on these standards that we could use for our data model. Continuing with our Relationship example, we defined that an Object can have zero to many Relationships as Name/Role pairs. The questions we had to start asking ourselves then were: How do we model the Names, Roles, and Relationships from an ontology we know of? Or are there other ontologies we should be considering instead? Or is there nothing that directly covers a particular need, so that we need to, at least temporarily, model it ourselves? These were the kinds of questions we needed answers to as we started creating our ontology in Protégé.
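The minimal-description rule from the data dictionary can be expressed as a small check. This is a sketch under assumptions: the field names (`title`, `copyright`, `language`, `unit`, `files`) are invented for illustration and are not the property names in the actual ontology.

```python
# Hypothetical sketch of the minimum requirements for an object record:
# title, copyright statement, language, and administrative unit are required;
# a file is optional. Field names are illustrative only.

REQUIRED_FIELDS = ("title", "copyright", "language", "unit")


def missing_required(record):
    """Return the required fields that are absent or empty, in a fixed order."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# A research-data style record with no file attached still passes.
record_without_file = {
    "title": "Example dataset description",
    "copyright": "Example copyright statement",
    "language": "eng",
    "unit": "Example administrative unit",
    "files": [],
}

incomplete_record = {"title": "Untitled draft", "files": ["example.tif"]}
```

Calling `missing_required(incomplete_record)` lists what still has to be supplied before the record meets the minimum, regardless of whether a file is present.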
  13. With Names, we were fortunate to find our answer in the MADS ontology. MADS stands for Metadata Authority Description Schema and is available on the Library of Congress website. It has a specification for Names, as well as Subjects, that aligned really well with our needs. So we were able to define in our ontology that a Relationship consists of one MADS name type (Conference, Corporate, Family, or Personal Name) and at least one Role. Our Roles controlled vocabulary is the MARC relator codes, also available on the LOC website. With rights metadata we were less fortunate. We found the PREMIS draft ontology, read through it, and pulled it into Protégé to review it. But it is still in draft form, and we need a working solution now, not just because of our deadline, but also because our rights metadata drives our access model in the DAMS. So we found ourselves striking a balance between leveraging a community standard ontology like MADS and creating a PREMIS-like rights metadata implementation locally. To be clear, our intention is that when the PREMIS ontology is in a production state, we will adopt it in place of our local implementation. These are just two examples, but they’re very indicative of how the ontology creation process went for us. I want to close, before I hand things off to Esmé, with a little more Dr. Seuss.
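The Relationship constraint (one MADS name type plus at least one role) can be sketched as a small validating class. This is an illustration, not the ontology itself; the class and field names are assumptions, and the role value `cre` is the MARC relator code for creator.

```python
# Hypothetical sketch of the Relationship rule: exactly one name of a MADS
# name type and at least one role. Class and field names are illustrative.
from dataclasses import dataclass

MADS_NAME_TYPES = {"ConferenceName", "CorporateName", "FamilyName", "PersonalName"}


@dataclass
class Relationship:
    name_type: str   # one of MADS_NAME_TYPES
    name: str
    roles: list      # MARC relator codes, e.g. "cre" for creator

    def __post_init__(self):
        if self.name_type not in MADS_NAME_TYPES:
            raise ValueError(f"unsupported name type: {self.name_type}")
        if not self.roles:
            raise ValueError("a relationship needs at least one role")


seuss_creator = Relationship("PersonalName", "Dr. Seuss", ["cre"])


def is_valid(name_type, name, roles):
    """Convenience wrapper: True when the constraint is satisfied."""
    try:
        Relationship(name_type, name, roles)
        return True
    except ValueError:
        return False
```

Rejecting a relationship with no role at construction time mirrors the "at least one Role" cardinality in the definition above.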
  14. Two images created by Dr. Seuss that are from two different collections in our DAMS. Left: Dr. Seuss Went to War Collection. Right: Dr. Seuss Advertising Artwork Collection. Each has metadata properties that make it distinct, but they also reference entities that can be shared. Going back to the goal of a linked data solution, we can now define the following in the new data model.
  15. In this diagram there are properties specific to each object: title on the left, date on the right. There are also shared entities that can be referenced by any object in the DAMS. Each has a mads:Topic with an LOC SH, and both share the same relationship: MADS Personal Name, Dr. Seuss; Role, Creator. So in the span of a few months, we ended up with a new data model that we’re pretty proud of. Esmé is going to catch you up on what we’ve been working on since November.
  16. Implementation phase includes people from our digital library program, research data curation program, metadata analysts, and developers.
  17. Our data model got a little more complex, so we knew we’d need to make some pretty substantial changes.
  18. New version of our repository to handle the new data model. Same architecture: triples and files. Cloud storage. Big change is consolidating our servlets into a coherent REST API. Partial clone of the Fedora REST API.
  19. Some changes to our administrative app (ingest, batch). Big change is using the repository REST API. Used to talk to files and triples directly. Functionality moved into the repo, triggered by REST API calls: file characterization with JHOVE, fixity checking, derivative generation.
  20. Biggest changes to the frontend. Our current frontend is a heavy, client-side JavaScript application. Have to maintain everything ourselves, all data is retrieved using AJAX so search engines don’t index our content, generally unsustainable. Chose Hydra because of the platform and the community. New to Rails and really liking it. Really vibrant community.
  21. We’ve gone from nothing to having a rough but working app in about 2 months. Code sprint with Chris Beer and Justin Coyne really helped get us within striking distance of a working system. Beta release limited to research data in two weeks. On track for migrating this spring, and release this summer.
  22. We’ve been coming to c4l since the beginning and talking about DAMS. Now I’m very happy we have actual code to share.