Ce diaporama a bien été signalé.
Nous utilisons votre profil LinkedIn et vos données d’activité pour vous proposer des publicités personnalisées et pertinentes. Vous pouvez changer vos préférences de publicités à tout moment.

Yann Nicolas - Elag 2018 : From XML to MARC

78 vues

Publié le

FROM XML TO MARC
RDF BEHIND THE SCENES
Yann NICOLAS / Abes
Elag 2018 - Prague

Publié dans : Formation
  • Soyez le premier à commenter

  • Soyez le premier à aimer ceci

Yann Nicolas - Elag 2018 : From XML to MARC

  1. 1. RDF BEHIND THE SCENES FROM XML TO MARC www.abes.fr Yann NICOLAS @071625348
  2. 2. ABES = French national agency for HE libraries www.sudoc.abes.fr General union catalog www.idref.fr Authority data www.theses.fr Dissertations (completed or in progress) www.calames.abes.fr Archives and manuscripts
  3. 3. marc marc marc RDF A B C X Y Z Publishermetadata Catalogues, LOD,discotools… Catalogues
  4. 4. RDF XML • ONIX • JATS • Etc. KBART Etc. MARC LOD Etc. SHARINGNORMALIZINGCOLLECTING Not yet #grrr
  5. 5. RDF XML • ONIX • JATS • Etc. KBART Etc. MARC LOD Etc. ENRICHING Other bib. Data + reference data (authority data…)
  6. 6. WHAT WE DO Collecting, Normalizing, Enriching, Sharing WHAT FOR Contributing to the web of open, linked and high quality data of scholarly content ON WHICH STUFF France related metadata sets + What is not (yet) in the LOD
  7. 7. We need to be flexible. Data model Workflows Technological choices
  8. 8. ISBD RDA VCARD VIVO SKOS FOAF DCTERMS BIBLIO ontology RDF VOCABULARIES + AD HOC CLASSES & PROPERTIES EXTENSIBLE DATA MODEL …
  9. 9. FLEXIBLE WORKFLOWS
  10. 10. Basic operations on data > SQL & SPARQL Queries + Stored procedures Workflow engine > Stored procedures in a relational database Administration interface for data experts >
  11. 11. TECHNOLOGICAL CHOICES
  12. 12. WHY ORACLE ? • Need to program close to the data (for performance reasons) • PL SQL • Java inside Oracle • Our dependencies • Not being dependent on a specific RDF DB  no Virtuoso PL SQL • We are already dependent on ORACLE… (and have the skills) • Joining with more data already in Oracle • Mix of SPARQL and SQL (or XQUERY) • Management of errors  logs in tables, easy to query
  13. 13. 4 use cases
  14. 14. print isbn elec. isbn title date url <OUP_ONIX/> about each print book print manifestation electronic manifestation work print isbn is the key to merge the two graphs MARC for print MARC for ebook XML RDF MARC KBART about ebooks From SQL to RDF www.bacon.abes.fr
  15. 15. <XML/> about each document Person with local ID Person with IdRef ID Work IdRef = authority database for French HE XML RDF VIAF Dump of links  RDF NKC DnB VIAF Link computed by Qualinca (home made algos)
  16. 16. <XML/> about each Nature article Nature Subject Article F-MeSH is a subset of IdRef XML RDF www.idref.fr MeSH Subject French MeSH Subject
  17. 17. Springer Concept LCSH Concept <SPRINGER_XML/> about each chapter Springer Concept LCSH Concept Chapter Springer Concept ID is the key to merge the two graphs XML RDF Via Open Refine RDF export skos:closeMatch Mapping
  18. 18. Some concerns…
  19. 19. RDF Publishers data MARC Live shared cataloguing in Sudoc AS CATALOGERS ARE CATALOGING, SHOULD WE FEED BACK THE RDF BLACK BOX ? LOD
  20. 20. RDF XML • ONIX • JATS • Etc. MARC LOD Etc. RDF/XML HUMAN and MACHINE BOTTLENECKS NTRIPLES XSLT JENA SPARQL INSERT via JDBC
  21. 21. RDF MARC LOD Etc. Publishermetadata OK… BUT WHICH PUBLISHER DATA PROCESSING ? ? NEED FOR A GLOBAL COLLABORATION !  NEED FOR A GLOBAL @TODO LIST ! ?

×