LKG Editor Dev

A presentation of LD4P2 work on linked data editor development, and lookup services in particular. Presented at US2TS, 11-13 March 2019.

  1. Library Knowledge Graph Editor Development
     Simeon Warner (Cornell) https://orcid.org/0000-0002-7970-7855
     Reporting work from the LD4P2 project including contributions from: Steven Folsom, Huda Khan, Lynette Rayle, Jason Kovari, Tim Worrall (Cornell), Astrid Usong (Stanford), David Eichmann (Iowa), and others…
     US2TS 2019, March 11-13, Duke University, Durham, NC
  2. Library Knowledge Graph ~ Library Catalog
     #1 - Facilitate discovery of resources (find, identify, select, obtain)
     #2 - Facilitate management of resources
  3. Library Cataloging Background
     Many practices developed in the era of card catalogs
     MARC format developed in the 1960s
     Long history of linking entities, albeit with authorized names rather than identifiers. Used for limited forms of semantic browse
     LD4 work and broader community moving from MARC→RDF, from authorized names to URIs, and toward better linking with the web
     Henriette Avram (1919–2006), American computer programmer and systems analyst who developed MARC https://en.wikipedia.org/wiki/Henriette_Avram
  4. Production Scale
     Cornell catalog has ~9M records (~8M physical, ~1M electronic)
     Cataloging staff must keep up with new acquisitions. RSI is a real
     Rarely start from scratch: base on vendor-supplied or community records, or a record for a similar resource
     Specialists covering many languages
     Library Technical Services space in Olin Library, Cornell University
  5. MARC → RDF
     Past work on ontology development, but current focus is on the BIBFRAME model from the Library of Congress (LC), which is still evolving
     Conversion yields ~100 triples from each MARC record
     Cornell: 9M records → ~1 billion triples (cf. WorldCat scale: 440M bib records, 2.7G holdings)
     Community will still rely on centralized services, but this opens the possibility for other models too, and ad-hoc links
     Key entity types in BIBFRAME
  6. Shapes
     cf. Khan, Folsom, et al., poster at US2TS 2018
     Want re-use and hence interested in shared shapes. Mechanics may be a mix of SHACL, ShEx, schema
     Currently no decoupling of validation from forms, a controlled environment
     https://drive.google.com/file/d/1M_xhnG8qYL7M9akvIRSETfOgeSEfS9oh/view
  7. Linking Our Data - Focus on Lookups
     Build UI and infrastructure around discovery of related entities. We know:
     ➔ Evolving community norms: appetite for a variety of linked datasets and associated lookup services; how to link each well and efficiently; sensitivity to inclusive descriptions
     ➔ Complexity in how to search (recall/precision -- relevancy tests)
     ➔ Need context -- labels and types are nowhere near sufficient, what else to display to enable human verification/selection?
     ➔ Multiple sources for same entity type (e.g. person in LC NAF, ISNI, ORCID)
     ➔ If available, hubs likely most efficient
     ➔ Largely untackled: maintenance and updates (traditional authorities have strong policies and practices which have benefit but can be stifling)
  8. Lookup Usability Experiments
     ● Building on VitroLib designs and results
       ○ Context generally useful and navigation to authoritative sources important
     ● Current LD4P2 usability work around Sinopia editor development
       ○ 6 participants across different institutions
       ○ Prototype based on LC BIBFRAME Editor (BFE)
       ○ Contextual information for persons and genre forms
       ○ Links to Wikipedia, ISNI, VIAF where available
       ○ Additional mockups
     Slides from SWIB18 presentation; Folsom, Khan, et al.
  9. A cataloger has a copy of a film "Nowhere Boy" by "Sam Taylor", a British director
  10. A cataloger is trying to add a genre to a record: is "humorous" fiction the right term?
  11. Lookup Usability: Preliminary Results
      ● Contextual information useful
        ○ Should also include related works, more identifying info
        ○ Identify source of information
      ● External sources such as university profiles, genre or type-specific sites (e.g. Discogs)
      ● Vocabularies such as MeSH, AAT, Getty (depending on content)
      ● Links to Wikidata, ISNI, VIAF are useful to include
      ● Need consistent interface experience, use clearer icons
      ● Improve hierarchical navigation for subject areas/genre forms
  12. Work Cycle I
      Data Flow Diagrams and Prototypes, October 2018
      Thanks to Astrid Usong, Stanford
  13. Discogs -- External Source Data as Lookup
      Recall - rarely start from scratch
      Cataloging old 45s at Cornell
      Exploring use of Discogs to generate a base record, directly integrated with the catalog editor tool
  14. [Image-only slide with numbered callouts 1, 2, 3]
  15. Community Scale Experiments & Challenges
      ➔ 15 organizations in LD4P2 cohort + project partners
      ➔ Test editor and lookup infrastructure in a number of cataloging projects
      Caching needed because (most) authority sources don't provide sufficient and stable infrastructure for lookups (also associated validation, cleaning, transformation for non-LD sources)
      Static vs dynamic
      ➔ caching for static but need live query if one expects catalogers to create new entities in "real time" and then be able to see them
      ➔ e.g. Wikidata - try against SPARQL API
  16. Discovery Experiments
      Primary purpose of library knowledge graph is to enable discovery of library resources -- the benefits of linked data are so far unproven
      ➔ Parallels with ideas for lookups and linking
      ➔ Indexing -- already do some light inferencing from MARC into Solr (e.g. broader terms, alternates). What other data inclusion or inference is useful?
      ➔ Individual libraries too small to develop search systems. Considerable effort around a Solr/Ruby system called Blacklight where UI interactions studied/improved together. What is broadly reusable?
      ➔ Most linked data UIs are awful! What good examples might we learn from?
      LD4 Discovery Affinity Group having open biweekly calls
  17. Thanks for listening!
      http://ld4p.org/
      simeon.warner@cornell.edu
      @zimeon
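
The "static vs dynamic" point on slide 15 (cache lookups for stable authority sources, but query live where catalogers expect newly created entities to show up immediately) can be sketched roughly as below. This is only an illustration under stated assumptions, not the LD4P2 lookup infrastructure: the `LookupService` class, the source keys `lc_naf` and `wikidata`, the one-hour expiry, and the stub fetchers are all hypothetical, and no real authority API is called.

```python
import time
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical type: a fetcher takes a query string and returns candidate labels.
Fetcher = Callable[[str], List[str]]


class LookupService:
    """Cache results for static sources; always query dynamic sources live."""

    def __init__(self, fetchers: Dict[str, Fetcher], dynamic: Set[str],
                 ttl_seconds: float = 3600.0):
        self.fetchers = fetchers
        self.dynamic = dynamic      # sources where new entities must be visible at once
        self.ttl = ttl_seconds      # assumed expiry for cached static results
        self._cache: Dict[Tuple[str, str], Tuple[float, List[str]]] = {}

    def lookup(self, source: str, query: str) -> List[str]:
        fetch = self.fetchers[source]
        if source in self.dynamic:
            # Dynamic source: bypass the cache entirely.
            return fetch(query)
        key = (source, query.strip().lower())
        now = time.monotonic()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        results = fetch(query)
        self._cache[key] = (now, results)
        return results


# Stub fetchers standing in for real services; they only count calls.
calls = {"lc_naf": 0, "wikidata": 0}


def fake_lc_naf(query: str) -> List[str]:
    calls["lc_naf"] += 1
    return [f"{query} (LC NAF stub)"]


def fake_wikidata(query: str) -> List[str]:
    calls["wikidata"] += 1
    return [f"{query} (Wikidata stub)"]


service = LookupService(
    {"lc_naf": fake_lc_naf, "wikidata": fake_wikidata},
    dynamic={"wikidata"},
)
for _ in range(3):
    service.lookup("lc_naf", "Avram, Henriette")
    service.lookup("wikidata", "Avram, Henriette")

print(calls)  # {'lc_naf': 1, 'wikidata': 3}
```

Three identical lookups reach the stub for the static source once and the stub for the dynamic source three times. A real deployment would also need the validation, cleaning, and transformation steps the slide mentions for non-LD sources, which this sketch leaves out.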