This document discusses creating a knowledge graph for Irish history as part of the Beyond 2022 project. It will include digitized records from core partners documenting seven centuries of Irish history. Entities like people, places, and organizations will be extracted from source documents and related in a knowledge graph using semantic web technologies. An ontology was created to provide historical context and meaning to the relationships between entities in Irish history. Tools will be developed to explore and search the knowledge graph to advance historical research.
Creating a Virtual Record Treasury of Irish History
1. Beyond 2022: Creating a Knowledge Graph for
Irish History
Lynn Kilgallon, Fabrizio Orlandi
Introduction to Beyond 2022
The Knowledge Graph for Irish History
Beyond 2022 Ontology
Demo & Next Steps
3. Public Record Office of Ireland:
Record Treasury interior
(Mills Album (1915) National Archives of Ireland)
Beyond 2022 - Core Project:
The Virtual Record Treasury of Ireland = open access, online research
platform and trusted digital repository supporting:
• An inventory of Loss and Survival (database) from the 1922 fire
• Digitised image collection from the archives of core partners and
participating institutions
• Searchable content across manuscript and text records of 50+
million words, documenting seven centuries of Irish History
• A knowledge base for Irish history [RDF/XML]
• Digital Text editions fully reconstructing documents from multiple
sources
• A 3-D immersive experience providing next generation access to
the virtual reality/augmented reality model of the Public Record
Office of Ireland and replacement collections.
5. Knowledge Graph for Irish History
Knowledge graph (KG) ~ a set of interconnected entities + their attributes and relationships
Semantic web technologies ~ creating machine-interpretable data to help advance historical research
Open linked data ~ allows us to publish and link our information with information outside the system
6. Knowledge graph ‘entities’
People, places, offices, or organizations which appear in our historical records
9. Simple Example...
Image source: https://www.w3.org/TR/rdf11-primer/
Subject (node) | Predicate (edge) | Object (node)
Bob | is a friend of | Alice
Bob | is interested in | The Mona Lisa
The Mona Lisa | was created by | Leonardo da Vinci
… | … | …
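The statements above can be held as plain (subject, predicate, object) tuples and queried by pattern. The following is a minimal sketch in plain Python, not RDF tooling; the `match` helper is hypothetical and exists only for illustration.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# The example statements, one tuple per edge in the graph.
TRIPLES: List[Triple] = [
    ("Bob", "is a friend of", "Alice"),
    ("Bob", "is interested in", "The Mona Lisa"),
    ("The Mona Lisa", "was created by", "Leonardo da Vinci"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Follow two edges: who created the things Bob is interested in?
creators = [
    creator
    for _, _, work in match(TRIPLES, s="Bob", p="is interested in")
    for _, _, creator in match(TRIPLES, s=work, p="was created by")
]
print(creators)  # ['Leonardo da Vinci']
```

Chaining two pattern matches like this is the same idea a graph query language applies when it joins triple patterns on a shared node.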
11. Knowledge Graphs
A knowledge graph (KG) is a set of interconnected typed
entities and their attributes and relationships. The types
and relations are provided by vocabularies.
We use the Resource Description Framework (RDF),
allowing us to link our information with information outside
the system ~ Linked Data.
The advantages of using open (W3C) Web standards are the
availability of existing tooling and interoperability.
https://github.com/metaphacts/ontodia
13. Authority/source documents
Entities (recorded manually in spreadsheets, or extracted automatically through NLP)
Data ingested into the KG by knowledge engineers. Now we can:
1) Identify and reveal relationships between entities contained in source documents
2) Use a bespoke Beyond 2022 ‘ontology’ to provide historical context and meaning to the relationships between entities in Irish history
14. A Knowledge Graph for Irish History
Beyond 2022 Database Content Ingestion
15. Generating a meaningful Knowledge Graph
Vocabularies: CIDOC-CRM, PROV-O, and the Beyond 2022
ontology.
KG Generation Process:
1. Check spreadsheets (input) for mistakes with CSVW
2. Use R2RML to map spreadsheets to RDF
3. Use SHACL to look for errors in the RDF (output); both
“locally” and “globally”
4. Store the RDF in a triplestore (Blazegraph)
The whole process is driven by W3C Recommendations,
facilitating interoperability and sustainability.
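The four steps can be sketched end to end in plain Python. This only illustrates the flow, using simple stand-ins: a column check in place of CSVW validation, a hand-written mapping function in place of R2RML, a single rule in place of SHACL shapes, and an in-memory set in place of Blazegraph. The column names, the sample rows, the `http://example.org/` IRIs and the `heldOffice` property are all hypothetical; typing a person as CIDOC-CRM's `E21_Person` is likewise an assumption.

```python
import csv
import io

# Hypothetical spreadsheet export; the project's real columns are not known here.
SHEET = """id,name,office
p1,John de Example,treasurer of Ireland
p2,,chancellor
"""

REQUIRED = ("id", "name", "office")
BASE = "http://example.org/"  # placeholder namespace, not a project IRI


def check_rows(text):
    """Step 1 stand-in: split rows into valid ones and error messages."""
    good, errors = [], []
    # Data rows start on line 2 of the sheet, after the header.
    for n, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        missing = [c for c in REQUIRED if not (row.get(c) or "").strip()]
        if missing:
            errors.append(f"line {n}: missing {', '.join(missing)}")
        else:
            good.append(row)
    return good, errors


def map_row(row):
    """Step 2 stand-in: turn one row into (subject, predicate, object) triples."""
    person = BASE + "person/" + row["id"]
    return [
        (person, "rdf:type", "crm:E21_Person"),
        (person, "rdfs:label", row["name"]),
        (person, BASE + "heldOffice", row["office"]),
    ]


def check_graph(triples):
    """Step 3 stand-in: report labelled subjects that have no type."""
    typed = {s for s, p, _ in triples if p == "rdf:type"}
    return [s for s, p, _ in triples if p == "rdfs:label" and s not in typed]


rows, row_errors = check_rows(SHEET)
store = set()  # Step 4 stand-in: an in-memory set instead of a triplestore
for row in rows:
    store.update(map_row(row))
violations = check_graph(store)

print(row_errors)   # ['line 3: missing name']
print(len(store), violations)  # 3 []
```

Checking the input before mapping and the output after mapping catches different problems: the first flags incomplete spreadsheet rows, the second flags triples that do not fit the expected shape of the graph.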
... Office ...
... treasurer of Ireland ...
16. Beyond 2022 Ontology
Contains:
TYPES, CONCEPTS, RELATIONSHIPS, CLASSES
Informs how our data is structured and organized, but also allows for reasoning
When instances (entities) are combined with an ontology, the result is a Knowledge Graph
17. Beyond 2022 Ontology
By creating a comprehensive hierarchical structure, Beyond 2022’s Ontology will allow:
- historians to assign ‘types’/‘classes’ to entities
- the KG to place them within a larger hierarchical structure
- the KG to automatically ‘reason’ what these hierarchies are using a ‘bottom up’ approach (i.e. based on one or more basic ‘types’ assigned to an individual entity)
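The ‘bottom up’ idea can be illustrated with a small subclass hierarchy: one basic type is assigned by hand, and every broader type follows from it. The sketch below is plain Python rather than an RDFS or OWL reasoner, and the class names are hypothetical examples, not terms taken from the Beyond 2022 ontology.

```python
# Hypothetical hierarchy: each class maps to its direct parent classes.
SUBCLASS_OF = {
    "Treasurer": ["FinancialOffice"],
    "FinancialOffice": ["Office"],
    "Office": ["Entity"],
    "Sheriff": ["LocalOffice"],
    "LocalOffice": ["Office"],
}


def inferred_types(assigned):
    """Return the assigned types plus every ancestor reachable from them."""
    seen = set()
    stack = list(assigned)
    while stack:
        cls = stack.pop()
        if cls in seen:
            continue  # already expanded; also guards against cycles
        seen.add(cls)
        stack.extend(SUBCLASS_OF.get(cls, []))
    return seen


print(sorted(inferred_types(["Treasurer"])))
# ['Entity', 'FinancialOffice', 'Office', 'Treasurer']
```

Only the most specific type has to be recorded; the broader ones are derived, so a later change to the hierarchy does not require re-annotating entities.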
22. Linked Data Frontend - LodView
- A Java web application based on Spring and Jena - https://github.com/LodLive/LodView
- W3C standard compliant IRI dereferencing
- On top of a SPARQL endpoint, to publish RDF data according to all defined standards for Linked Open Data
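IRI dereferencing means that an HTTP request to an entity's IRI returns a description of that entity, with the format usually chosen through the `Accept` header. A minimal sketch using Python's standard library follows; the IRI is a placeholder, not a real Beyond 2022 identifier, and the request is only constructed here, not sent.

```python
from urllib.request import Request

# Placeholder IRI for illustration only.
ENTITY_IRI = "http://example.org/person/p1"


def rdf_request(iri, media_type="text/turtle"):
    """Build a GET request asking the server for an RDF serialisation."""
    return Request(iri, headers={"Accept": media_type})


req = rdf_request(ENTITY_IRI)
print(req.get_method(), req.full_url, req.get_header("Accept"))
# To actually fetch the description: urllib.request.urlopen(req).read()
```

Requesting `text/html` instead would be how a browser asks the same IRI for a human-readable page.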
24. Free-text search
- A customisable tool for free-text search over SPARQL endpoints (https://github.com/opencitations/oscar)
- For project team members, to facilitate their work on interlinking and extending the KG
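Free-text search over a SPARQL endpoint generally comes down to a query that filters labels by the search term. The sketch below builds such a query as a string. It is an assumption about the general shape, not OSCAR's actual configuration or query template, and it does not contact any endpoint; both function names are hypothetical.

```python
def escape_literal(text):
    """Escape backslashes, double quotes and newlines for a SPARQL string literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def label_search_query(term, limit=20):
    """Build a case-insensitive substring search over rdfs:label values."""
    needle = escape_literal(term.lower())
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?entity ?label WHERE {\n"
        "  ?entity rdfs:label ?label .\n"
        f'  FILTER(CONTAINS(LCASE(STR(?label)), "{needle}"))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


print(label_search_query("Treasurer of Ireland"))
```

Escaping the term before it is placed in the query keeps a stray quote in the search box from breaking the query or altering it.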
27. T. Kaushik, F. Orlandi, D. Graux: https://github.com/Beyond-2022/Beyond-2022.github.io
28. Thar 2022 Amach: Maoinchiste Annála Shamalta na hÉireann
Beyond 2022: Ireland’s Virtual Record Treasury
Thank you!
https://beyond2022.ie/