SlideShare une entreprise Scribd logo
1  sur  68
Linked Open Data
Ontologies, Datasets, Projects
CLADA-BG Kickoff, 9 Nov 2018
Vladimir Alexiev, PhD, PMP
Outline
o What is LOD
o ONTO Intro
o ONTO Projects
o CH Ontologies
o CH Projects
o CH Datasets
What is LOD
o Semantic Technologies is a bit of a misnomer
o It's about exposing datasets to machines, not giving a "higher meaning" to data
o Ontologies is a bit of a misnomer
o We're not philosophers and ontologies are not very complex (or, few large datasets use
complex ontologies)
o Ontologies are the data schemas of the semantic web, but not the most important (Shapes
and Application Profiles are now widely used)
o LOD is the essence
o Exposing datasets globally, making each entity/data point addressable (URL)
o "Things not strings"
o Linking them
Where did it come from?
o TimBL
proposal,
CERN,
1989: both
Web and
Semantic
Web
o "Vague but
Exciting"
What does LOD know about TimBL?
o TimBL at
Wikidata
Reasonator
o Names in 50
languages
o Description is
auto-generated
o Parents
confirmed 3
times (with
different details
not shown)
What does LOD know about TimBL?
o Depth of
Information
on TimBL
o Links to ~200
authority files
o Info about ~20
awards
o Life Timeline
o etc, etc
Knowledge Graphs
o Semantically Integrated KB
of a domain, e.g.
o Google KG, e.g. "Jaguar company" vs
"Jaguar cat"
o Springer Nature Science Graph
o Thomson Reuters permid Company Graph
o Microsoft Academic Graph
o Dagstuhl Seminar
Knowledge Graphs: New Directions for Knowledge Representation on the Seman
Thomson Reuters permid Company Graph
(e.g. Sirma Group)
Microsoft Academic Graph (e.g. Ontotext)
Wikidata: EHRI Camps and Ghettos
Wikidata: 98k paintings with image
Wikidata: Number of Creative Works and
Cultural Institutions (SQID)
LOD Cloud
Linguistic LOD Cloud
Outline
o What is LOD
o ONTO Intro
o ONTO Projects
o CH Ontologies
o CH Projects
o CH Datasets
ONTO History and Essential Facts
o Semantic Web pioneer
o Started in 2000 as a research lab, spun-off and took VC investment in 2008
o 65 staff: 7 PhD, 30 MS, 20 BS, 6 university lecturers
o Over 400 person-years invested in R&D
o Part of Sirma Group Holding: largest Bulgarian software house
o Public company: BSE:SKK, part of SOFIX
o ONTO is core part of Sirma Strategy 2022 with focus on cognitive computing
o Member of multiple industry bodies
o W3C, EDMC, ODI, LDBC, STI, DBPedia Foundation
ONTO Innovation Awards
o Innovative Enterprise of the Year 2017
o EU Innovation Radar Prize 2016 nomination
o BAIT Business Innovation Award 2014
o Innovative Enterprise of the Year 2014
o Washington Post “Destination Innovation” Competition 2014 Award
o Pythagoras Award 2010 for most successful company in EU FP6 projects
Outline
o ONTO Intro
o ONTO Projects
o CH Ontologies
o CH Projects
o CH Datasets
ONTO Innovation (R&D) Projects
• Innovation and Consulting Unit
• More EU research projects than some BG universities
combined
• Consulting projects for banks, cultural heritage institutions,
government institutions, pharmaceuticals
• Focus: semantic data integration, text extraction
• Vertical domains
• Cultural heritage (Europeana Creative, Food and Drink,
EHRI2)
• Companies (EBG, CIMA), innovation (TRR, InnoRate), real
estate data (PDM), agriculture (BigDataGrapes)
• Media/Publishing (TrendMiner, Multisensor, Evala)
• Fact & rumour checking (Pheme, WeVerify)
• Life Science (LarKC, KHRESMOI, KConnect)
Great Variety of Application Domains
EHRI2 European Holocaust Research Infrastructure: transform archival research on the
Holocaust
Evala Cognitive And Semantic Links Analysis and Media Evaluation Platform
euBusinessGraph Enabling the European Business Graph for Innovative Data Products and Services
COMPACT From Research to Policy through Raising Awareness of the State of The Art on Social
Media and Convergence
BigDataGrapes Big Data to Enable Global Disruption of the Grapevine-powered Industries
CIMA Intelligent Matching and Linking of Company Data
Cleopatra Maria Sklodowska-Curie Action: Initial Training Network: Cross-lingual Event-centric
Open Analytics Research Academy
TRR Tracking of FP7 Research Results
WeVerify Factchecking against false news
ExaMode EXtreme-scale Analytics via Multimodal Ontology Discovery & Enhancement
InnoRate Data-driven tools for supporting and improving the decision-making processes of
investors for financing innovative SMEs
ONTO Cultural Heritage Projects
ONTO Clients (selection)
↗ User-friendly DB admin and querying
• GraphDB Workbench
↗ REST API for database access
↗ Plugin / Connectors
• GraphDB Engine
GraphDB Semantic Database
OntoRefine: Uplift Tabular Data to LOD
o Easily clean and
import tabular data
o View as RDF in real-time
with virtual SPARQL
endpoint
o Transform using JS & SPIN
o Import newly created RDF
directly to GraphDB
o Usage
o Financial data
o Agricultural data
o CH data, etc
Outline
o ONTO Intro, Products, Clients
o ONTO Projects
o CH Ontologies
o CH Projects
o CH Datasets
o W3C OA Specifications
o Web Annotation Data Model: description
of ontology, use cases and combinations
o Web Annotation Protocol: defines the
interaction between annotation servers
and annotation clients
o Selectors and States: how to select part of
a resource (e.g. section of HTML
document, rectangle from a PNG image,
structural part of a SVG image, page of a
PDF) or specify a particular version of a
resource as it existed at a certain time.
o Embedding Web Annotations in HTML.
Web Annotation o Implementations
o Annotorious image and text annotator by
Austrian Institute of Technology, developed as part of
EuropeanaConnect
o Lorestore server and Annotator OA client by
University of Queensland, Australia
o OACVideoAnnotator by UMD MITH and
Alexander Street Press
o LombardPress annotator of ancient manuscripts that
works over canonic text representations in the
Scholastic Commentaries and Texts Archive
o Annotopia by MIND Informatics group, Massachusetts
General Hospital
o Hypothes.is, largest OA project and development
community. Implements the core AnnotatorJS project. A
number of tools, plug-ins and integrations are available,
including Drupal, WordPress and Omeka integrations.
Omeka is a popular light-weight CMS and virtual
exhibition system
o MangoServer
o Wellcome Quilt, funded by the Wellcome Trust
o Europeana Annotation Server
o Mirador client, a well-known IIIF viewer
o etc etc
OA Example: Bookmarking and Semantic Tagging (Life Science)
OA Example: Annotating SVG Part of Image (ResearchSpace)
o iiif.io Specifications
o Image: semantic description of images
(available resolutions, features, credit line,
conformance level, etc) and serving features
(zooming, gray-scaling, cropping, rotation,
etc)
o Presentation (Shared Canvas): laying
images side by side, assembling folios and
books (using so-called IIIF Manifests), image
annotation. Very popular for virtual
reconstruction of manuscripts, book
viewers, etc
o Authentication: modes or interaction
patterns for getting access to protected
resources (e.g. Login, Click-through, Kiosk,
External authentication)
o Search: search of full-text embedded or
related to image resources (e.g. OCRed or
manually annotated text of some old book)
International Image Interoperability Framework (IIIF)
o IIIF Client Implementations
o Diva.js, especially suited for use in archival book digitization initiatives
o IIPMooViewer, for image streaming and zooming
o Mirador, implementing a workspace that enables comparison of multiple
images from multiple repositories, widely used for manuscripts
o OpenSeadragon, enabling smooth deep zoom and pan
o Leaflet-IIIF, a plugin for the Leaflet framework that also includes display of
geographic maps
o Universal Viewer, widely used by CH institutions
o IIIF Server Implementations
o Cantaloupe, enabling on-demand generation of image derivatives
o IIPImage Server, fast C++ server also used for scientific imagery such as
multispectral or hyperspectral images
o Loris, a server written in Python
o ContentDM, a full-featured digital collection management (DAM) system
o Djatoka, a Java-based image server
o Digilib, another Java-based image server
IIIF Example: Mirador at Biblissima (French manuscript library)
IIIF Example: Search IIIF Images on Europeana
o CIDOC CRM
o Pros: strong foundational ontology, used by
numerous projects especially in Europe.
o Cons: many consider it complicated, some
shortcomings for describing relations
between people and between objects, not
friendly for integrating with other
ontologies, the community (SIG) is slow to
adopt practically important issues, few
application profiles for specific kinds of
objects (e.g. coins vs paintings).
o linked.art
o Pros: a simplified CRM profile created under
the moniker "Linked Open Usable Data
(LOUD)", more developer friendly through
an emphasis on JSONLD, used by some
projects especially in the US.
o Cons: various simplifications that are not
vetted by the CRM SIG, rift with European
CRM developments.
Most Relevant Museum Ontologies
o Schema.org
o Pros: supported by the major search engines thus
ensures semantic SEO and findability, used by the
largest amount of LOD (on billions of websites),
pragmatic and collaborative process for data
modeling with a lot of examples, possible
extensions as exemplified by bibliographic
(SchemaBibEx) and archival extension.
o Cons: not yet proven it is sufficient to represent
rich museum data
o Wikidata
o Pros: universal platform for data integration,
richer model than RDF (but also exposed as RDF),
pragmatic and versatile collaborative process for
data modeling (property creation) with a lot of
examples and justifications, used by some GLAMs
and crowd-sourced projects (e.g. Authority
Control, Sum of All Paintings, Wiki Loves
Monuments).
o Cons: institutional endorsement is not yet strong
enough, concerns of institutions how they can be
masters of "their own" data.
CIDOC CRM
o Conceptual Reference Model (CRM)
o By ICOM, International Committee for Documentation (CIDOC), CRM SIG
o In development for 17 years (since 1999)
o Standardized as ISO 21127:2006 in 2006, continues to evolve
o Current version: CRM 6.2.1 (Oct 2015), version in progress CRM 6.2.3 (May 2018).
o Foundational ontology for history, archeology and art.
o About 85 classes
o About 285 properties (140 object properties and their inverses, and a few that don’t have
inverses)
CRM Graphical Representation : Index
CRM Classes
CRM Graphical: Mark and Inscription Information (part 1)
Apply CRM: Model Coins
o E22_Man-Made_Object
o standardized P2_has_type (e.g. Coin from AAT or more specific from Nomisma)
o P56_bears_feature E25_Man-Made_Feature
o P43_has_dimension E54_Dimension with P2_has_type (e.g. die axis), P91_has_unit (e.g. "o'clock"), P90_has_value
o E25_Man-Made_Feature
o standardized P2_has_type: Obverse or Reverse
o P65_carries_visual_item E38_Image (e.g. of a ruler) and/or E34_Inscription (text)
o E38_Image
o P138_represents (e.g. some ruler from ULAN, or e.g. "laurel wreath" from AAT)
o E34_Inscription
o P3_has_note "the text"
o and P72_has_language (e.g. Latin from AAT)
o optionally P73_has_translation to another Linguistic Object
CRM Time Spans
CRM property Meaning Latin phrase Meaning
P82a_begin_of_the_begin started after this moment terminus post quem limit after which
P81a_end_of_the_begin started before this moment terminus a quo limit from which
P81b_begin_of_the_end finished after this moment terminus ad quem limit to which
P82b_end_of_the_end finished before this moment terminus ante quem limit before which
CRM Extensions
o FRBRoo: bibliographic information following FRBR principles (Work-Expression-
Manifestation-Item), artistic performances and their recordings
o PRESoo: periodic publications
o DoReMus: music and performances
o CRMdig: digitization processes and provenance metadata
o CRMinf: statements, argumentation, beliefs
o CRMsci: scientific observations
o CRMgeo: spatiotemporal modeling by integrating CRM to GeoSPARQL
o Parthenos Entities: research objects, software, datasets
o CRMeh (English Heritage): archeology
o CRMarchaeo: archeology, excavation, stratigraphy
o CRMba: buildings
o CRMx: proposed extension for museum objects, including simple properties such as
main depiction of an object, preferred title, extent, etc
Outline
o ONTO Intro, Products, Clients
o ONTO Projects
o CH Ontologies
o CH Projects, Datasets
o Started in 2008
o Has aggregated 53M objects at present
o Perhaps 50-70 Europeana-related projects
o Currently supported by Connecting Europe Facility
as a Digital Service Infrastructure
o Uses Europeana Data Model (EDM), an RDF
ontology
o General search and display mechanism
o The search is not semantic (e.g. won't catch
different multilingual names, unless they are
included in enriched object data)
o A set of fixed facets (including image
characteristics).
Europeana
o Europe (and beyond) GLAM Networking
o Foundation: does the work, ~50 staff
o Association: elections, 2066 members, 75
countries, 19 from BG
o Members Council (36, growing to 50): sets
strategy
o Task Forces: tech guidelines, temporary
o Work Groups: tech guidelines, more
permanent
o Data Quality Council: reflects new strategy
Europeana: Search Paintings of Cupid
Europeana Collections: a "Personal Face"
Europeana Food and Drink: ONTO sem app
Europeana Labs: Galleries of Apps and Datasets
Europeana Data Access: API, OAI PMH, SPARQL
EDM: Typical Graph (from ONTO SPARQL
endpoint)
o Project
o Started 2009, ongoing
o Funded by Mellon Foundation
o Led by the British Museum
o Followed by Yale Center for British Art
(YCBA) and Smithsonian American Art
Museum (SAAM)
o Initial implementation: ONTO, System
Simulation
o Current implementation: Metaphacts
o Additional Involvement: FORTH, Delving
British Museum ResearchSpace
o VRE for Art Research
o CIDOC CRM representation
o Powerful semantic search, saved searches
o Image annotation
o Data basket
o Argumentation
o Intends to be a generic art research
system that can be adapted for various
needs and projects
ResearchSpace: Map British Museum Data to CIDOC CRM
Fundamental Relation "Thing From Place"
CRM Semantic Search
CRM Search: Hierarchical Query Expansion
ConservationSpace: Core System for
Conservationistso Project
o Mellon Funding
o Led by US National Gallery of Art
o Implemented by Sirma Enterprise
o Supported by ONTO
o Uses GraphDB
o Based on Sirma Enterprise Platform
o Sirma MuseumSpace: curation/collection management,
exhibition and loan management, conservation management...
o Semantic integration, enrichment and publication of CH data
o Digital Asset Management
o Thesaurus Management
o Paper-less office (Sirma GO Digital)
o Contract management
o ISO 9001 QMS document management
o Project
o 2-year project (Oct 2015-Nov 2017)
o Mellon funded
o 14 US museums and galleries
o Publish their data to RDF
o ONTO consulted on semantic mapping and data publishing
o Worked alongside two Getty staff (semantic architect and
data architect)
o Publications
o Lessons Learned in Building Linked Data for the American
Art Collaborative, C.Knoblock et al, ISWC 2017: project
challenges, volumetrics and semantic conversion
experience
o American Art Collaborative (AAC) Linked Open Data (LOD)
Initiative: Overview and Recommendations for Good
Practices. E. Fink, 2018
American Art Collaborative
o Achievements
o Aggregated artwork data from 14 institutions: 233,666
Objects, 28,882 Artists and 20,446 other agents (Related
Parties)
o Made about 15M triples. (For comparison, the British Museum
semantic data comprises 2.5M objects and 960M triples.)
o Used a harmonized data model so the data can be shown
together.
o Harmonized not only data models but also value sets to AAT
o Linked per-institution artists to ULAN
o Raised LOD awareness with the target institutions and a wider
audience and mobilized inter-institutional collaboration.
o Some of the institutions took charge of their transformations
to establish a sustainable LOD publication process.
o Created excellent use cases and UI mockups for browsing and
exploration, e.g. comparing artists by style, material and
genres; artwork timelines, etc.
AAC Target Mapping: Actor Gender
AAC Artwork View
AAC Partner Statistics
o Created as a post-product of AAC
o Application profile for CRM i.e. a particular way of
using CRM.
o Created out of frustration with the complications
of applying CRM, promoted under the moniker
Linked Open Usable Data (LOUD).
o Uses CIDOC-CRM as the core ontology, giving an
event-based paradigm
o Uses the Getty Vocabularies as core sources of
identity, i.e. specific object types (e.g. painting),
activity types (e.g. book binding, gilding, etching),
title types (e.g. artists vs repository title), etc
o JSON-LD as primary RDF serialization. Being JSON,
it is more developer-friendly than other
serializations.
linked.art
o Large number of examples (model
components). Count per area:
o 42 activity,
o 1 concept,
o 2 group,
o 2 identifier,
o 2 legal,
o 1 name,
o 46 object,
o 12 person,
o 6 place,
o 7 set,
o 11 text,
o 2 value
linked.art Representation of Traveling Exhibition: rdfpuml Diagram
linked.art Traveling Exhibition: JSON-LD (left) vs Turtle (right)
Getty Vocabulary Program LOD
GVP LOD Project
o Timeline
o Art and Architecture Thesaurus (AAT): 2014-02
o Thesaurus of Geographic Names (TGN): 2014-08
o Union List of Artist Names (ULAN): 2015-03
o ONTO Services
o Semantic/ontology development
o Contributed to the ISO 25964 ontology (latest standard on thesauri), provided implementation experience, suggestions and fixes.
o Published on varieties of Broader relations (BTG, BTP, BTI)
o Complete mapping specification, comprehensive documentation
o Helped implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML
extension (rrx:languageColumn)
o GraphDB semantic repository, clustered for high-availability
o Semantic application development (customized user interface), technical consulting
o SPARQL 1.1 compliant endpoint, sample queries
o Per-entity export files, explicit/total data dumps
o Semantic dataset description (VOID)
o Help desk / support on twitter and google group (continuing)
o GVP LOD is widely regarded as a good example to be followed by GLAMs
Jan
GVP LOD Ontology
GVP Specialized Hierarchical Relation Inference
GVP Ordered Guide Term, represented as iso:ThesaurusArray
GVP Documentation: Contents
GVP Sample Queries UI
If you have any questions or suggestions,
please email Vladimir.Alexiev@Ontotext.com
Thank you
for your attention!

Contenu connexe

Similaire à Linked Open Data and Ontotext Projects

Museum Linked Open Data: Ontologies, Datasets, Projects
Museum Linked Open Data: Ontologies, Datasets, Projects Museum Linked Open Data: Ontologies, Datasets, Projects
Museum Linked Open Data: Ontologies, Datasets, Projects Vladimir Alexiev, PhD, PMP
 
Museum LOD (Ontotext, 1 May 2019, Doha, Qatar)
Museum LOD (Ontotext, 1 May 2019, Doha, Qatar)Museum LOD (Ontotext, 1 May 2019, Doha, Qatar)
Museum LOD (Ontotext, 1 May 2019, Doha, Qatar)Vladimir Alexiev, PhD, PMP
 
The Europeana Strategy and Linked Open Data
The Europeana Strategy and Linked Open DataThe Europeana Strategy and Linked Open Data
The Europeana Strategy and Linked Open DataDavid Haskiya
 
What do we want computers to do for us?
What do we want computers to do for us? What do we want computers to do for us?
What do we want computers to do for us? Andrea Volpini
 
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...The Research Council of Norway, IKTPLUSS
 
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, SwedenSem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, SwedenVladimir Alexiev, PhD, PMP
 
The Dendro research data management platform: Applying ontologies to long-ter...
The Dendro research data management platform: Applying ontologies to long-ter...The Dendro research data management platform: Applying ontologies to long-ter...
The Dendro research data management platform: Applying ontologies to long-ter...João Rocha da Silva
 
Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...
Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...
Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...Peter Löwe
 
Doing Clever Things with the Semantic Web
Doing Clever Things with the Semantic WebDoing Clever Things with the Semantic Web
Doing Clever Things with the Semantic WebMathieu d'Aquin
 
Analytics on Big Knowledge Graphs Deliver Entity Awareness and Help Data Linking
Analytics on Big Knowledge Graphs Deliver Entity Awareness and Help Data LinkingAnalytics on Big Knowledge Graphs Deliver Entity Awareness and Help Data Linking
Analytics on Big Knowledge Graphs Deliver Entity Awareness and Help Data LinkingOntotext
 
Publishing Linked Open Data on the Web & the Role of Ontologies
Publishing Linked Open Data on the Web & the Role of OntologiesPublishing Linked Open Data on the Web & the Role of Ontologies
Publishing Linked Open Data on the Web & the Role of OntologiesMaría Poveda Villalón
 
OER World Map Project
OER World Map Project OER World Map Project
OER World Map Project Robert Farrow
 
Benefiting from Semantic AI along the data life cycle
Benefiting from Semantic AI along the data life cycleBenefiting from Semantic AI along the data life cycle
Benefiting from Semantic AI along the data life cycleMartin Kaltenböck
 
Linked Data for Biopharma
Linked Data for BiopharmaLinked Data for Biopharma
Linked Data for BiopharmaTom Plasterer
 
Information Extraction and Linked Data Cloud
Information Extraction and Linked Data CloudInformation Extraction and Linked Data Cloud
Information Extraction and Linked Data CloudDhaval Thakker
 
Why SKOS should be a Focal Point of your Linked Data Strategy
Why SKOS should be a Focal Point of your Linked Data StrategyWhy SKOS should be a Focal Point of your Linked Data Strategy
Why SKOS should be a Focal Point of your Linked Data StrategySemantic Web Company
 
A novel programmable attenuator based low Gm-OTA for biomedical applications
A novel programmable attenuator based low Gm-OTA for biomedical applicationsA novel programmable attenuator based low Gm-OTA for biomedical applications
A novel programmable attenuator based low Gm-OTA for biomedical applicationsHoopeer Hoopeer
 

Similaire à Linked Open Data and Ontotext Projects (20)

Museum Linked Open Data: Ontologies, Datasets, Projects
Museum Linked Open Data: Ontologies, Datasets, Projects Museum Linked Open Data: Ontologies, Datasets, Projects
Museum Linked Open Data: Ontologies, Datasets, Projects
 
Museum LOD (Ontotext, 1 May 2019, Doha, Qatar)
Museum LOD (Ontotext, 1 May 2019, Doha, Qatar)Museum LOD (Ontotext, 1 May 2019, Doha, Qatar)
Museum LOD (Ontotext, 1 May 2019, Doha, Qatar)
 
The Europeana Strategy and Linked Open Data
The Europeana Strategy and Linked Open DataThe Europeana Strategy and Linked Open Data
The Europeana Strategy and Linked Open Data
 
What do we want computers to do for us?
What do we want computers to do for us? What do we want computers to do for us?
What do we want computers to do for us?
 
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
 
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, SwedenSem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
 
The Dendro research data management platform: Applying ontologies to long-ter...
The Dendro research data management platform: Applying ontologies to long-ter...The Dendro research data management platform: Applying ontologies to long-ter...
The Dendro research data management platform: Applying ontologies to long-ter...
 
Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...
Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...
Libraries in the Big Data Era: Strategies and Challenges in Archiving and Sha...
 
Doing Clever Things with the Semantic Web
Doing Clever Things with the Semantic WebDoing Clever Things with the Semantic Web
Doing Clever Things with the Semantic Web
 
Analytics on Big Knowledge Graphs Deliver Entity Awareness and Help Data Linking
Analytics on Big Knowledge Graphs Deliver Entity Awareness and Help Data LinkingAnalytics on Big Knowledge Graphs Deliver Entity Awareness and Help Data Linking
Analytics on Big Knowledge Graphs Deliver Entity Awareness and Help Data Linking
 
Ee bdm ws-v1
Ee bdm ws-v1Ee bdm ws-v1
Ee bdm ws-v1
 
Publishing Linked Open Data on the Web & the Role of Ontologies
Publishing Linked Open Data on the Web & the Role of OntologiesPublishing Linked Open Data on the Web & the Role of Ontologies
Publishing Linked Open Data on the Web & the Role of Ontologies
 
OER World Map Project
OER World Map Project OER World Map Project
OER World Map Project
 
Semantics and Machine Learning
Semantics and Machine LearningSemantics and Machine Learning
Semantics and Machine Learning
 
Lod2
Lod2Lod2
Lod2
 
Benefiting from Semantic AI along the data life cycle
Benefiting from Semantic AI along the data life cycleBenefiting from Semantic AI along the data life cycle
Benefiting from Semantic AI along the data life cycle
 
Linked Data for Biopharma
Linked Data for BiopharmaLinked Data for Biopharma
Linked Data for Biopharma
 
Information Extraction and Linked Data Cloud
Information Extraction and Linked Data CloudInformation Extraction and Linked Data Cloud
Information Extraction and Linked Data Cloud
 
Why SKOS should be a Focal Point of your Linked Data Strategy
Why SKOS should be a Focal Point of your Linked Data StrategyWhy SKOS should be a Focal Point of your Linked Data Strategy
Why SKOS should be a Focal Point of your Linked Data Strategy
 
A novel programmable attenuator based low Gm-OTA for biomedical applications
A novel programmable attenuator based low Gm-OTA for biomedical applicationsA novel programmable attenuator based low Gm-OTA for biomedical applications
A novel programmable attenuator based low Gm-OTA for biomedical applications
 

Plus de Vladimir Alexiev, PhD, PMP

Ontotext Cultural Heritage and Digital Humanities Projects
Ontotext Cultural Heritage and Digital Humanities ProjectsOntotext Cultural Heritage and Digital Humanities Projects
Ontotext Cultural Heritage and Digital Humanities ProjectsVladimir Alexiev, PhD, PMP
 
Semantic Archive Integration for Holocaust Research: the EHRI Research Infras...
Semantic Archive Integration for Holocaust Research: the EHRI Research Infras...Semantic Archive Integration for Holocaust Research: the EHRI Research Infras...
Semantic Archive Integration for Holocaust Research: the EHRI Research Infras...Vladimir Alexiev, PhD, PMP
 
Wikidata, a target for Europeana’s semantic strategy (Glam-Wiki 2015)
Wikidata, a target for Europeana’s semantic strategy (Glam-Wiki 2015)Wikidata, a target for Europeana’s semantic strategy (Glam-Wiki 2015)
Wikidata, a target for Europeana’s semantic strategy (Glam-Wiki 2015)Vladimir Alexiev, PhD, PMP
 
Europeana Food and Drink Classification Scheme
Europeana Food and Drink Classification SchemeEuropeana Food and Drink Classification Scheme
Europeana Food and Drink Classification SchemeVladimir Alexiev, PhD, PMP
 
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationGetty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationVladimir Alexiev, PhD, PMP
 
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationGetty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationVladimir Alexiev, PhD, PMP
 
Europeana Creative. EDM Endpoint. Custom Views
Europeana Creative. EDM Endpoint. Custom ViewsEuropeana Creative. EDM Endpoint. Custom Views
Europeana Creative. EDM Endpoint. Custom ViewsVladimir Alexiev, PhD, PMP
 
Europeana Creative. EDM Endpoint. Custom Views
Europeana Creative. EDM Endpoint. Custom ViewsEuropeana Creative. EDM Endpoint. Custom Views
Europeana Creative. EDM Endpoint. Custom ViewsVladimir Alexiev, PhD, PMP
 
Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...
 Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ... Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...
Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...Vladimir Alexiev, PhD, PMP
 
Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...
 Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ... Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...
Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...Vladimir Alexiev, PhD, PMP
 
RDF Data and Image Annotations in ResearchSpace (slides)
RDF Data and Image Annotations in ResearchSpace (slides)RDF Data and Image Annotations in ResearchSpace (slides)
RDF Data and Image Annotations in ResearchSpace (slides)Vladimir Alexiev, PhD, PMP
 
RDF Data and Image Annotations in ResearchSpace (paper)
RDF Data and Image Annotations in ResearchSpace (paper)RDF Data and Image Annotations in ResearchSpace (paper)
RDF Data and Image Annotations in ResearchSpace (paper)Vladimir Alexiev, PhD, PMP
 
Ontotext short presentation at LODLAM Summit 2013
Ontotext short presentation at LODLAM Summit 2013Ontotext short presentation at LODLAM Summit 2013
Ontotext short presentation at LODLAM Summit 2013Vladimir Alexiev, PhD, PMP
 

Plus de Vladimir Alexiev, PhD, PMP (20)

Ontotext Cultural Heritage and Digital Humanities Projects
Ontotext Cultural Heritage and Digital Humanities ProjectsOntotext Cultural Heritage and Digital Humanities Projects
Ontotext Cultural Heritage and Digital Humanities Projects
 
euBusinessGraph Company and Economic Data
euBusinessGraph Company and Economic DataeuBusinessGraph Company and Economic Data
euBusinessGraph Company and Economic Data
 
Semantic Archive Integration for Holocaust Research: the EHRI Research Infras...
Semantic Archive Integration for Holocaust Research: the EHRI Research Infras...Semantic Archive Integration for Holocaust Research: the EHRI Research Infras...
Semantic Archive Integration for Holocaust Research: the EHRI Research Infras...
 
GLAMs working with Wikidata
GLAMs working with WikidataGLAMs working with Wikidata
GLAMs working with Wikidata
 
Wikidata, a target for Europeana’s semantic strategy (Glam-Wiki 2015)
Wikidata, a target for Europeana’s semantic strategy (Glam-Wiki 2015)Wikidata, a target for Europeana’s semantic strategy (Glam-Wiki 2015)
Wikidata, a target for Europeana’s semantic strategy (Glam-Wiki 2015)
 
Europeana Food and Drink Classification Scheme
Europeana Food and Drink Classification SchemeEuropeana Food and Drink Classification Scheme
Europeana Food and Drink Classification Scheme
 
Adding a DBpedia Mapping
Adding a DBpedia MappingAdding a DBpedia Mapping
Adding a DBpedia Mapping
 
DBpedia Ontology and Mapping Problems
DBpedia Ontology and Mapping ProblemsDBpedia Ontology and Mapping Problems
DBpedia Ontology and Mapping Problems
 
20140521 sem-tech-biz-guest-lecture
20140521 sem-tech-biz-guest-lecture20140521 sem-tech-biz-guest-lecture
20140521 sem-tech-biz-guest-lecture
 
Semantic Technology in Publishing & Finance
Semantic Technology in Publishing & FinanceSemantic Technology in Publishing & Finance
Semantic Technology in Publishing & Finance
 
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationGetty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
 
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic RepresentationGetty Vocabulary Program LOD: Ontologies and Semantic Representation
Getty Vocabulary Program LOD: Ontologies and Semantic Representation
 
Semantic technologies for cultural heritage
Semantic technologies for cultural heritageSemantic technologies for cultural heritage
Semantic technologies for cultural heritage
 
Europeana Creative. EDM Endpoint. Custom Views
Europeana Creative. EDM Endpoint. Custom ViewsEuropeana Creative. EDM Endpoint. Custom Views
Europeana Creative. EDM Endpoint. Custom Views
 
Europeana Creative. EDM Endpoint. Custom Views
Europeana Creative. EDM Endpoint. Custom ViewsEuropeana Creative. EDM Endpoint. Custom Views
Europeana Creative. EDM Endpoint. Custom Views
 
Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...
 Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ... Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...
Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...
 
Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...
 Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ... Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...
Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...
 
RDF Data and Image Annotations in ResearchSpace (slides)
RDF Data and Image Annotations in ResearchSpace (slides)RDF Data and Image Annotations in ResearchSpace (slides)
RDF Data and Image Annotations in ResearchSpace (slides)
 
RDF Data and Image Annotations in ResearchSpace (paper)
RDF Data and Image Annotations in ResearchSpace (paper)RDF Data and Image Annotations in ResearchSpace (paper)
RDF Data and Image Annotations in ResearchSpace (paper)
 
Ontotext short presentation at LODLAM Summit 2013
Ontotext short presentation at LODLAM Summit 2013Ontotext short presentation at LODLAM Summit 2013
Ontotext short presentation at LODLAM Summit 2013
 

Dernier

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 

Dernier (20)

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

Linked Open Data and Ontotext Projects

  • 1. Linked Open Data Ontologies, Datasets, Projects CLADA-BG Kickoff, 9 Nov 2018 Vladimir Alexiev, PhD, PMP
  • 2. Outline o What is LOD o ONTO Intro o ONTO Projects o CH Ontologies o CH Projects o CH Datasets
  • 3. What is LOD o Semantic Technologies is a bit of a misnomer o It's about exposing datasets to machines, not giving a "higher meaning" to data o Ontologies is a bit of a misnomer o We're not philosophers and ontologies are not very complex (or, few large datasets use complex ontologies) o Ontologies are the data schemas of the semantic web, but not the most important (Shapes and Application Profiles are now widely used) o LOD is the essence o Exposing datasets globally, making each entity/data point addressable (URL) o "Things not strings" o Linking them
  • 4. Where did it come from? o TimBL proposal, CERN, 1989: both Web and Semantic Web o "Vague but Exciting"
  • 5. What does LOD know about TimBL? o TimBL at Wikidata Reasonator o Names in 50 languages o Description is auto-generated o Parents confirmed 3 times (with different details not shown)
  • 6. What does LOD know about TimBL? o Depth of Information on TimBL o Links to ~200 authority files o Info about ~20 awards o Life Timeline o etc, etc
  • 7. Knowledge Graphs o Semantically Integrated KB of a domain, e.g. o Google KG, e.g. "Jaguar company" vs "Jaguar cat" o Springer Nature Science Graph o Thomson Reuters permid Company Graph o Microsoft Academic Graph o Dagstuhl Seminar Knowledge Graphs: New Directions for Knowledge Representation on the Seman
  • 8. Thomson Reuters permid Company Graph (e.g. Sirma Group)
  • 9. Microsoft Academic Graph (e.g. Ontotext)
  • 10. Wikidata: EHRI Camps and Ghettos
  • 12. Wikidata: Number of Creative Works and Cultural Institutions (SQID)
  • 15. Outline o What is LOD o ONTO Intro o ONTO Projects o CH Ontologies o CH Projects o CH Datasets
  • 16. ONTO History and Essential Facts o Semantic Web pioneer o Started in 2000 as a research lab, spun-off and took VC investment in 2008 o 65 staff: 7 PhD, 30 MS, 20 BS, 6 university lecturers o Over 400 person-years invested in R&D o Part of Sirma Group Holding: largest Bulgarian software house o Public company: BSE:SKK, part of SOFIX o ONTO is core part of Sirma Strategy 2022 with focus on cognitive computing o Member of multiple industry bodies o W3C, EDMC, ODI, LDBC, STI, DBPedia Foundation
  • 17. ONTO Innovation Awards o Innovative Enterprise of the Year 2017 o EU Innovation Radar Prize 2016 nomination o BAIT Business Innovation Award 2014 o Innovative Enterprise of the Year 2014 o Washington Post “Destination Innovation” Competition 2014 Award o Pythagoras Award 2010 for most successful company in EU FP6 projects
  • 18. Outline o ONTO Intro o ONTO Projects o CH Ontologies o CH Projects o CH Datasets
  • 19. ONTO Innovation (R&D) Projects • Innovation and Consulting Unit • More EU research projects than some BG universities combined • Consulting projects for banks, cultural heritage institutions, government institutions, pharmaceuticals • Focus: semantic data integration, text extraction • Vertical domains • Cultural heritage (Europeana Creative, Food and Drink, EHRI2) • Companies (EBG, CIMA), innovation (TRR, InnoRate), real estate data (PDM), agriculture (BigDataGrapes) • Media/Publishing (TrendMiner, Multisensor, Evala) • Fact & rumour checking (Pheme, WeVerify) • Life Science (LarKC, KHRESMOI, KConnect)
  • 20. Great Variety of Application Domains EHRI2 European Holocaust Research Infrastructure: transform archival research on the Holocaust Evala Cognitive And Semantic Links Analysis and Media Evaluation Platform euBusinessGraph Enabling the European Business Graph for Innovative Data Products and Services COMPACT From Research to Policy through Raising Awareness of the State of The Art on Social Media and Convergence BigDataGrapes Big Data to Enable Global Disruption of the Grapevine-powered Industries CIMA Intelligent Matching and Linking of Company Data Cleopatra Maria Sklodowska-Curie Action: Initial Training Network: Cross-lingual Event-centric Open Analytics Research Academy TRR Tracking of FP7 Research Results WeVerify Factchecking against false news ExaMode EXtreme-scale Analytics via Multimodal Ontology Discovery & Enhancement InnoRate Data-driven tools for supporting and improving the decision-making processes of investors for financing innovative SMEs
  • 22.
  • 24. ↗ User-friendly DB admin and querying • GraphDB Workbench ↗ REST API for database access ↗ Plugin / Connectors • GraphDB Engine GraphDB Semantic Database
  • 25. OntoRefine: Uplift Tabular Data to LOD o Easily clean and import tabular data o View as RDF in real-time with virtual SPARQL endpoint o Transform using JS & SPIN o Import newly created RDF directly to GraphDB o Usage o Financial data o Agricultural data o CH data, etc
  • 26. Outline o ONTO Intro, Products, Clients o ONTO Projects o CH Ontologies o CH Projects o CH Datasets
  • 27. o W3C OA Specifications o Web Annotation Data Model: description of ontology, use cases and combinations o Web Annotation Protocol: defines the interaction between annotation servers and annotation clients o Selectors and States: how to select part of a resource (e.g. section of HTML document, rectangle from a PNG image, structural part of a SVG image, page of a PDF) or specify a particular version of a resource as it existed at a certain time. o Embedding Web Annotations in HTML. Web Annotation o Implementations o Annotorious image and text annotator by Austrian Institute of Technology, developed as part of EuropeanaConnect o Lorestore server and Annotator OA client by University of Queensland, Australia o OACVideoAnnotator by UMD MITH and Alexander Street Press o LombardPress annotator of ancient manuscripts that works over canonic text representations in the Scholastic Commentaries and Texts Archive o Annotopia by MIND Informatics group, Massachusetts General Hospital o Hypothes.is, largest OA project and development community. Implements the core AnnotatorJS project. A number of tools, plug-ins and integrations are available, including Drupal, WordPress and Omeka integrations. Omeka is a popular light-weight CMS and virtual exhibition system o MangoServer o Wellcome Quilt, funded by the Wellcome Trust o Europeana Annotation Server o Mirador client, a well-known IIIF viewer o etc etc
  • 28. OA Example: Bookmarking and Semantic Tagging (Life Science)
  • 29. OA Example: Annotating SVG Part of Image (ResearchSpace)
  • 30. o iiif.io Specifications o Image: semantic description of images (available resolutions, features, credit line, conformance level, etc) and serving features (zooming, gray-scaling, cropping, rotation, etc) o Presentation (Shared Canvas): laying images side by side, assembling folios and books (using so-called IIIF Manifests), image annotation. Very popular for virtual reconstruction of manuscripts, book viewers, etc o Authentication: modes or interaction patterns for getting access to protected resources (e.g. Login, Click-through, Kiosk, External authentication) o Search: search of full-text embedded or related to image resources (e.g. OCRed or manually annotated text of some old book) International Image Interoperability Framework (IIIF) o IIIF Client Implementations o Diva.js, especially suited for use in archival book digitization initiatives o IIPMooViewer, for image streaming and zooming o Mirador, implementing a workspace that enables comparison of multiple images from multiple repositories, widely used for manuscripts o OpenSeadragon, enabling smooth deep zoom and pan o Leaflet-IIIF, a plugin for the Leaflet framework that also includes display of geographic maps o Universal Viewer, widely used by CH institutions o IIIF Server Implementations o Cantaloupe, enabling on-demand generation of image derivatives o IIPImage Server, fast C++ server also used for scientific imagery such as multispectral or hyperspectral images o Loris, a server written in Python o ContentDM, a full-featured digital collection management (DAM) system o Djatoka, a Java-based image server o Digilib, another Java-based image server
  • 31. IIIF Example: Mirador at Biblissima (French manuscript library)
  • 32. IIIF Example: Search IIIF Images on Europeana
  • 33. o CIDOC CRM o Pros: strong foundational ontology, used by numerous projects especially in Europe. o Cons: many consider it complicated, some shortcomings for describing relations between people and between objects, not friendly for integrating with other ontologies, the community (SIG) is slow to adopt practically important issues, few application profiles for specific kinds of objects (e.g. coins vs paintings). o linked.art o Pros: a simplified CRM profile created under the moniker "Linked Open Usable Data (LOUD)", more developer friendly through an emphasis on JSONLD, used by some projects especially in the US. o Cons: various simplifications that are not vetted by the CRM SIG, rift with European CRM developments. Most Relevant Museum Ontologies o Schema.org o Pros: supported by the major search engines thus ensures semantic SEO and findability, used by the largest amount of LOD (on billions of websites), pragmatic and collaborative process for data modeling with a lot of examples, possible extensions as exemplified by bibliographic (SchemaBibEx) and archival extension. o Cons: not yet proven it is sufficient to represent rich museum data o Wikidata o Pros: universal platform for data integration, richer model than RDF (but also exposed as RDF), pragmatic and versatile collaborative process for data modeling (property creation) with a lot of examples and justifications, used by some GLAMs and crowd-sourced projects (e.g. Authority Control, Sum of All Paintings, Wiki Loves Monuments). o Cons: institutional endorsement is not yet strong enough, concerns of institutions how they can be masters of "their own" data.
  • 34. CIDOC CRM o Conceptual Reference Model (CRM) o By ICOM, International Committee for Documentation (CIDOC), CRM SIG o In development for 17 years (since 1999) o Standardized as ISO 21127:2006 in 2006, continues to evolve o Current version: CRM 6.2.1 (Oct 2015), version in progress CRM 6.2.3 (May 2018). o Foundational ontology for history, archeology and art. o About 85 classes o About 285 properties (140 object properties and their inverses, and a few that don’t have inverses)
  • 37. CRM Graphical: Mark and Inscription Information (part 1)
  • 38. Apply CRM: Model Coins o E22_Man-Made_Object o standardized P2_has_type (e.g. Coin from AAT or more specific from Nomisma) o P56_bears_feature E25_Man-Made_Feature o P43_has_dimension E54_Dimension with P2_has_type (e.g. die axis), P91_has_unit (e.g. "o'clock"), P90_has_value o E25_Man-Made_Feature o standardized P2_has_type: Obverse or Reverse o P65_carries_visual_item E38_Image (e.g. of a ruler) and/or E34_Inscription (text) o E38_Image o P138_represents (e.g. some ruler from ULAN, or e.g. "laurel wreath" from AAT) o E34_Inscription o P3_has_note "the text" o and P72_has_language (e.g. Latin from AAT) o optionally P73_has_translation to another Linguistic Object
  • 39. CRM Time Spans CRM property Meaning Latin phrase Meaning P82a_begin_of_the_begin started after this moment terminus post quem limit after which P81a_end_of_the_begin started before this moment terminus a quo limit from which P81b_begin_of_the_end finished after this moment terminus ad quem limit to which P82b_end_of_the_end finished before this moment terminus ante quem limit before which
  • 40. CRM Extensions o FRBRoo: bibliographic information following FRBR principles (Work-Expression- Manifestation-Item), artistic performances and their recordings o PRESoo: periodic publications o DoReMus: music and performances o CRMdig: digitization processes and provenance metadata o CRMinf: statements, argumentation, beliefs o CRMsci: scientific observations o CRMgeo: spatiotemporal modeling by integrating CRM to GeoSPARQL o Parthenos Entities: research objects, software, datasets o CRMeh (English Heritage): archeology o CRMarchaeo: archeology, excavation, stratigraphy o CRMba: buildings o CRMx: proposed extension for museum objects, including simple properties such as main depiction of an object, preferred title, extent, etc
  • 41. Outline o ONTO Intro, Products, Clients o ONTO Projects o CH Ontologies o CH Projects, Datasets
  • 42. o Started in 2008 o Has aggregated 53M objects at present o Perhaps 50-70 Europeana-related projects o Currently supported by Connecting Europe Facility as a Digital Service Infrastructure o Uses Europeana Data Model (EDM), an RDF ontology o General search and display mechanism o The search is not semantic (e.g. won't catch different multilingual names, unless they are included in enriched object data) o A set of fixed facets (including image characteristics). Europeana o Europe (and beyond) GLAM Networking o Foundation: does the work, ~50 staff o Association: elections, 2066 members, 75 countries, 19 from BG o Members Council (36, growing to 50): sets strategy o Task Forces: tech guidelines, temporary o Work Groups: tech guidelines, more permanent o Data Quality Council: reflects new strategy
  • 44. Europeana Collections: a "Personal Face"
  • 45. Europeana Food and Drink: ONTO sem app
  • 46. Europeana Labs: Galleries of Apps and Datasets
  • 47. Europeana Data Access: API, OAI PMH, SPARQL
  • 48. EDM: Typical Graph (from ONTO SPARQL endpoint)
  • 49. o Project o Started 2009, ongoing o Funded by Mellon Foundation o Led by the British Museum o Followed by Yale Center for British Art (YCBA) and Smithsonian American Art Museum (SAAM) o Initial implementation: ONTO, System Simulation o Current implementation: Metaphacts o Additional Involvement: FORTH, Delving British Museum ResearchSpace o VRE for Art Research o CIDOC CRM representation o Powerful semantic search, saved searches o Image annotation o Data basket o Argumentation o Intends to be a generic art research system that can be adapted for various needs and projects
  • 50. ResearchSpace: Map British Museum Data to CIDOC CRM
  • 51. Fundamental Relation "Thing From Place" CRM Semantic Search
  • 52. CRM Search: Hierarchical Query Expansion
  • 53. ConservationSpace: Core System for Conservationistso Project o Mellon Funding o Led by US National Gallery of Art o Implemented by Sirma Enterprise o Supported by ONTO o Uses GraphDB o Based on Sirma Enterprise Platform o Sirma MuseumSpace: curation/collection management, exhibition and loan management, conservation management... o Semantic integration, enrichment and publication of CH data o Digital Asset Management o Thesaurus Management o Paper-less office (Sirma GO Digital) o Contract management o ISO 9001 QMS document management
  • 54. o Project o 2-year project (Oct 2015-Nov 2017) o Mellon funded o 14 US museums and galleries o Publish their data to RDF o ONTO consulted on semantic mapping and data publishing o Worked alongside two Getty staff (semantic architect and data architect) o Publications o Lessons Learned in Building Linked Data for the American Art Collaborative, C.Knoblock et al, ISWC 2017: project challenges, volumetrics and semantic conversion experience o American Art Collaborative (AAC) Linked Open Data (LOD) Initiative: Overview and Recommendations for Good Practices. E. Fink, 2018 American Art Collaborative o Achievements o Aggregated artwork data from 14 institutions: 233,666 Objects, 28,882 Artists and 20,446 other agents (Related Parties) o Made about 15M triples. (For comparison, the British Museum semantic data comprises 2.5M objects and 960M triples.) o Used a harmonized data model so the data can be shown together. o Harmonized not only data models but also value sets to AAT o Linked per-institution artists to ULAN o Raised LOD awareness with the target institutions and a wider audience and mobilized inter-institutional collaboration. o Some of the institutions took charge of their transformations to establish a sustainable LOD publication process. o Created excellent use cases and UI mockups for browsing and exploration, e.g. comparing artists by style, material and genres; artwork timelines, etc.
  • 55. AAC Target Mapping: Actor Gender
  • 58. o Created as a post-product of AAC o Application profile for CRM i.e. a particular way of using CRM. o Created out of frustration with the complications of applying CRM, promoted under the moniker Linked Open Usable Data (LOUD). o Uses CIDOC-CRM as the core ontology, giving an event-based paradigm o Uses the Getty Vocabularies as core sources of identity, i.e. specific object types (e.g. painting), activity types (e.g. book binding, gilding, etching), title types (e.g. artists vs repository title), etc o JSON-LD as primary RDF serialization. Being JSON, it is more developer-friendly than other serializations. linked.art o Large number of examples (model components). Count per area: o 42 activity, o 1 concept, o 2 group, o 2 identifier, o 2 legal, o 1 name, o 46 object, o 12 person, o 6 place, o 7 set, o 11 text, o 2 value
  • 59. linked.art Representation of Traveling Exhibition: rdfpuml Diagram
  • 60. linked.art Traveling Exhibition: JSON-LD (left) vs Turtle (right)
  • 62. GVP LOD Project o Timeline o Art and Architecture Thesaurus (AAT): 2014-02 o Thesaurus of Geographic Names (TGN): 2014-08 o Union List of Artist Names (ULAN): 2015-03 o ONTO Services o Semantic/ontology development o Contributed to the ISO 25964 ontology (latest standard on thesauri), provided implementation experience, suggestions and fixes. o Published on varieties of Broader relations (BTG, BTP, BTI) o Complete mapping specification, comprehensive documentation o Helped implement R2RML scripts working off Getty's Oracle database, contribution to Perl implementation (RDB2RDF), R2RML extension (rrx:languageColumn) o GraphDB semantic repository, clustered for high-availability o Semantic application development (customized user interface), technical consulting o SPARQL 1.1 compliant endpoint, sample queries o Per-entity export files, explicit/total data dumps o Semantic dataset description (VOID) o Help desk / support on twitter and google group (continuing) o GVP LOD is widely regarded as a good example to be followed by GLAMs Jan
  • 64. GVP Specialized Hierarchical Relation Inference
  • 65. GVP Ordered Guide Term, represented as iso:ThesaurusArray
  • 68. If you have any questions or suggestions, please email Vladimir.Alexiev@Ontotext.com Thank you for your attention!

Notes de l'éditeur

  1. This is elevator pitch for our overall technology approach, proposition and applications
  2. GraphDB Workbench is the administrative interface shipped with the database. It gives the users an intuitive and powerful interface to the GraphDB Server. The Server exposes all database engine APIs. Unlike most of our competitors the engine allows easy extensibility and the development of Plugins. One such example are the Connectors, which synchronizes the internal RDF database model with external services like Lucene, SOLR, Elastic search