This document summarizes a presentation given by Mariana Damova on using semantic technologies and Europeana data. It discusses how Europeana data has been converted to RDF and loaded into the OWLIM semantic graph database. This allows linking Europeana data to other datasets to enable queries across multiple sources. Examples of queries over Europeana and other cultural heritage data are provided. Future work on projects like Europeana Creative is also mentioned.
Ontotext is a leading provider of semantic technology solutions, including their OWLIM semantic database. OWLIM can handle large RDF datasets with scalable reasoning and high performance querying. It has been used successfully in cultural heritage domains by organizations like the BBC and British Museum to power semantic search and knowledge bases. OWLIM supports RDFS/OWL reasoning, spatial indexing, replication for high availability, and ranking of nodes based on semantic relationships.
This presentation introduces the OWLIM semantic repository at the DM2E project meeting held in Vienna in November 2012. Ontotext joined the DM2E consortium as an associated partner.
Challenges on modeling annotations in the Europeana Sounds project by Hugo Manguinhas
Presented at iAnnotate16 (http://iannotate.org/)
Cultural heritage institutions are looking at crowdsourcing as a new opportunity to improve the overall quality of their data and to contribute to better semantic description and linking to the web of data. This is also the case for Europeana, where crowdsourcing in the form of annotations is envisioned and being worked on in several projects. As part of the EU Europeana Sounds project (http://www.europeanasounds.eu/), we have identified the user stories and requirements that cover the following annotation scenarios: open and controlled tagging; geotagging; enrichment of metadata; annotation of media resources; linking to other objects; moderation and general discussions.
Central to all the efforts around annotations is an agreement on how they should be modelled uniformly across these scenarios, since it is essential to bring such information to Europeana in a way that can also be easily exploited and shared beyond our portal. For this, we are using the recent W3C Web Annotation Data Model (WADM), supported by the Open Annotation community, as it is the most promising model at the moment.
Because of the model's flexible design and early stage of development, there are currently insufficient recommendations on how some of our user stories and requirements can be modelled. In our presentation we will make proposals on how the WADM can be applied to these scenarios, and we are looking for discussion and feedback from the community in the hope that it will help cultural heritage institutions and other communities to better understand how annotations can be modelled.
Exploring comparative evaluation of semantic enrichment tools for cultural he... by Hugo Manguinhas
Semantic enrichment of metadata is an important and difficult problem for digital heritage efforts such as Europeana. This presentation gives motivations and presents the work of a recently completed Task Force that addressed the topic of evaluation of semantic enrichment. We especially report on the design and the results of a comparative evaluation experiment, where we have assessed the enrichments of seven tools (or configurations thereof) on a sample benchmark dataset from Europeana.
Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using... by Hugo Manguinhas
This document discusses linking subject labels from cultural heritage metadata to terms in the MIMO vocabulary using the CultuurLink tool. Several cultural institutions participated in an experiment to align terms from their audio collections to MIMO. They extracted subject vocabularies from the metadata and performed alignments using CultuurLink. The results found that exact string matching identified about 50% of alignments, while more complex strategies discovered additional alignments through techniques like stemming and fuzzy matching. The experiment provided insights into CultuurLink's capabilities and coverage of the MIMO vocabulary for enriching metadata.
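The matching strategies mentioned in this summary (exact string matching first, then stemming and fuzzy matching) can be illustrated with a minimal Python sketch. This is not CultuurLink's implementation: the `align` function, the crude plural-stripping stemmer, the 0.85 similarity cutoff, and the sample vocabulary are all assumptions made for illustration, and the terms are not taken from MIMO.

```python
import difflib


def normalize(label: str) -> str:
    """Lower-case a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def crude_stem(label: str) -> str:
    """Very rough stemmer (assumption): strip a single trailing 's'."""
    return label[:-1] if len(label) > 3 and label.endswith("s") else label


def align(label, vocabulary, cutoff=0.85):
    """Return (vocabulary_term, strategy) for a label, or None if nothing matches.

    Strategies are tried from strictest to loosest: exact, stem, fuzzy.
    """
    norm = normalize(label)
    by_norm = {normalize(term): term for term in vocabulary}
    if norm in by_norm:
        return by_norm[norm], "exact"

    by_stem = {crude_stem(key): term for key, term in by_norm.items()}
    stem = crude_stem(norm)
    if stem in by_stem:
        return by_stem[stem], "stem"

    # difflib ranks candidates by a similarity ratio between 0 and 1.
    close = difflib.get_close_matches(norm, list(by_norm), n=1, cutoff=cutoff)
    if close:
        return by_norm[close[0]], "fuzzy"
    return None


# Hypothetical sample vocabulary, not actual MIMO terms.
SAMPLE_VOCABULARY = ["violin", "flute", "drum", "accordion"]

if __name__ == "__main__":
    for subject in ["Violin", "flutes", "acordion", "piano"]:
        print(subject, "->", align(subject, SAMPLE_VOCABULARY))
```

Trying the strictest strategy first keeps the looser, more error-prone matches confined to labels that exact matching could not resolve.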
Designing a multilingual knowledge graph - DCMI2018 by Antoine Isaac
Presentation for the paper "Designing a multilingual knowledge graph as service for cultural heritage" at the DCMI2018 conference https://www.dublincore.org/conferences/2018/abstracts/#559
Are you talking to me? Researching a scenario for linking objects and publica... by Ellen Van Keer
Presentation of the workflow designed in the project "bridging knowledge collections", which aims to integrate an institutional repository with the online catalogs of the museum objects and library publications kept in the RMAH.
The document discusses modelling and exchanging annotations for Europeana projects. It proposes adopting the W3C Web Annotation Data Model to represent annotations in RDF using JSON-LD serialization. An Annotations API based on the W3C Web Annotation Protocol allows exchanging annotations between Europeana and platforms like HistoryPin.org and Pundit. Representing metadata annotations is also discussed to make them machine-readable and shareable across interfaces. Overall, modelling annotations interoperably and exchanging them across platforms is still a work in progress.
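As a rough illustration of what an annotation in the W3C Web Annotation Data Model looks like when serialized as JSON-LD, the following Python sketch builds a simple tagging annotation. The `@context` IRI, the `Annotation` and `TextualBody` types, and the `tagging` motivation come from the W3C model; the helper name `make_tag_annotation` and the target IRI are hypothetical, and the exact profile accepted by Europeana's Annotations API may differ from this sketch.

```python
import json


def make_tag_annotation(target_iri: str, tag_text: str, language: str = "en") -> dict:
    """Build a minimal Web Annotation (JSON-LD) that tags a target resource."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "tagging",
        "body": {
            "type": "TextualBody",
            "value": tag_text,
            "language": language,
        },
        # Hypothetical target: the IRI of the annotated item.
        "target": target_iri,
    }


if __name__ == "__main__":
    annotation = make_tag_annotation("http://example.org/item/123", "folk music")
    print(json.dumps(annotation, indent=2))
```

Because the result is plain JSON-LD, the same document can be read as ordinary JSON by a web client or interpreted as RDF by a triple store.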
Semantic Interoperability at Europeana - MultilingualDSIs2018 by Antoine Isaac
Europeana is a digital platform containing over 58 million digitized cultural heritage objects from 3,700 institutions across 44 countries. The document discusses Europeana's efforts to improve semantic interoperability between these diverse datasets by developing the Europeana Data Model, enriching metadata by linking to external vocabularies, and building an Entity Collection and API to provide centralized access to contextual information about places, people, concepts, and organizations. The goal is to enable richer discovery, exploration, and reuse of Europeana's cultural heritage data on the web.
ArCo, the knowledge graph of the Italian Cultural Heritage: new developments by ArcoProject
Here are a few things you can find out about your Heteropterys chrysophylla specimen from ArCo:
- Who originally classified the species (hasDeterminer property)
- If it is designated as a holotype or not (isHolotype property)
- What taxonomic revisions it has undergone, including dates (hasBotanicalRevision property)
- Any taxon it is associated with, and the primary/original taxon would be the first one listed (hasAssociatedTaxon property)
You could query ArCo's SPARQL endpoint to get specific information about your specimen, such as:
PREFIX arco: <https://w3id.org/arco/ontology/>
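The query itself is cut off in the text above. As a hedged sketch of how such a query might be assembled, the Python snippet below builds a SPARQL string that asks for the four properties named in the list. The prefix and property names are copied from the text as written and have not been checked against the published ArCo ontology; the specimen IRI and the helper name `build_specimen_query` are hypothetical.

```python
# Prefix and property names as they appear in the text above (unverified).
ARCO_PREFIX = "https://w3id.org/arco/ontology/"
PROPERTIES = [
    "hasDeterminer",
    "isHolotype",
    "hasBotanicalRevision",
    "hasAssociatedTaxon",
]


def build_specimen_query(specimen_iri: str) -> str:
    """Return a SPARQL SELECT query listing the named properties of one specimen."""
    property_terms = " ".join("arco:" + name for name in PROPERTIES)
    lines = [
        "PREFIX arco: <" + ARCO_PREFIX + ">",
        "SELECT ?property ?value WHERE {",
        "  VALUES ?property { " + property_terms + " }",
        "  <" + specimen_iri + "> ?property ?value .",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical specimen IRI, used only to show the shape of the query.
    print(build_specimen_query("http://example.org/specimen/1"))
```

The resulting string could then be sent to a SPARQL endpoint with any HTTP client; the endpoint address is deliberately left out here because the text does not give it.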
Ontotext is a software company that works on semantic technologies and knowledge graphs for cultural heritage and digital humanities projects. They have developed applications like ResearchSpace for museums and ConservationSpace for conservation specialists. They also contribute linked open data to projects like Europeana, Wikidata, and DBpedia and provide consulting services to institutions like museums and national aggregators.
The document describes the Old Maps Online project, which aims to create a global search portal for geo-referenced historical maps. It is funded by JISC and involves the University of Portsmouth, Klokan Technologies GmbH, the British Library, and the National Library of Scotland. The portal allows users to intuitively search maps by location and view high-resolution zoomable maps from participating institutions with proper attribution. The project seeks to promote map libraries and make more maps discoverable online.
Hungarian National Digital Archive and Hungarian participation in Europeana by EuropeanaLocal Project
Monguz Ltd. is a Hungarian company that develops software for libraries and museums. It has over 200 partners in Hungary, Romania, and Slovakia. Monguz participates in Europeana by aggregating content from Hungarian institutions. The Hungarian government is launching a new initiative called MaNDA (Hungarian National Digital Archive) to standardize metadata and contribute more Hungarian content to Europeana between 2011 and 2014. MaNDA will establish new aggregators, digitize additional collections beyond the public domain, and harvest content from neighboring countries.
A Framework for Improved Access to Museum Databases in the Semantic Web by Mariana Damova, Ph.D
This paper presents a framework for processing museum databases according to a set of interlinked ontologies, including CIDOC-CRM, and loading them into a reason-able view of the web of data, with additional links to datasets from the LOD cloud. The infrastructure allows the data to be accessed via SPARQL queries and the query results to be verbalized in natural language using the GF (Grammatical Framework) formalism, which gives access to 18 natural languages.
1) The document discusses a vision for linking CRIS (Current Research Information Systems) and OAR (Open Access Repositories) data as Linked Data to improve interoperability in scholarly communication.
2) It proposes using common vocabularies and assigning persistent URIs to entities like publications, authors, projects and organizations to reduce duplication and improve data sharing across domains.
3) Next steps include adopting the KE CRIS-OAR data model and vocabulary and exploring how Linked Data approaches could be used to link publications, research data, and CRIS and OAR domains as part of the OpenAIREplus project.
Vladimir Alexiev presented ResearchSpace, a virtual research environment (VRE) based on the CIDOC CRM ontology. ResearchSpace aims to provide tools and services to support collaborative research projects for cultural heritage scholars. It aggregates data from various sources using semantic technologies and the CIDOC CRM ontology, allows semantic search of the data based on fundamental relations, and includes features for data analysis, collaboration, and web publication. The presentation provided an overview of Ontotext, the company developing ResearchSpace, described some of ResearchSpace's key capabilities, and discussed how the CIDOC CRM is central to ResearchSpace's approach.
Linked Open Europeana aims to leverage Europe's cultural heritage through linked open data. It presents Europeana, an online library of European culture. Europeana started in 2005 and now contains over 15 million objects. Linked open data extends the web by adding semantics through RDF triples and linking disparate data sources. Europeana's data model EDM aims to publish cultural heritage data as linked open data to enable novel uses like knowledge generation by combining data with the rest of the semantic web. For Europeana to fully realize its potential, its data needs to be openly available as linked open data under an open license.
This document summarizes Corey Harper's presentation on metadata and linked data. It discusses publishing and consuming linked bibliographic and cultural heritage data from libraries, archives, and museums. Examples are given of projects linking data from different institutions to provide richer context and enable new discovery experiences for users. Emerging roles for metadata experts are described, including curating linked open data on the web and collaborating with developers.
Infraestrutura para a Ciência Aberta na Europa - OpenAIRE: O poder dos reposi... by Pedro Príncipe
This document discusses the power of repositories as infrastructure for open science. It notes that individual repositories have value for their institutions, but that their true value lies in their potential for interconnection to create a unified network providing access to research results. This network requires open access content and interoperability between repositories. OpenAIRE is presented as working to realize this potential through services that support content enrichment, notifications to repositories of relevant research, and usage statistics. Funders are also integrating with OpenAIRE to help monitor open access compliance and the impact of research funding.
Values & Vision - Cloud Sandboxes for BIG Earth Sciences by terradue
Terradue is an Italian SME focused on providing cloud services for earth science research. They have developed an open platform to help scientists access and analyze large datasets through web and cloud technologies. Their goal is to stimulate new scientific applications and help researchers adapt to increasing data volumes. The platform allows scientists to share data access points, processing chains, and collaborate across distributed systems delivered as a service. Terradue is focusing on new services like data and software as a service to create marketplaces and leverage linked open data. They are also exploring how to use analytics and human resources like data scientists to help optimize the platform.
This slide deck was prepared for a workshop on Linked Data Publishing and Semantic Processing using the Redlink platform (http://redlink.co). The workshop, delivered at the Department of Information Engineering, Computer Science and Mathematics at Università degli Studi dell'Aquila, aimed to provide a general understanding of Semantic Web technologies and how they can be used in real-world use cases such as Salzburgerland Tourismus.
A brief introduction is also included on MICO (Media in Context), a research project part-funded by the European Union that provides cross-media analysis solutions for online multimedia producers.
LoCloud: Local Cultural Heritage Online and in the Cloud by locloud
LoCloud's goal is to make content from small cultural institutions accessible online through Europeana. It has developed services like LoCloud Collections and aggregation tools to help institutions digitize and publish their collections. LoCloud Collections allows institutions to set up online catalogs and websites for their collections using open source software. Aggregation tools allow metadata to be harvested and enriched before being added to Europeana. LoCloud has also made microservices and APIs available to developers to create new applications using cultural data.
The British Library, London: Old Maps Online by Petr Pridal
Petr Pridal is the managing director of Klokan Technologies GmbH, a small Swiss company that develops geospatial applications for cultural heritage institutions. Klokan Technologies created Old Maps Online, a portal that allows users to search historical maps from libraries around the world by geographic location. The project aims to make these maps more accessible online and promote the map collections of participating institutions. Klokan Technologies is seeking help from institutions to contribute additional maps to the portal and provide feedback to further improve the technology.
The document summarizes the SESAM4 project, which aims to lower barriers for small and medium companies to exploit semantic systems. The project developed open source software, best practices, and tools based on semantic technologies and linked open data. It had 10 partners and was funded for 3 years to work on topics like ontology development, content management integration, and demonstrator applications in tourism.
This document provides an overview of linked open data (LOD), ontologies, and cultural heritage projects and datasets. It begins with definitions of semantic technologies, ontologies, and LOD. It then discusses several key ontologies used for cultural heritage data, including CIDOC CRM, Schema.org, and Wikidata. The document outlines several LOD projects focused on cultural heritage, including Europeana, ResearchSpace, American Art Collaborative, and ConservationSpace. It provides examples of semantic searches and annotations using these ontologies and projects. In summary, the document introduces the concepts of LOD and relevant ontologies, and highlights several major cultural heritage projects that apply these semantic technologies.
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and... by DeVonne Parks, CEM
This document discusses Europeana's use of semantic web technologies and linked data to improve access to cultural heritage collections. It summarizes that Europeana aggregates metadata from various cultural institutions to provide access to over 48 million digitized objects. It has implemented the Europeana Data Model to represent metadata in a more granular, semantically linked way using vocabularies like GeoNames, DBpedia, and AAT. This has enabled automatic enrichment of metadata as well as multilingual and conceptual searching. Linked open data approaches provide technical and strategic benefits to Europeana by facilitating data sharing and enrichment across domains.
The document discusses several projects related to open metadata and linked data including:
1. The AIM25 project which aggregates archive descriptions from 123 partners and aims to test the value of linked data.
2. The COMET project which is releasing a large subset of bibliographic records under an open license and working to convert them to linked open data.
3. The Jerome project which harvests and unifies data from several library systems, supplements it with open data, and provides fast search APIs.
MuseoTorino, first italian project using a GraphDB, RDFa, Linked Open Data by 21Style
MuseoTorino is the first Italian project using Web 3.0 technologies: a NoSQL graph database (Neo4j), RDFa, and Linked Open Data.
MuseoTorino is a 21style (www.21-style.com) project for the municipality of Torino, Italy.
These slides come from CodeMotion, the best Italian conference for developers and IT enthusiasts!
The document discusses several projects related to open metadata and linked data including:
1. The AIM25 project which aggregates archive descriptions from 123 partners and aims to test the value of linked data.
2. The COMET project which is releasing a large subset of bibliographic records under an open license and working to convert them to linked open data.
3. The Jerome project which harvests and unifies data from several library systems, supplements it with open data, and provides fast search APIs.
MuseoTorino, first italian project using a GraphDB, RDFa, Linked Open Data21Style
MuseoTorino, is the first italian project using Web 3.0 tecnologies. NOSQL-GraphDB (Neo4J), RDFa, Linked Open Data.
MuseoTorino is a 21style (www.21-style.com) project for the municipality of Torino, Italy.
These slides come from CodeMotion, the best Italian conference for developers and IT entusiast !
The Europeana Cloud project aims to build a shared digital infrastructure for cultural heritage institutions in Europe. The project has 35 partner institutions and will develop tools and services for researchers. It will ingest 2.4 million metadata records and 5 million content items. Work Package 1 focuses on engaging with humanities and social sciences researchers to understand their needs and inform the development of the cloud infrastructure and services. Activities include forming an advisory board, conducting surveys, and holding expert forums to help define a research content strategy and user requirements. The goal is to better support research through aggregation of content in the cloud.
Europeana Cloud Work Package 1: Assessing Researchers' Needs in the CloudTU Delft, Netherlands
A presentation given about Work Package 1 of the Europeana Cloud project http://pro.europeana.eu/web/europeana-cloud
By Agiatis Bernadou and Alastair Dunning
Given at http://dighumlab.dk/news/single-news/artikel/cfp-cultural-heritage-creative-tools-and-archives-workshop/, June 2013
Managing director of Klokan Technologies GmbH, a small Swiss company that develops innovative geo applications for cultural heritage institutions. The document discusses Old Maps Online, a project that provides an easy-to-use gateway for searching historical maps from libraries around the world. It allows users to search maps by geographic location on an interactive world map and view high resolution maps from contributing institutions with proper crediting back to the libraries. The project is open to additional map contributors and uses tools like BoundingBox and Georeferencer to help enrich map metadata.
The Galleries, Libraries, Archives and Museums (GLAM) sector deals with complex and varied data. Integrating that data, especially across institutions, has always been a challenge. Semantic data integration is the best approach to deal with such challenges. Linked Open Data (LOD) enable large-scale Digital Humanities (DH) research, collaboration and aggregation, allowing DH researchers to make connections between (and make sense of) the multitude of digitized Cultural Heritage (CH) available on the web. An upsurge of interest in semtech and LOD has swept the CH and DH communities. An active Linked Open Data for Libraries, Archives and Museums (LODLAM) community exists, CH data is published as LOD, and international collaborations have emerged. The value of LOD is especially high in the GLAM sector, since culture by its very nature is cross-border and interlinked. We present interesting LODLAM projects, datasets, and ontologies, as well as Ontotext's experience in this domain. An extended paper on these topics is also available. It has 77 pages, 67 figures, detailed info about CH content and XML standards, Wikidata and global authority control.
xDams and the Reload Project at "Italian lectures on semantic web and linked ...regesta_com
Le slide di Silvia Mazzini di regesta.exe sui Linked Data in ambito archivistico. Intervento sul progetto Reload e xDams alla giornata di lavoro organizzata dall' American University of Rome il 7maggio 2014. Regesta speech by Silvia Mazzini at American University of Rome workshop: "archival resources into the web of data"
Hopsworks - ExtremeEarth Open WorkshopExtremeEarth
This document summarizes a presentation about the three-year ExtremeEarth project. It discusses the ExtremeEarth platform architecture, which brings together Earth observation data access from DIASes, end-user products from TEPs, and scalable AI capabilities from Hopsworks. The architecture provides infrastructure on Creodias and uses Hopsworks to develop end-to-end machine learning pipelines for processing petabytes of Earth observation data. Results have been exploited through additional research projects and a product offering on Hopsworks.ai. The project has also led to several publications and blog posts about applying AI to Earth observation data.
Similaire à Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden (20)
"Semantic Integration Is What You Do Before The Deep Learning". dev.bg Machine Learning seminar, 13 May 2019.
It's well known that 80% of the effort of a data scientist is spent on data preparation. Semantic integration is arguably the best way to spend this effort more efficiently and to reuse it between tasks, projects and organizations. Knowledge Graphs (KG) and Linked Open Data (LOD) have become very popular recently. They are used by Google, Amazon, Bing, Samsung, Springer Nature, Microsoft Academic, AirBnb… and any large enterprise that would like to have a holistic (360 degree) view of its business. The Semantic Web (web 3.0) is a way to build a Giant Global Graph, just like the normal web is a Global Web of Documents. IEEE already talks about Big Data Semantics. We review the topic of KGs and their applicability to Machine Learning.
Invited report at Digital Presentation and Preservation of Cultural and Scientific Heritage (DIPP 2018), Burgas, Bulgaria. Report: http://dipp.math.bas.bg/images/2018/019-050_32_11-iDiPP2018-34.pdf
Semantic Archive Integration for Holocaust Research: the EHRI Research Infras...Vladimir Alexiev, PhD, PMP
The European Holocaust Research Infrastructure (EHRI) is a large-scale EU project that involves 23 institutions and archives working on Holocaust studies, from Europe, Israel and the US. In its first phase (2011-2015) it aggregated archival descriptions and materials on a large scale and built a Virtual Research Environment (portal) for Holocaust researchers based on a graph database.
In its second phase (2015-2019), EHRI2 seeks to enhance the gathered materials using semantic approaches: enrichment, coreferencing, interlinking. Semantic integration involves four of the 14 EHRI2 work packages and helps integrate databases, free text, and metadata to interconnect historical entities (people, organizations, places, historic events) and create networks. We will present some of the EHRI2 technical work, including critical issues we have encountered.
WP10 (EAD) converts archival descriptions from various formats to standard EAD XML; transports EADs using OAI PMH or ResourceSync; ingests EADs to the EHRI database; enables use cases such as synchronization; coreferencing of textual Access Points to proper thesaurus references
WP11 (Authorities and Standards) consolidates and enlarges the EHRI authorities to render the indexing and retrieval of information more effective. It addresses Access Points in ingested EADs (normalization of Unicode, spelling, punctuation; deduplication; clustering; coreferencing to authority control), Subjects (deployment of a Thesaurus Management System in support of the EHRI Thesaurus Editorial Board), Places (coreferencing to Geonames); Camps and Ghettos (integrating data with Wikidata); Persons, Corporate Bodies (using USHMM HSV and VIAF); semantic (conceptual) search including hierarchical query expansion; interconnectivity of archival descriptions; permanent URLs; metadata quality; EAD RelaxNG and Schematron schemas and validation, etc.
WP13 (Data Infrastructures) builds up domain knowledge bases from institutional databases by using deduplication, semantic data integration, semantic text analysis. It provides the foundation for research use cases on Jewish Social Networks and their impact on the chance of survival.
WP14 (Digital Historiography Research) works on semantic text analysis (semantic enrichment), text similarity (e.g. clustering based on Neural Networks, LDA, etc), geo-mapping. It develops Digital Historiography researcher tools, including Prosopographical approaches.
How GLAMs can use Wikipedia/Wikidata to make their collections globally accessible across languages.
Europeana Food and Drink content providers workshop, Athens, 18 May 2015
Wikidata is being considered as a target for Europeana's semantic strategy to enrich metadata and link cultural heritage objects. Wikidata provides multilingual information on entities like people, organizations, places and concepts from Wikipedia. It has potential to help solve Europeana's challenges around data diversity, quality and interlinking due to its connections to other vocabularies and ability to perform automatic enrichment. Europeana aims to semantically enrich its data and link objects to Wikidata entities to provide additional context. Content providers are also encouraged to contribute information to Wikidata to further enhance the semantic knowledge graph.
Europeana Food and Drink annual meeting, 20 Mar 2015, Athens, Greece. Full report: http://vladimiralexiev.github.io/pubs/Europeana-Food-and-Drink-Classification-Scheme-(D2.2).pdf
Information School, University of Washington, 2014-05-21: INFX 598 - Introducing Linked Data: concepts, methods and tools. Guest lecture (Module 9) "Doing Business with Semantic Technologies": Introduction to Ontotext and some of its products, clients and projects.
Also see video: https://voicethread.com/myvoice/#thread/5784646/29625471/31274564
Triplestores and inference, applications in Finance, text-mining. Projects and solutions for financial media and publishers.
Keystone Industrial Panel, ISWC 2014, Riva del Garda, 18 Oct 2014.
Thanks to Atanas Kiryakov for this presentation, I just cut it to size.
Full version of http://www.slideshare.net/valexiev1/gvp-lodcidocshort. Same is available on http://vladimiralexiev.github.io/pres/20140905-CIDOC-GVP/index.html
CIDOC Congress, Dresden, Germany
2014-09-05: International Terminology Working Group: full version.
2014-09-09: Getty special session: short version
CIDOC Congress, Dresden, Germany
2014-09-05: International Terminology Working Group: full version (http://vladimiralexiev.github.io/pres/20140905-CIDOC-GVP/index.html)
2014-09-09: Getty special session: short version (http://VladimirAlexiev.github.io/pres/20140905-CIDOC-GVP/GVP-LOD-CIDOC-short.pdf)
This document discusses semantic technologies for cultural heritage. It introduces Ontotext Corp, which develops semantic technology, and some of their projects involving cultural heritage data. These include the ResearchSpace project with the British Museum, projects involving Europeana like Bulgariana and Europeana Creative, and publishing Getty vocabularies as linked open data.
Nikola Ikonomov, Boyan Simeonov, Jana Parvanova and Vladimir Alexiev. In Digital Presentation and Preservation of Cultural and Scientific Heritage (DiPP 2013), Veliko Tarnovo, Bulgaria, Sep 2013.
Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...Vladimir Alexiev, PhD, PMP
Vladimir Alexiev, Dimitar Manov, Jana Parvanova and Svetoslav Petrov. In proceedings of workshop Practical Experiences with CIDOC CRM and its Extensions (CRMEX 2013) at TPDL 2013, 26 Sep 2013, Valetta, Malta
Jana Parvanova, Vladimir Alexiev and Stanislav Kostadinov. In workshop Collaborative Annotations in Shared Environments: metadata, vocabularies and techniques in the Digital Humanities (DH-CASE 2013). Collocated with DocEng 2013. Florence, Italy, Sep 2013.
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
1. SEMANTIC TECHNOLOGIES FOR CULTURAL
HERITAGE
VLADIMIR.ALEXIEV@ONTOTEXT.COM
2014-08-21, MALMO, SWEDEN
2D interactive version, pdf, slideshare.
Press O for overview, H for help.
Proudly made in plain text with reveal.js, org-reveal, org-mode and emacs.
2. TABLE OF CONTENTS
Semantic Technologies
Ontotext Corp
Some Ontotext Products
Ontotext GLAM Projects
Europeana Creative
Europeana Food and Drink
Bulgariana
Ontotext / GraphDB in CH
ResearchSpace
Getty Vocabularies LOD
Possible Future Topics
3. SEMANTIC TECHNOLOGIES
Web 1.0: hyperlinked documents (World Wide Web)
Web 2.0: interactive applications, the Social Web
Web 3.0: interlinked data (Global Giant Graph)
Is this something new?
It was all envisioned by Sir Tim Berners-Lee 25 years ago
Standardized by W3C: both HTML and sem web standards
(RDF, RDFS, OWL, SPARQL…)
Great flurry of sem tech activity in the last 15 years
Buzzwords: Big Data, Semantic Analytics, Concept Extraction,
Sentiment Analysis…
4. LINKED OPEN DATA CLOUD
Parts at factforge.net, linkedlifedata.com. Grown 10x since Mar 2009!
10. ONTOTEXT CORP
World Leading semantic technology developer
Working in this area since 2000 as part of Sirma Group
Spun off in 2008 after venture investment (NEVEQ)
75 employees: Bulgaria (Sofia and Varna), UK, USA
(Washington DC)
Global leader in semantic databases, semantic annotation
and search
Proven Delivery
Highest profile sem web applications
BBC: World Cup 2010, London Olympics 2012, all of BBC
sport…
Dynamic Semantic Publishing: Master Publishing platform
Semantic search for multinational pharmaceuticals (eg
AstraZeneca)
Stable and Growing, both staff and revenue
13. CURRENT RESEARCH PROJECTS
EUCLID: Educational Curriculum for the usage of Linked Data
Professional training curriculum for data practitioners
aiming to use Linked Data in their daily work.
Strongly relevant to CH metadata specialists and other
experts focusing on Linked Open Data
AnnoMarket: Cloud-Based Text Annotation Marketplace
Open marketplace for pay-as-you-go, cloud-based
extraction resources and services
Multilingual semantic entity extraction from CH text (e.g.
museum object descriptions) is important and largely
unsolved
LDBC: Linked Data Benchmark Council
NPO for publishing and auditing benchmark results for
graph and RDF databases.
CH institutions that decide to use repositories require such
info, and can provide meaningful use cases
14. CURRENT RESEARCH PROJECTS (2)
Europeana Creative: Re-use of cultural heritage metadata and
content by the creative industries.
Contribution to improving the usefulness and kick-starting
the professional use of Europeana data
Ontotext plays a core technological role, helping to fulfill 3
Europeana technical KPIs
Europeana Food and Drink: explore and celebrate European
cultural identity through its culinary and social history
Ontotext works on culinary culture classification scheme,
semantic representation and storage, semantic text
analysis, and semantic application
15. CURRENT RESEARCH PROJECTS (3)
MultiSensor: Multidimensional content integration
Mine heterogeneous content using multilingual
technologies with sentiment, social and spatiotemporal
competence
Application of Linguistic Linked Data
Relevant to text and multimedia CH content
DaPaaS: Data Publishing through the Cloud
Data- and Platform-as-a-Service Approach for Efficient
Data Publication and Consumption
Useful for converting and hosting your Linked Open Data,
and implementing Open Data Portals
Pheme: Computing Veracity Across Media, Languages, and
Social Networks
16. SOME ONTOTEXT PRODUCTS
GraphDB (OWLIM)
KIM Semantic Annotation
Master Publishing Platform
PROTON Ontology
17. GRAPHDB (OWLIM)
High-performance semantic repository created by Ontotext
Reasoning and query evaluation are performed over a
persistent storage layer.
Loading, reasoning and query evaluation are fast even against
complex ontologies and huge knowledge bases
Can manage billions of statements on desktop hardware, 10s
of billions on commodity server hardware
Pure Java implementation, ensuring ease of deployment and
portability
Compatible with Sesame (OpenRDF), which brings
interoperability benefits and support for all major RDF
syntaxes and query languages
Compatible with Jena through a built in adapter layer
Enterprise-grade
Used by important commercial clients (see slide above)
Found a great following in the CH domain (see later)
18. GRAPHDB FEATURES
High-performance reasoning: RDFS, OWL-Horst, OWL2 RL,
QL
Custom rule-sets allow tuning for optimal performance and
expressivity
Optimized owl:sameAs handling: dramatic improvements for
data integrated from multiple sources
Clustering: resilience, fail-over and scalable parallel query
processing
Geo-spatial extensions for fast geo queries over WGS84 data
Full-text search support, based on either Lucene or
proprietary search techniques
High-performance retraction of statements & inferences
Expressive consistency & integrity constraint checking
mechanisms
Notification mechanism, to allow clients to react to statements
in the update stream
19. NEW GRAPHDB FEATURES
GraphDB-Workbench with improved management
JMX-based management and control interfaces
Cluster deployment and testing tool
Cluster operational improvements
Explain Query Plans
Rule profiling
Support for external plug-ins. Loaded from the classpath,
handle custom functions & predicates
Connectors that synchronize RDF data to provide extremely
fast full-text and facet searches:
Elasticsearch GraphDB Connector
Lucene GraphDB Connector
Solr GraphDB Connector
20. KIM SEMANTIC ANNOTATION AND SEARCH
Built on top of GATE
Ontotext is the largest commercial contributor to GATE
Used by important commercial clients: BBC, UK Press
Association, NDP, Oxford University Press, Financial Times,
Euromoney…
Large-scale semantic annotation based on:
Assembling a semantic knowledge base of a domain
Creating annotation guidelines and a Gold Standard Corpus
Machine learning
Involves:
Named Entity Recognition
Semantic Disambiguation
Concept Extraction
Relation Extraction
Event Extraction
27. EUROPEANA CREATIVE
Enabling Creatives to Work with CH Data
Pilots by eCreative partners
Open challenges, growing to incubation support
Help with collection data, content reuse, Europeana APIs,
creative workshop ideas…
28. IN 5 PILOT AREAS: TOURISM, SOCIAL NETWORKS, DESIGN,
NATURE, HISTORY
29. ONTOTEXT IN EUROPEANA / EUROPEANA CREATIVE
Ontotext works on fundamental backend technologies
important for tech KPIs
30. EUROPEANA OAI AND SPARQL
Ontotext creates OAI PMH server for Europeana
So we or others can download objects in bulk
Ontotext hosts the Europeana semantic data (EDM) in OWLIM
http://europeana.ontotext.com/sparql
20M objects, obsolete: 1.5 years old
http://europeana-test.ontotext.com/sparql
20M objects, incomplete: working with Europeana to update it
Provides SPARQL querying
31. SPARQL 1.1 QUERIES
Eg Polish Periodicals by library and decade
http://europeana-test.ontotext.com/sparql
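# Standard namespaces; the endpoint may already predefine these prefixes
prefix dc: <http://purl.org/dc/elements/1.1/>
prefix edm: <http://www.europeana.eu/schemas/edm/>
prefix ore: <http://www.openarchives.org/ore/terms/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
# Count periodicals per decade, one column per data provider
select
  ?date
  (sum(?n1) as ?Uniwersytetu_Warszawskiego)
  (sum(?n2) as ?Politechniki_Lubelskiej)
  (sum(?n3) as ?Baltycka)
{
  ?x dc:type 'periodical'@en.
  ?x ore:proxyIn/edm:dataProvider ?dataProvider.
  ?x dc:date ?date2.
  # Decade = first three characters of the date followed by '0'
  bind (xsd:integer(concat(substr(?date2,1,3),'0')) as ?date)
  # One 0/1 flag per provider; summing the flags gives the counts
  bind (if(?dataProvider='e-biblioteka Uniwersytetu Warszawskiego',1,0) as ?n1)
  bind (if(?dataProvider='Biblioteka Cyfrowa Politechniki Lubelskiej',1,0) as ?n2)
  bind (if(?dataProvider='Bałtycka Biblioteka Cyfrowa',1,0) as ?n3)
} group by ?date order by ?date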
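The decade bucketing and per-library pivot used in the query above can be mirrored in a few lines of Python, which may help when checking results offline. This is only a minimal sketch: the sample rows and the function names decade and periodicals_by_decade are hypothetical, and only the three provider names are taken from the query.

```python
from collections import defaultdict

# Provider names copied from the query; everything else here is illustrative.
PROVIDERS = [
    "e-biblioteka Uniwersytetu Warszawskiego",
    "Biblioteka Cyfrowa Politechniki Lubelskiej",
    "Bałtycka Biblioteka Cyfrowa",
]


def decade(date):
    """Mirror xsd:integer(concat(substr(?date2,1,3),'0')): '1934-05-01' -> 1930."""
    return int(date[:3] + "0")


def periodicals_by_decade(rows):
    """rows: iterable of (data_provider, dc_date) pairs for periodicals.

    Returns {decade: [count per provider, in PROVIDERS order]} sorted by decade,
    like the sum(?n1), sum(?n2), sum(?n3) columns grouped by ?date.
    """
    counts = defaultdict(lambda: [0] * len(PROVIDERS))
    for provider, date in rows:
        row = counts[decade(date)]  # the decade group exists even for other providers
        if provider in PROVIDERS:
            row[PROVIDERS.index(provider)] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample data, not real Europeana records.
sample = [
    (PROVIDERS[0], "1934-05-01"),
    (PROVIDERS[0], "1938"),
    (PROVIDERS[2], "1931"),
    (PROVIDERS[1], "1905"),
    ("Some other library", "1905"),
]
result = periodicals_by_decade(sample)
```

As in the query, a record from any other provider still contributes its decade to the grouping but adds zero to all three columns.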
32. SPARQL ANALYTICS
Eg Polish Periodicals by library & decade (you can jsfiddle with it)
34. EUROPEANA FOOD AND DRINK
Europeana Food and Drink:
Explore and celebrate European cultural identity through its
culinary and social history
29 partners, of which perhaps 20 are content providers
Ontotext works on:
culinary culture classification scheme
semantic representation and storage
semantic text analysis
semantic application (pilot)
35. EDAMAM RECIPE/FOOD KNOWLEDGE BASE
Crawled 1.5M recipes, extracted ingredients, matched to SR23
enabling semantic search
39. RHYTON AT EUROPEANA OPEN CULTURE
Others make beautiful apps with your data! Bulgariana
Collection Featured in Open Culture
40. ONTOTEXT / GRAPHDB IN CH
Ontotext helped create some of the significant CH LOD datasets,
hosted on GraphDB:
British Museum (CRM): http://collection.britishmuseum.org/sparql
PSNC Polish Digital Library (CRM/FRBRoo): http://dl.psnc.pl
Europeana (EDM): http://europeana.ontotext.com
Getty AAT & TGN (SKOS, SKOS-XL…): http://vocab.getty.edu
JP LOD.AC: Japanese LOD initiative: http://lod.ac
FP7 CHARISMA: art database portal: http://archives-charisma-portal.eu/
FP7 3D COFORM: architectural and archaeological objects
ConservationSpace: system for conservation specialists
Comparing to:
FactForge (9 general LOD): http://www.factforge.net
LinkedLifeData (13 bio LOD): http://linkedlifedata.com
41. GRAPHDB REPO SIZES
Millions: objects, explicit statements, ex.st per object, total
statements; expansion ratio
Repo       Ontology    Obj   Ex.st  Ex.st/obj  Tot.st  Exp.  Nodes  Density  Reasoning
BM         CRM         2.0   195    90         916     4.7   54     17.0     rdfs+tran+
PSNC       CRM/FRBRoo  3.1   234    75         535     2.3   60     8.9      rdfs-subClass
Europeana  EDM         20.3  998    50         3798    3.8   266    14.3     owl-horst
Getty      SKOS etc    1.3   103    79         163     1.6   28     5.8      owl-horst
FF         DC, DBP     -     1673   -          3211    1.9   456    7.0      owl-horst
LLD        -           -     6706   -          10192   1.5   1554   6.6      rdfs+trans
References (Partial):
Large-scale Reasoning with a Complex Cultural Heritage
Ontology (CIDOC CRM), CRMEX 2013
OWLIM Reasoning over FactForge, ORE 2012
Transforming a Flat Metadata Schema to a Semantic Web
Ontology: The Polish Digital Libraries Federation and CIDOC
CRM Case Study. Studies in Computational Intelligence 2012
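The slide does not define the Exp. and Density columns of the repository table. The published figures are consistent with reading Exp. as total statements divided by explicit statements, and Density as total statements divided by nodes. The sketch below checks that reading; the formulas are an assumption inferred from the numbers, and the names REPOS and ratios are hypothetical.

```python
# Figures (in millions) copied from the repository table on this slide.
# name: (explicit statements, total statements, nodes)
REPOS = {
    "BM": (195, 916, 54),
    "PSNC": (234, 535, 60),
    "Europeana": (998, 3798, 266),
    "Getty": (103, 163, 28),
    "FF": (1673, 3211, 456),
    "LLD": (6706, 10192, 1554),
}


def ratios(explicit, total, nodes):
    """Return (expansion ratio, density), rounded to one decimal place.

    Assumed definitions: expansion = total / explicit, density = total / nodes.
    """
    return round(total / explicit, 1), round(total / nodes, 1)


derived = {name: ratios(*figures) for name, figures in REPOS.items()}

for name, (expansion, density) in derived.items():
    print(f"{name:10} expansion={expansion} density={density}")
```

Under this reading, the British Museum repository stands out: inference over CIDOC CRM multiplies its explicit statements by the largest factor in the table.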
43. RESEARCHSPACE
A Virtual Research Environment for art research
Funded by the Andrew Mellon Foundation
Executed by the British Museum
Software developed by Ontotext
Uses Ontotext's semantic database (GraphDB)
Papers:
Types and annotations for CIDOC CRM properties, DiPP 2012
Implementing CIDOC CRM search based on fundamental
relations and OWLIM rules, SDA 2012
Large-scale Reasoning with a Complex Cultural Heritage
Ontology (CIDOC CRM), CRMEX 2013
RDF data and image annotations in ResearchSpace, DH-CASE
2013
44. RESEARCHSPACE PRESENTATIONS AND VIDEOS
ResearchSpace website
News & Files, including presentations & papers
Videos by Dominic Oldman (British Museum)
Presentations by Barry Norton (BM, former Ontotext)
For example:
Oxford Summer School slides, Jul 2014
GLAMorous LOD and ResearchSpace introduction, Rijksmuseum, May 2014
GLAMorous LOD, NGA, Washington DC, Apr 2014
ResearchSpace, CIDOC CRM and Ethical data, UC London
ResearchSpace CIDOC CRM Search System, Apr 2013
CIDOC CRM Cultural Semantic Search using Fundamental Relationships, May 2013
Book of the Dead Project: using CIDOC-CRM, FRBRoo and RDFa
Querying Cultural Heritage Data
45. 2M BRITISH MUSEUM OBJECTS AS LOD
E.g. http://collection.britishmuseum.org/id/object/EOC3130
48. RESEARCHSPACE: SEMANTIC IMAGE ANNOTATION
Annotations can be arbitrary shapes (drawn with SvgEdit), support deep zoom, and can relate to semantic facts or to free discussion
49. CRM SEARCH (FUNDAMENTAL RELATIONS)
CRM data comprises complex graphs of nodes and properties.
How can a user search through such complex graphs?
The number of possible combinations is staggering
FC/FR Approach:
New Framework for Querying Semantic Networks (FORTH TR419, 2011)
Fundamental Categories and Relationships for intuitive querying CIDOC-CRM based repositories (FORTH TR-429, Apr 2012, 153 pages)
"Compresses" the semantic network by mapping networks of CRM properties to single FRs
FRs serve as a "search index" over the CRM semantic web
Allow the user to use a simpler query vocabulary
51. EXAMPLE: THING FROM PLACE
How a Thing's origin can be related to Place (* = recursion)
Thing (part of another)* considered to be "from" Place if:
  is formerly or currently located at Place (falling in another)*
  or was brought into existence (produced/created) by an Event (part of another)*
    that happened at Place (falling in another)*
    or was carried out by an Actor (who is member of a Group)*
      who formerly or currently has residence at Place (falling in another)*
      or was brought into existence (born/formed) by an Event (part of another)* that happened at Place (falling in another)*
  or was Moved to/from a Place (falling in another)*
  or changed ownership through an Acquisition (part of another)*
    that happened at Place (falling in another)*
54. THING FROM PLACE: SPARQL QUERY
# Namespace prefixes are omitted for brevity
select ?t ?p2 {
  ?t a FC70_Thing.
  # the Thing itself, or any larger Thing it is part of
  ?t (P46i_forms_part_of* | P106i_forms_part_of* | P148i_is_component_of*) ?t1.
  # located at Place
  {?t1 (P53_has_former_or_current_location | P54_has_current_permanent_location) ?p1}
  UNION
  # produced/created by an Event that happened at Place, or by an Actor from Place
  {?t1 P92i_was_brought_into_existence_by ?e1. ?e1 P9i_forms_part_of* ?e2.
    {?e2 P7_took_place_at ?p1}
    UNION
    {?e2 P14_carried_out_by ?a1.
     ?a1 P107i_is_current_or_former_member_of* ?a2.
      {?a2 P74_has_current_or_former_residence ?p1}
      UNION
      {?a2 P92i_was_brought_into_existence_by ?e3. ?e3 P9i_forms_part_of* ?e4.
       ?e4 P7_took_place_at ?p1}}}
  UNION
  # moved to/from Place
  {?t1 P25i_moved_by ?e5. ?e5 (P26_moved_to | P27_moved_from) ?p1}
  UNION
  # changed ownership through an Acquisition that happened at Place
  {?t1 P24i_changed_ownership_through ?e6.
   ?e6 P9i_forms_part_of* ?e7. ?e7 P7_took_place_at ?p1}.
  # the Place itself, or any larger Place it falls within
  ?p1 P89_falls_within* ?p2}
Very complex and expensive, especially when you need to
combine with other FRs into composite queries
Tried in 3D COFORM, just doesn't work
56. THING FROM PLACE: DECOMPOSING INTO SUB-FRS
"Sub-FRs" are auxiliary relations used to build up the final FR
The numbering comes from CRM property and entity names
Prefixes: FR = final result, FRT = transitive, FRX = non-transitive; FC70 (= Thing) or E = from/to that class
# self-loops and simple disjunctions
FRT_46i_106i_148i := (P46i|P106i|P148i)+
FRT_9i_10 := (P9i|P10)+
FRT_107i := P107i+
FRT_89 := P89+
FRX_53_54 := (P53|P54)
FRX_24i_25i := (P24i|P25i)
# growing fragments
FRX_92i := P92i | P92i/FRT_9i_10
FRX_92i_14 := FRX_92i/P14 | FRX_92i/P14/FRT_107i
FRX_FC70_E8_9_63 := FRX_92i_14/P92i | FRX_24i_25i
FRX_FC70_E8_9_63_P7 := FRX_FC70_E8_9_63/P7 | FRX_FC70_E8_9_63/FRT_9i_10/P7
FRX7 := FRX_53_54 | FRX_FC70_E8_9_63_P7 | FRX_92i_14/P74 | FRX_92i/P7
FRX7_P89 := FRX7 | FRX7/FRT_89
FR7 := FRX7_P89 | FRT_46i_106i_148i/FRX7_P89
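The decomposition above can be read as plain relation algebra: `|` is set union, `/` is sequential composition and `+` is transitive closure. The following Python sketch evaluates the sub-FRs bottom-up over a tiny in-memory graph. The toy data (handle, vase, painter and so on) is hypothetical, the helper names `compose` and `plus` are illustrative, and FRT_9i_10 is read as (P9i|P10)+, which is an assumption based on its name. It is a sketch of the idea, not the OWLIM rule implementation.

```python
from collections import defaultdict


def compose(r, s):
    """Sequential path r/s: (a, c) whenever (a, b) is in r and (b, c) is in s."""
    successors = defaultdict(set)
    for b, c in s:
        successors[b].add(c)
    return {(a, c) for a, b in r for c in successors.get(b, ())}


def plus(r):
    """Transitive closure r+ (one or more steps)."""
    closure = set(r)
    while True:
        new = compose(closure, r) - closure
        if not new:
            return closure
        closure |= new


# Hypothetical toy data: CRM property name -> set of (subject, object) pairs.
P = defaultdict(set)
P["P46i"] = {("handle", "vase")}          # handle forms part of vase
P["P92i"] = {("vase", "production")}      # vase was brought into existence by
P["P14"] = {("production", "painter")}    # production carried out by painter
P["P107i"] = {("painter", "workshop")}    # painter is member of workshop
P["P74"] = {("workshop", "Athens")}       # workshop has residence Athens
P["P89"] = {("Athens", "Greece"), ("London", "England")}
P["P53"] = {("coin", "London")}           # coin has location London

# Self-loops and simple disjunctions
FRT_46i_106i_148i = plus(P["P46i"] | P["P106i"] | P["P148i"])
FRT_9i_10 = plus(P["P9i"] | P["P10"])     # assumed reading of the name
FRT_107i = plus(P["P107i"])
FRT_89 = plus(P["P89"])
FRX_53_54 = P["P53"] | P["P54"]
FRX_24i_25i = P["P24i"] | P["P25i"]

# Growing fragments
FRX_92i = P["P92i"] | compose(P["P92i"], FRT_9i_10)
FRX_92i_14 = compose(FRX_92i, P["P14"]) | compose(compose(FRX_92i, P["P14"]), FRT_107i)
FRX_FC70_E8_9_63 = compose(FRX_92i_14, P["P92i"]) | FRX_24i_25i
FRX_FC70_E8_9_63_P7 = (compose(FRX_FC70_E8_9_63, P["P7"])
                       | compose(compose(FRX_FC70_E8_9_63, FRT_9i_10), P["P7"]))
FRX7 = (FRX_53_54 | FRX_FC70_E8_9_63_P7
        | compose(FRX_92i_14, P["P74"]) | compose(FRX_92i, P["P7"]))
FRX7_P89 = FRX7 | compose(FRX7, FRT_89)
FR7 = FRX7_P89 | compose(FRT_46i_106i_148i, FRX7_P89)

print(sorted(FR7))
```

On this toy graph the handle ends up "from" both Athens and Greece, through the vase it is part of, the painter's workshop and the place hierarchy. Each sub-FR is computed once and reused, which is the point of the decomposition.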
57. FR IMPLEMENTATION AS OWLIM RULES
OWL2 doesn't have conjunctive properties
So we implemented with OWLIM rules, using the parallel/sequential decompositions above
Details: FR Implementation
Implemented 19 FRs of Thing (see FR Names):
  refers to or is about Place; from Place; is/was located in Place
  has met Actor; by Actor
  refers to or is about Event; has met Event
  is made of Material; is/has Type; used technique; identified by Identifier
Use 44 CRM properties. Took 86 rules, 10 axioms, 26 sub-FRs (gray on next slide)
Refactoring idea: http://vladimiralexiev.github.io/pres/extending-owl2/index.html
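To illustrate how such decompositions turn into rules, here is a minimal Python sketch of forward chaining to a fixpoint. A parallel decomposition (A | B) becomes two one-premise rules, and a sequential one (A/B) becomes one two-premise rule. This is a plain Python stand-in and not OWLIM's rule syntax; the facts are hypothetical, and FRX7 is cut down to a single disjunct to keep the example short.

```python
def materialize(triples, rules):
    """Naive forward chaining: apply all rules until no new triple is derived.

    Each rule is (premises, conclusion), where premises is a tuple of one or
    two property names:
      (p,)      : x p y            =>  x conclusion y
      (p1, p2)  : x p1 y, y p2 z   =>  x conclusion z
    """
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if len(premises) == 1:
                (p,) = premises
                derived = {(s, conclusion, o) for s, q, o in facts if q == p}
            else:
                p1, p2 = premises
                derived = {(s, conclusion, o2)
                           for s, q1, o1 in facts if q1 == p1
                           for s2, q2, o2 in facts if q2 == p2 and s2 == o1}
            if not derived <= facts:
                facts |= derived
                changed = True
    return facts


rules = [
    (("P53",), "FRX_53_54"),             # FRX_53_54 := (P53|P54)
    (("P54",), "FRX_53_54"),
    (("P89",), "FRT_89"),                # FRT_89 := P89+
    (("FRT_89", "FRT_89"), "FRT_89"),
    (("FRX_53_54",), "FRX7"),            # other FRX7 disjuncts omitted here
    (("FRX7",), "FRX7_P89"),             # FRX7_P89 := FRX7 | FRX7/FRT_89
    (("FRX7", "FRT_89"), "FRX7_P89"),
]

# Hypothetical explicit statements.
explicit = {
    ("coin", "P53", "London"),
    ("London", "P89", "England"),
    ("England", "P89", "United Kingdom"),
}

facts = materialize(explicit, rules)
from_place = {o for s, p, o in facts if s == "coin" and p == "FRX7_P89"}
print(sorted(from_place))
```

The inferred triples are stored alongside the explicit ones, so a query only needs to match the final relation instead of re-evaluating the whole property network.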
59. GETTY VOCABULARIES LOD
Well-known and important cultural heritage thesauri:
Art and Architecture Thesaurus (AAT)
Thesaurus of Geographic Names (TGN)
Union List of Artist Names (ULAN)
Cultural Objects Name Authority (CONA)
Ontotext helps Getty publish them as LOD:
http://vocab.getty.edu
AAT published Feb 2014, already sees numerous use cases
TGN published Aug 2014
Continuing with ULAN, CONA; AATA (bibliography), Getty
Museum data
Special session at CIDOC Congress (Dresden, Sep 2014)
60. GETTY EXTERNAL ONTOLOGIES
SKOS, SKOSXL, ISO 25964 for representing thesaurus info;
DC, DCT for common properties;
BIBO, FOAF for sources and contributors;
WGS, Schema for geographic information;
PROV for revision history;
RDF, RDFS, OWL, XSD for system properties;
R2RML for implementing the conversion.
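For readers who want to query data that uses these ontologies, the Python sketch below collects their namespace URIs and builds a SPARQL PREFIX header. The namespace URIs are the published ones; the prefix labels (for example `iso-thes` and `wgs`) and the helper name `sparql_prefixes` are choices made for this example and may differ from the labels used at the Getty endpoint.

```python
# Namespace URIs of the external ontologies listed above.
# The prefix labels are illustrative, not necessarily those used by Getty.
NAMESPACES = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "skosxl": "http://www.w3.org/2008/05/skos-xl#",
    "iso-thes": "http://purl.org/iso25964/skos-thes#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dct": "http://purl.org/dc/terms/",
    "bibo": "http://purl.org/ontology/bibo/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "wgs": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "schema": "http://schema.org/",
    "prov": "http://www.w3.org/ns/prov#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "rr": "http://www.w3.org/ns/r2rml#",
}


def sparql_prefixes(names):
    """Build a SPARQL PREFIX header for the given prefix labels."""
    return "\n".join(f"PREFIX {name}: <{NAMESPACES[name]}>" for name in names)


header = sparql_prefixes(["skos", "skosxl", "dct"])
# A generic SKOS query used only to show the header in context.
query = header + "\nSELECT ?concept ?label { ?concept skos:prefLabel ?label } LIMIT 5"
print(query)
```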
64. USE OF ISO 25964 IN GETTY LOD
Latest standard on thesauri: ISO 25964. Use Thesaurus Array for ordered children
65. CONTRIBUTION TO ISO 25964
Contributed to ISO 25964 ontology: http://purl.org/iso25964/skos-thes
See Linked Open Vocabularies entry
First industrial use of ISO 25964
Defined appropriate combinations of BTG, BTP, BTI relations (first formally defined in ISO).
On Compositionality of ISO 25964 Hierarchical Relations (BTG, BTP, BTI), V.Alexiev, J.Lindenthal, A.Isaac. Networked Knowledge Organization Systems (NKOS 2014) Workshop at DL2014, London, 11-12 Sep 2014
67. TGN CHARTING WITH SPARQL
Number of members of the UN per year. See doc or jsfiddle with it
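The charting step can be sketched as follows, assuming the SPARQL query returns one row per member state with the year it joined. The Python code below turns such rows into a cumulative member count per year, ready for a charting library. The row layout and the small sample are assumptions made for illustration; they are not the actual TGN query results, and departures are ignored.

```python
from collections import Counter
from itertools import accumulate

# Assumed result shape: (member, year joined). Small illustrative sample only.
rows = [
    ("France", 1945),
    ("United Kingdom", 1945),
    ("Bulgaria", 1955),
    ("Italy", 1955),
    ("Japan", 1956),
    ("Germany", 1973),
]

# Count how many members joined in each year.
joined = Counter(year for _, year in rows)
years = sorted(joined)

# Running total of members at the end of each year in which someone joined.
totals = list(accumulate(joined[year] for year in years))
series = list(zip(years, totals))

for year, total in series:
    print(year, total)
```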
68. POSSIBLE FUTURE TOPICS
Deploying thesaurus management system (VocBench) based on SKOS, SKOS-XL and semantic repository
Text analytics and semantic annotation of CH records
Linguistic Linked Data
Manuscripts: semantic integration, semantic search, semantic annotation
Research Infrastructures