1. TEAM 2.5: Publication of linked data from pan-European geospatial datasets
INSPIRE Conference, Kehl, 5th Sept 2017
2. TEAM MEMBERS
• Dr. Raul Palma: coordinator of Semantic
Technologies in the Network Services division at
Poznan Supercomputing and Networking Center
(PSNC), Poland.
• Soumya Brahma: data analyst in the Network
Services division at PSNC. Technical support, use
case analysis and monitoring, data analysis and
integration.
• Dmitrij Kozuch
• Raitis Berzins
• Acknowledgments to Dr. Peter Haase and Dr.
Johannes Trame (Metaphacts)
3. SUPPORTING PROJECTS
• DataBio - Data-Driven Bioeconomy
The main goal of the DataBio project is to show the benefits
of Big Data technologies in the raw material production from
agriculture, forestry and fishery for the bioeconomy industry
to produce food, energy and biomaterials responsibly and
sustainably.
DataBio proposes to deploy a state-of-the-art big data
platform on top of the partners' existing infrastructure and
solutions.
• One of the main tasks of DataBio relates to Big Data Variety
Management, Storage, Linked Data and Queries
H2020 - ICT-15-2016-2017; Big Data PPP: Large Scale Pilot actions in sectors best benefitting from data-driven innovation.
4. INPUT DATA
• Open Land Use (open dataset): a composite dataset intended to create detailed land-use
maps of various regions, based on pan-European datasets such as CORINE
Land Cover and Urban Atlas, enriched with available regional data
(http://sdi4apps.eu/open_land_use/)
• Open Transport Map (open dataset): allows routing and visualization of traffic volumes for
the whole EU. The underlying data come from OpenStreetMap and are accessible in a
schema compatible with INSPIRE Transport Networks (http://opentransportmap.info/)
• Smart Points of Interest (open dataset): an open and seamless SPOI dataset based
on linked data principles, containing over 27 million points of interest important for tourism
from around the world (http://sdi4apps.eu/spoi/)
• Other datasets include: Urban Atlas, CORINE, HILUCS
6. USED SOFTWARE/TOOLS
• D2RQ for exposing relational databases as virtual
RDF graphs
• RDF for the representation of data
• Ontologies providing the underlying vocabulary and
relations
• Virtuoso for storing the semantic datasets
• SPARQL for querying semantic data
• Silk for discovery of links
• HSLayers NG for visualisation of data
• Metaphactory for visualisation of data
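To make the first two items concrete, here is a minimal sketch of the general idea of turning a relational row into RDF triples, serialized as N-Triples. It is not D2RQ itself and not the team's actual mapping: the `http://example.org/poi/` namespace, the column names (`id`, `name`, `lat`, `lon`) and the choice of `rdfs:label` and the W3C WGS84 vocabulary are assumptions made only for illustration.

```python
from urllib.parse import quote

# Hypothetical namespace for generated resources (not from the slides).
BASE = "http://example.org/poi/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
GEO_LAT = "http://www.w3.org/2003/01/geo/wgs84_pos#lat"
GEO_LONG = "http://www.w3.org/2003/01/geo/wgs84_pos#long"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def escape_literal(text):
    """Escape the characters that N-Triples does not allow raw in a literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def row_to_ntriples(row):
    """Map one relational row (a dict) to a list of N-Triples lines.

    Assumed columns: id, name, lat, lon.
    """
    subject = f"<{BASE}{quote(str(row['id']))}>"
    return [
        f'{subject} <{RDFS_LABEL}> "{escape_literal(row["name"])}" .',
        f'{subject} <{GEO_LAT}> "{row["lat"]}"^^<{XSD_DECIMAL}> .',
        f'{subject} <{GEO_LONG}> "{row["lon"]}"^^<{XSD_DECIMAL}> .',
    ]


# Made-up sample row, for illustration only.
sample_row = {"id": 42, "name": 'Hotel "Rhein"', "lat": 48.5735, "lon": 7.8152}
for line in row_to_ntriples(sample_row):
    print(line)
```

A tool such as D2RQ does this declaratively through a mapping file and can serve the result as a virtual graph, so the sketch only shows the shape of the output, not how the tool works internally.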
7. PROJECT IDEA AND RESULTS
• The project idea was to integrate relevant datasets and
publish them as linked data. The following tasks were
performed:
• Large-scale transformation of data into a semantic format (RDF) and
collection of existing RDF datasets
• Loading datasets in Virtuoso
• Linking of datasets
• Query building
• Visualisation of data
• Results summary:
• Creation of ontologies (open)
• Virtuoso instance with over 700 million triples (open)
• SPARQL endpoint (open)
• Three different interfaces for navigating and visualising the data
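As an illustration of the query-building step, the sketch below prepares (but does not send) a SPARQL 1.1 Protocol GET request that counts the triples in one named graph. The endpoint URL `https://example.org/sparql` is a placeholder, since the slides do not give the endpoint address; the graph IRI is the one listed for OLU in the dataset table.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint: the real address is not given in the slides.
ENDPOINT = "https://example.org/sparql"
# Graph IRI listed for the OLU dataset.
OLU_GRAPH = "http://w3id.org/foodie/olu#"


def count_query(graph_iri):
    """Build a SPARQL query counting all triples in one named graph."""
    return (
        "SELECT (COUNT(*) AS ?triples) "
        f"WHERE {{ GRAPH <{graph_iri}> {{ ?s ?p ?o }} }}"
    )


def build_request(endpoint, query):
    """Prepare a SPARQL Protocol GET request asking for JSON results."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, count_query(OLU_GRAPH))
print(request.full_url)
# Sending it would be: urllib.request.urlopen(request), against a real endpoint.
```

Counting a graph of this size can be slow on a public endpoint, so in practice a `LIMIT`-ed exploratory query is often a gentler first step.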
8. NEW DATASETS
Dataset name          | Graph in FOODIE endpoint         | Source                                                     | Triples
OLU**                 | http://w3id.org/foodie/olu#      | Transformed from PostgreSQL                                | 127,925,971
SPOI                  | http://www.sdi4apps.eu/poi.rdf   | Source provided by WRLS, modified and fixed before loading | 381,393,555
NUTS                  | http://nuts.geovocab.org/        | Open Source                                                | 316,238
OTM***                | http://w3id.org/foodie/otm#      | Transformed from PostgreSQL                                | 154,340,611
Hilucs classification | http://w3id.org/foodie/hilucs#   | Transformed from PostgreSQL                                | 397
Urban Atlas*          | http://w3id.org/foodie/atlas#    | Transformed from PostgreSQL                                | 19,606,025
Corine*               | http://w3id.org/foodie/corine#   | Transformed from PostgreSQL                                | 16,777,533
Eurovoc               | http://foodie-cloud.org/eurovoc  | Open Source                                                | 425,667
Emergel               | http://foodie-cloud.org/emergel  | CTIC                                                       | 256,239
* Selected subsets
The generated ontologies are available from https://github.com/FOODIE-cloud/ontology
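As a quick consistency check, the per-dataset triple counts listed on this slide can be added up and compared with the "over 700 million triples" figure given in the results summary. The sketch below uses only the numbers from the table; the dictionary name `TRIPLES` is just an illustrative choice.

```python
# Triple counts per dataset, copied from the table on this slide.
TRIPLES = {
    "OLU": 127_925_971,
    "SPOI": 381_393_555,
    "NUTS": 316_238,
    "OTM": 154_340_611,
    "Hilucs classification": 397,
    "Urban Atlas": 19_606_025,
    "Corine": 16_777_533,
    "Eurovoc": 425_667,
    "Emergel": 256_239,
}

total = sum(TRIPLES.values())
largest = max(TRIPLES, key=TRIPLES.get)

print(f"Total triples: {total:,}")  # Total triples: 701,042,236
print(f"Largest dataset: {largest}")  # Largest dataset: SPOI
```

The total of 701,042,236 agrees with the figure in the results summary, and SPOI alone accounts for a little over half of it.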
15. INSPIRE/GEOSS/COPERNICUS RELEVANCE
• This work demonstrates the potential uses and
benefits of linked data with a geospatial dimension
• The work exploits results from Copernicus and
INSPIRE
• The datasets generated are compliant with INSPIRE
16. REUSE
• The results of this work will be exploited in the DataBio
project, where they will show the pilots the
potential uses of linked data and how it can be
integrated and used with their pilot data
• The approach and results of this work will be
leveraged and extended for the task on Big Data
Variety Management, Storage, Linked Data and
Queries
• The generated datasets could be reused in other
projects dealing with geospatial data and
semantic technologies
17. BUSINESS
• In the future, applications built on top of the linked
datasets could become commercial services for
different stakeholders. For instance:
• Real estate agencies could use the datasets to show
land parcels that are on sale, lie near major
highways and have a school nearby
• Tourist agencies could show hotels that lie near a
point of interest and have a direct connection to airports
or train stations
• Farmers could see the most densely populated land parcels nearby
where they could offer their products
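The use cases above all come down to proximity filters over geolocated resources. A minimal sketch of such a filter, assuming the coordinates have already been retrieved from the linked datasets, is a great-circle (haversine) distance check. The sample hotels, the station and the 1 km threshold are made up for illustration, and a production service would more likely rely on the spatial functions of the triple store.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def near(items, target, max_km):
    """Keep the items lying within max_km of the target point."""
    return [
        item
        for item in items
        if haversine_km(item["lat"], item["lon"], target["lat"], target["lon"])
        <= max_km
    ]


# Made-up sample data, for illustration only.
station = {"name": "Station", "lat": 48.0, "lon": 8.0}
hotels = [
    {"name": "Hotel A", "lat": 48.001, "lon": 8.0},  # roughly 0.1 km away
    {"name": "Hotel B", "lat": 49.0, "lon": 8.0},  # roughly 111 km away
]

print([h["name"] for h in near(hotels, station, max_km=1.0)])  # ['Hotel A']
```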
18. FOLLOW-UP
• We are ready to continue with follow-up actions.
• People from the Institute for Applied Informatics
in Leipzig have expressed interest in the work
• We will transform and link datasets from the pilots of the
FOODIE project to demonstrate how farming data
compliant with the FOODIE data model (which in turn is
compliant with INSPIRE) can also be linked with
these datasets. This work will be presented at the
Linked Data Workshop in Agriculture in Berlin.