The document describes frameworks for curating and visualizing biomass knowledge bases using semantic web and linked data technologies. It outlines the LEAPS framework which exploits semantic web and linked data to enable screening of algal plant sites and sharing of base data. LEAPS defines ontologies for the domain, transforms disparate data formats into linked data using the ontologies, and provides access to the data through a SPARQL endpoint, interactive map interface, and REST API. It also describes the ASPIRE recommendation system and approaches for visualizing taxonomic data from sources like Algaebase.
1. EFITA 2013, 26th June, Torino
From Biomass to Energy through
Semantic Web and Linked data
Frameworks for the curation and visualisation of
biomass knowledge bases
Monika Solanki
Aston Business School
Aston University, Birmingham, UK
Joint work with
Johannes Skarka
Karlsruhe Institute of Technology, ITAS
m.solanki@aston.ac.uk From Biomass to Energy through Semantic Web and Linked data
2. EnAlgae: Energetic Algae (Since June 2011)
Aims to reduce CO2 emissions and dependency on
unsustainable energy sources in North West Europe.
A 4-year strategic initiative of the Interreg IVB NWE programme.
19 partners and 14 observers across 7 EU states.
Coordinated set of activities focusing on sharing best
practice, developing effective stakeholder engagement and
encouraging transnational cooperation.
http://www.enalgae.eu/
3. Algal biomass as biofuels
Extensive research* is being undertaken in the search for, and
production of, naturally viable and sustainable energy
sources.
The idea that algae biomass based biofuels could serve as
an alternative to fossil fuels has been embraced by
councils across the globe.
Major companies, government bodies and dedicated
non-profit organisations* are getting involved.
The domain is a rich source of data/information/knowledge.
*http://www.algalbiomass.org/
*http://www.eaba-association.eu/
*http://www.enalgae.eu/
4. Algal biomass as biofuels: Observations
No systematic analysis of the algae biomass potential for
North-Western Europe.
Most of the knowledge is buried in various formats: images,
spreadsheets, proprietary data sources and grey literature.
Lack of a knowledge level infrastructure that is equipped
with the capabilities to provide semantic grounding to the
datasets for algal biomass.
Low levels of motivation among stakeholders, for datasets
to be interlinked, shared and reused within the biomass
community.
5. Knowledge Frameworks for Algal Biomass
Transformation and curation of biomass knowledge bases
in accordance with Semantic Web and Linked data
standards.
Ontology design patterns for building ontologies for the
biomass domain.
LEAPS: A framework that enables stakeholders in the
algal biomass domain to interactively explore, via linked
data, potential algal sites and sources of their
consumables across regions in North-Western Europe for
generation of bioenergy.
ASPIRE: A content based recommendation engine that
provides recommendations for algal entities as per
stakeholder preference profiles using bespoke proximity
search algorithms.
Visualisations over linked biomass datasets.
6. SW, Linked data and the Algal Supply Chain
7. Ontological requirements
Ontologies needed to represent
Spatiality: location of possible algae cultivation sites,
location of the sources of consumables (CO2, nutrients
and water).
Geometries: area of the cultivation site - extents,
polygons, linear and ring arrays.
Units and Measurements: conventional measurement
units such as kg for quantities and hectares for area, and
bespoke units of measurement, e.g., kg/hectare or
kg/annum.
Territorial units for statistics: core concepts of the NUTS
system.
Domain specific knowledge: algae cultivation sites, CO2
sources, pipelines.
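As a rough illustration of how these requirements might meet in the description of a single site, the sketch below emits N-Triples for a hypothetical cultivation site. The `http://example.org/biomass#` namespace, the class `CultivationSite`, the property names `inNutsRegion`, `areaHectares` and `yieldKgPerHectare`, and all values are placeholders chosen for this example, not the project's published ontologies. Only the GeoSPARQL terms (`hasGeometry`, `asWKT`, `wktLiteral`) and the RDF and XSD IRIs are existing vocabulary.

```python
from dataclasses import dataclass
from typing import List

EX = "http://example.org/biomass#"             # placeholder namespace (assumption)
GEO = "http://www.opengis.net/ont/geosparql#"  # GeoSPARQL vocabulary
XSD = "http://www.w3.org/2001/XMLSchema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass
class CultivationSite:
    site_id: str
    nuts_code: str            # territorial unit (NUTS code)
    polygon_wkt: str          # geometry as well-known text
    area_ha: float            # conventional unit: hectares
    yield_kg_per_year: float  # conventional unit: kg/annum

    def yield_kg_per_ha(self) -> float:
        # Derived, "bespoke" unit: kg per hectare (per year)
        return self.yield_kg_per_year / self.area_ha

    def to_ntriples(self) -> List[str]:
        site = f"<{EX}site/{self.site_id}>"
        geom = f"<{EX}site/{self.site_id}/geometry>"
        return [
            f"{site} <{RDF_TYPE}> <{EX}CultivationSite> .",
            f"{site} <{EX}inNutsRegion> <{EX}nuts/{self.nuts_code}> .",
            f"{site} <{GEO}hasGeometry> {geom} .",
            f'{geom} <{GEO}asWKT> "{self.polygon_wkt}"^^<{GEO}wktLiteral> .',
            f'{site} <{EX}areaHectares> "{self.area_ha}"^^<{XSD}decimal> .',
            f'{site} <{EX}yieldKgPerHectare> "{self.yield_kg_per_ha()}"^^<{XSD}decimal> .',
        ]


if __name__ == "__main__":
    # All values below are invented for illustration.
    demo = CultivationSite("s1", "UKG31", "POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))", 10.0, 25000.0)
    print("\n".join(demo.to_ntriples()))
```

Keeping the geometry as a separate node follows the GeoSPARQL feature/geometry split; whether the project's ontologies model sites this way is not stated in the slides.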
8. Minimum Descriptive Language (MDL)
9. Ontology Design Pattern
Standards Enforcer Pattern (SEP)
Enables the ontological modelling of processes, activities,
operations and services that enforce guideline(s)
recommended by a specific standard and need to explicitly
indicate their conformance to it.
Allows the inclusion of minimalistic information regarding
the conformance, while retaining the flexibility to extend the
ontological primitives as required.
10. Ontology Design Pattern
Standards Enforcer Pattern (SEP)
11. Ontology Design Pattern
REactor Pattern (REP)
Enables the ontological modelling of reactive processes in
a generic way across multiple domains.
Targeted towards modelling reactive processes with a
black box view of the process.
12. Ontology Design Pattern
REactor Pattern (REP)
13. Ontologies for Algal Biomass: Reuse
14. Ontology Development Methodology
15. Ontologies for Algal Biomass: Domain
knowledge
Ontologies available at http://purl.org/biomass/ontologies
16. Lifting XML datasets to Linked data
First step
The first part of the data processing and the potential
calculation are performed in a GIS-based model which was
developed for this purpose using ArcGIS.
Raw datasets of various origins and formats are
transformed, using bespoke computational algorithms, into an
ArcGIS-specific XML format.
17. Lifting XML datasets to Linked data
Second step
The original data sources had several limitations and a
one-to-one transformation was not possible.
A bespoke parser that exploits XPath to selectively query
the XML datasets and generate linked data was
implemented.
It utilises a complex underlying data structure to facilitate
the transformation.
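The idea of selectively querying XML with XPath and emitting linked data can be sketched as follows. This is a minimal illustration, not the project's parser: the XML layout, the `http://example.org/biomass#` namespace, the class and property names, and the function `lift_sites` are all assumptions made for the example. It relies on the limited XPath subset supported by Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET
from typing import List

EX = "http://example.org/biomass#"   # placeholder namespace (assumption)
XSD = "http://www.w3.org/2001/XMLSchema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Invented sample input; the real ArcGIS-specific XML format is not shown in the slides.
SAMPLE_XML = """
<sites>
  <site id="s1" type="algae"><name>Site One</name><area unit="ha">12.5</area></site>
  <site id="s2" type="co2"><name>Plant Two</name></site>
  <site id="s3" type="algae"><name>Site Three</name><area unit="ha">40</area></site>
</sites>
"""


def lift_sites(xml_text: str) -> List[str]:
    """Select algae sites via XPath and return N-Triples lines describing them."""
    root = ET.fromstring(xml_text.strip())
    triples = []
    # The attribute predicate keeps only <site type="algae"> elements.
    for site in root.findall("./site[@type='algae']"):
        subject = f"<{EX}site/{site.get('id')}>"
        triples.append(f"{subject} <{RDF_TYPE}> <{EX}CultivationSite> .")
        name = site.findtext("name")
        if name is not None:
            # Literal escaping is omitted for brevity; real data would need it.
            triples.append(f'{subject} <{RDFS_LABEL}> "{name}" .')
        area = site.find("area[@unit='ha']")
        if area is not None and area.text:
            triples.append(
                f'{subject} <{EX}areaHectares> "{area.text.strip()}"^^<{XSD}decimal> .'
            )
    return triples


if __name__ == "__main__":
    print("\n".join(lift_sites(SAMPLE_XML)))
```

Because only the elements matched by the XPath expressions are lifted, the mapping need not be one-to-one with the source XML, which is the situation the slide describes.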
18. Lifting XML datasets to Linked data
19. LEAPS
Linked Entities for Algal Plant Sites
motivating the use of Semantic Web technologies and LOD
for the algal biomass domain.
laying out a set of ontological requirements for knowledge
representation that support the publication of algal
biomass data.
elaborating on how algal biomass datasets are transformed
to their corresponding RDF model representation.
interlinking the generated RDF datasets along spatial
dimensions with other datasets on the Web of data.
visualising the linked datasets via an end user LOD REST
Web service.
The first (known) application of SW/LD to Algal Biomass datasets
20. System Architecture
21. Architecture: Main components
Parsing modules: lifting the data
from their original formats to RDF.
Ontologies.
Linking engine: producing the linked
data representation of the datasets.
Triple store: OWLIM SE 5.0.
REST Web services.
SPARQL endpoints.
Web Interface.
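To give a feel for how a client might use a SPARQL endpoint such as the one listed above, the sketch below builds (but does not send) a query request following the SPARQL 1.1 Protocol convention of passing the query in a `query` URL parameter. The endpoint URL, the `ex:` namespace, and the class and property names in the query are hypothetical; the slides do not give the actual endpoint address or vocabulary.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://example.org/enalgae/sparql"  # hypothetical endpoint URL

# Hypothetical vocabulary: sites larger than 10 hectares, largest first.
QUERY = """
PREFIX ex: <http://example.org/biomass#>
SELECT ?site ?area WHERE {
  ?site a ex:CultivationSite ;
        ex:areaHectares ?area .
  FILTER(?area > 10)
}
ORDER BY DESC(?area)
LIMIT 20
"""


def build_request(endpoint: str, query: str) -> Request:
    """Build a GET request carrying the query in the 'query' parameter."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    # Ask for the standard JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    req = build_request(ENDPOINT, QUERY)
    print(req.get_method(), req.full_url)
    # Sending it would be: urllib.request.urlopen(req), against a real endpoint.
```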
22. LEAPS Web application
www.semanticwebservices.org/enalgae
23. ASPIRE
A content based recommendation service for algal site
entities.
Recommendations are made based on stakeholder
preference models defined in their ontological profiles as
linked data.
Algal datasets are a combination of continuous and
categorical data entities.
An adaptation of Gower’s similarity measure is used to
compute similarities between the entities to propose
recommendations.
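For orientation, the standard (unadapted) form of Gower's similarity over mixed data can be sketched as below. Numeric features contribute 1 - |x - y| / range, categorical features contribute 1 on a match and 0 otherwise, and the result is the mean over the features present in both records. The slides do not describe how ASPIRE adapts this measure, so the function, the feature names, the ranges and the values here are illustrative assumptions only.

```python
from typing import Dict, Optional, Union

Value = Optional[Union[float, str]]


def gower_similarity(a: Dict[str, Value], b: Dict[str, Value],
                     numeric_ranges: Dict[str, float]) -> float:
    """Standard Gower similarity between two records with mixed feature types.

    numeric_ranges maps each numeric feature to its range (max - min) in the
    dataset; every other feature is treated as categorical.
    """
    scores = []
    for key, x in a.items():
        y = b.get(key)
        if x is None or y is None:
            continue  # features missing in either record are skipped
        if key in numeric_ranges:
            r = numeric_ranges[key]
            scores.append(1.0 if r == 0 else 1.0 - abs(x - y) / r)
        else:
            scores.append(1.0 if x == y else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Invented preference profile and candidate sites.
    ranges = {"area_ha": 100.0}
    profile = {"area_ha": 20.0, "co2_source": "power_plant", "country": "UK"}
    candidates = {
        "site_a": {"area_ha": 30.0, "co2_source": "power_plant", "country": "DE"},
        "site_b": {"area_ha": 20.0, "co2_source": "cement", "country": "UK"},
    }
    ranked = sorted(candidates,
                    key=lambda k: gower_similarity(profile, candidates[k], ranges),
                    reverse=True)
    print(ranked)
```

Ranking candidate entities by their similarity to a stakeholder's preference profile is one plausible way such scores could drive recommendations.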
24. Biological taxonomy visualisation
Algaebase is the largest information source of algae on the
Web.
The Algaebase dataset is not directly available for
download.
The dataset was retrieved using a bespoke information
retrieval algorithm and curated within our triple store as
linked data.
The Semantic Import plugin of Gephi has been exploited to
visualise the biological taxonomy of algae.
http://www.algaebase.org/
https://gephi.org/
25. Biological taxonomy visualisation
26. Summary
The LEAPS framework exploits SW and LD for the algal
biomass community,
enabling the screening of data for promising individual
plant sites and providing base data for more detailed
planning purposes.
proposing a set of domain specific ontologies to be shared
and extended by the community.
defining a linked data publishing architecture that
transforms raw data in disparate formats to a uniform XML
representation.
using a set of well established and domain specific
ontologies as metadata to transform it further into linked
data.
providing various data access options such as a SPARQL
endpoint, an interactive Google map interface and a REST
API for making the data accessible to stakeholders.
27. Many Thanks!!!