This document discusses linking hydrological observation datasets to specific hydrological features using standardized semantics. The authors implement an interface in OpenStreetMap to represent surface water features using standardized ontology terms. They model the interlinking between hydrological features and monitoring points using the VoID and DCAT vocabularies. This provides a flexible way to describe relationships between datasets and enables navigation between features and observations. The proof of concept implementation is inspired by challenges faced by a disaster monitoring organization and uses open data standards to lift datasets to the web of linked open data.
A Conceptual Building-Block and Practical OpenStreetMap-Interface for Sharing References to Hydrologic Features
1. WERNER LEYH, BRAZIL
HUMAN FACTORS, SUSTAINABLE URBAN PLANNING AND INFRASTRUCTURE (SUPI)
SESSION 220 (SUPI/I), FRIDAY, JULY 21, 10:30-12:30,
LOS ANGELES, CA, USA
http://www.ahfe2017.org/program3.html
Related work and publication:
https://www.springerprofessional.de/a-conceptual-building-block-and-practical-openstreetmap-interfac/12357064
2. Overview: The big picture
Summarizing our contribution to SHARING REFERENCES to HYDROLOGIC FEATURES
3. Overview: The big picture in 5 Ws
What: The overall objective is to enable VIRTUAL NAVIGATION between
“COMPATIBLE” OBSERVATIONAL DATASETS on HYDROLOGICAL CONNECTIVITY.
Why: One of the essential elements of life on this planet is freshwater. Sustainable
development with disaster preparedness therefore demands sustainable
management of the world’s limited freshwater resources. However, water
resources cannot be properly managed unless we know where they are, in
what quantity and quality, and how variable they are likely to be in the
foreseeable future.
How: A technical framework for ENVIRONMENTAL DATA AGGREGATION and a
UNIFIED DATA SHARING METHOD is explored for distributed data integration
with a Linked Open Data (LOD)-enabled Spatial Data Infrastructure (SDI) node.
When: 2017
Where: International
4. Overview: Outline and Content
Part 1: Motivation
Citizen Science (CS) in data driven surveys – Costs, problems and approaches
Part 2: Introduction
Context and former work
Part 3: Challenges
A standardized, semantic representation of the conceptual model “Hydrologic Cycle”
Part 4: Challenges
Facilitating the discovery of public datasets (Google)
Part 5: Contribution
Linking standardized hydrologic observations and features.
Part 6: Approach
System Architecture and Implementation
Part 7: Approach
Proof of Concept Implementation
Part 8: Results
Lifting Data to the Web of Data
Part 9: Conclusion
Summary
Annex
Context (Standardized Data Sharing in Hydrology)
5. Motivation: Citizen Science (CS) in data driven
surveys – Costs, problems and approaches
How can we improve
+ Scalability
+ Technology dependency
+ Volunteer management
+ Task complexity
+ Data quality
+ Sustainability:
How do organization and participation influence scientific outcomes in citizen science?
6. Introduction: Context and Former work
INTERESTED IN THE BACKGROUND?
…and in the work published by other authors that we build on in the present contribution?
IN THE ANNEX WE INTRODUCE
➢ The conceptual separation between physical and logical hydrological networks
➢ Hydrological features
PLEASE ALSO CONSIDER OUR MAIN REFERENCES:
Sinha et al. https://link.springer.com/chapter/10.1007/978-3-319-11593-1_13
Varanka et al. www.cartogis.org/docs/proceedings/2016/Varanka_and_Cheatham.pdf
WMO www.gtn-h.info/wp-content/uploads/2015/10/GTNH-7_Report.pdf
WMO/UNESCO https://public.wmo.int/en/resources/library/international-glossary-hydrology-wmo-no-385
Atkinson et al. https://link.springer.com/chapter/10.1007/978-3-319-15994-2_11
W3C WG/11-05-17 https://www.w3.org/TR/sdw-bp/
Bressiani et al. https://ijabe.org/index.php/ijabe/article/view/1765
7. Challenges: A standardized, semantic representation
of the conceptual model “Hydrologic Cycle”
… compatible with the terminology already endorsed by WMO and UNESCO:
Regarding the interoperability of logical models in hydrology, the
World Meteorological Organization (WMO) is undertaking a significant effort
towards a standardized, semantic representation.
However, there is not yet a WMO, UNESCO and/or OGC standard for this,
and the technology is not yet mature enough to be used.
Please compare:
WMO: www.gtn-h.info/wp-content/uploads/2015/10/GTNH-7_Report.pdf
8. Challenges:
Facilitating the discovery of public datasets (Google)
According to the authors, although Google has recently released guidelines on
publishing metadata (last updated March 28, 2017; please compare:
https://research.googleblog.com/2017/01/facilitating-discovery-of-public.html),
➢ “many technical challenges remain before search for data becomes as
seamless as we feel it should be” (Sect. 1.3).
Interestingly, the same report says the following about DCAT:
➢ “The structure is very close to that used in the W3C DCAT specification. We
expect to add a DCAT example in a future revision of these guidelines.”
➢ (The DCAT standard was developed by W3C and the European Union for
describing public sector datasets. Its basic use case is to enable search for
datasets across data portals and to make public sector data easier to find
across borders and sectors.)
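To make the idea of structured dataset metadata concrete, here is a minimal sketch of a schema.org Dataset description serialized as JSON-LD. The dataset name, description, URL and keywords are hypothetical placeholders, and the mapping to DCAT and Dublin Core terms is an illustrative assumption meant only to show how close the two structures are; it is not an official mapping and is not taken from the presentation.

```python
import json


def describe_dataset(name, description, url, keywords):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": list(keywords),
    }


# Illustrative (assumed, not official) correspondence between the schema.org
# keys used above and DCAT / Dublin Core terms.
SCHEMA_TO_DCAT = {
    "name": "dct:title",
    "description": "dct:description",
    "url": "dcat:landingPage",
    "keywords": "dcat:keyword",
}


def as_dcat_properties(dataset):
    """Rename the schema.org keys to their assumed DCAT counterparts."""
    return {SCHEMA_TO_DCAT[k]: v for k, v in dataset.items() if k in SCHEMA_TO_DCAT}


# Hypothetical example values; none of them come from the slides.
dataset = describe_dataset(
    name="River gauge readings (example)",
    description="Hypothetical water level observations at one monitoring point.",
    url="http://example.org/datasets/gauge-1",
    keywords=["hydrology", "water level"],
)

jsonld_text = json.dumps(dataset, indent=2)
print(jsonld_text)
print(as_dcat_properties(dataset))
```

Keeping the description as a plain dictionary means the same record can be embedded in a web page as JSON-LD or renamed into DCAT terms for a catalog.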
9. Contribution: Linking standardized hydrologic
observations and features.
➢ This paper describes how observational datasets can be
linked to (and “contextualized” with) specific hydrological features.
➢ We further show how domain models can be used to standardize links
between different features.
➢ We use “standardized” terms directly as OSM tags describing POIs
inserted by volunteer citizens to represent “Surface Water Features”,
controlled by an abstract ontology.
➢ For this we use standard W3C vocabularies, including DCAT, CSVW,
VoID, PROV and DQV, as well as Schema.org.
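The idea of tags controlled by a vocabulary can be sketched as a simple check of a POI's tags against a list of allowed terms. The `swf:` tag prefix, the keys and the allowed values below are hypothetical placeholders chosen for illustration; they are not the terms used in the presentation or in the underlying ontology.

```python
# Hypothetical controlled vocabulary: tag key -> allowed values.
ALLOWED_TAGS = {
    "swf:type": {"channel", "depression", "water_body"},
    "swf:flow": {"perennial", "intermittent"},
}


def check_tags(tags, prefix="swf:"):
    """Return a list of problems found in the tags that use the controlled prefix."""
    problems = []
    for key, value in tags.items():
        if not key.startswith(prefix):
            continue  # ordinary OSM tags (name, source, ...) are not checked
        allowed = ALLOWED_TAGS.get(key)
        if allowed is None:
            problems.append(f"unknown key: {key}")
        elif value not in allowed:
            problems.append(f"value {value!r} not allowed for {key}")
    return problems


# A made-up POI as a volunteer might enter it.
poi = {
    "lat": -22.0,
    "lon": -47.9,
    "tags": {"name": "Example creek", "swf:type": "channel", "swf:flow": "seasonal"},
}

print(check_tags(poi["tags"]))
# → ["value 'seasonal' not allowed for swf:flow"]
```

Only tags carrying the controlled prefix are validated, so free-form OSM tagging remains possible alongside the standardized terms.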
10. Approach:
System Architecture and Implementation
➢ We use the DCAT vocabulary to describe the available datasets, groups of
datasets and catalogs.
➢ The interlinking is modelled by a linkset (void:Linkset) that describes
relationships between hydrological features and monitoring points,
described using WaterML 2.0.
➢ A linkset in VoID is a subclass of a dataset, used for storing triples that express
the interlinking relationship between datasets.
➢ This modelling offers a flexible and powerful way to describe in detail
the interlinking between two datasets: how many links exist, which kinds
of links (e.g. owl:sameAs or foaf:knows) are present, and who makes
these statements.
➢ This provides the data backbone allowing navigation between specific
features (e.g. rivers) and observations (e.g. height, flows, water quality) (please
compare https://link.springer.com/chapter/10.1007/978-3-319-15994-2_11).
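The linkset idea can be sketched in a few lines of plain Python, without an RDF library. The `example.org` URIs, the dataset names and the `monitors` link predicate are hypothetical placeholders; the VoID terms (`void:Linkset`, `void:subjectsTarget`, `void:objectsTarget`, `void:linkPredicate`, `void:triples`) come from the VoID vocabulary. This is an illustration under these assumptions, not the implementation behind the presentation.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
VOID = "http://rdfs.org/ns/void#"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
EX = "http://example.org/"  # hypothetical namespace

FEATURES = EX + "dataset/features"            # hypothetical feature dataset
STATIONS = EX + "dataset/monitoring-points"   # hypothetical station dataset
MONITORS = EX + "def/monitors"                # hypothetical link predicate

# Link triples: (monitoring point, predicate, hydrological feature).
links = [
    (EX + "station/1", MONITORS, EX + "river/a"),
    (EX + "station/2", MONITORS, EX + "river/a"),
    (EX + "station/3", MONITORS, EX + "river/b"),
]


def describe_linkset(linkset, subjects_dataset, objects_dataset, links):
    """Return VoID triples that summarize a set of link triples."""
    predicates = {predicate for _, predicate, _ in links}
    if len(predicates) != 1:
        raise ValueError("expected exactly one link predicate per linkset")
    return [
        (linkset, RDF_TYPE, VOID + "Linkset"),
        (linkset, VOID + "subjectsTarget", subjects_dataset),
        (linkset, VOID + "objectsTarget", objects_dataset),
        (linkset, VOID + "linkPredicate", predicates.pop()),
        (linkset, VOID + "triples", len(links)),
    ]


def to_ntriples(triples):
    """Serialize triples as N-Triples; integers become xsd:integer literals."""
    lines = []
    for s, p, o in triples:
        obj = f'"{o}"^^<{XSD_INTEGER}>' if isinstance(o, int) else f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


def stations_by_feature(links):
    """Index the links so that one can navigate from a feature to its stations."""
    index = defaultdict(list)
    for station, _, feature in links:
        index[feature].append(station)
    return dict(index)


description = describe_linkset(EX + "linkset/stations-features", STATIONS, FEATURES, links)
print(to_ntriples(description))
print(stations_by_feature(links)[EX + "river/a"])
```

The linkset description answers questions about the links as a whole (how many, of which kind, between which datasets), while the index built from the link triples supports the navigation from a river to its monitoring points.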
11. Approach: Proof of concept
Fig. 1. Number of hospitals around the flooding area: where to get this information?
12. Approach: Proof of Concept Implementation
The proof-of-concept implementation uses AGORA’s LOD-enabled SDI node as a
practical use case and research platform (please compare:
http://www.preventionweb.net/files/45270_200.pdf).
We cooperate closely with the National Centre for Monitoring and Warning
of Natural Disasters, CEMADEN (http://www.cemaden.gov.br/).
Our academic research is therefore inspired by the practical, real-life
challenges that CEMADEN faces in its daily work.
15. Conclusion:
Summary
The effective exchange of hydrologic data that contain references to physical hydrologic
features requires standardized semantics for the concepts used to identify
these features.
To this end, we implemented an interface within the volunteered geographic information
(VGI) platform OpenStreetMap (OSM): “standardized” terms are used directly as
OSM tags describing POIs inserted by volunteer citizens to represent “Surface Water
Features”. These terms are controlled by an abstract ontology of surface water features
that is based only on the physical properties of landscape features.
The interlinking is modelled by a linkset (void:Linkset) that describes relationships
between hydrological features and monitoring points. This modelling offers a flexible
and powerful way to describe in detail the interlinking between two datasets:
how many links exist, which kinds of links (e.g. owl:sameAs or foaf:knows)
are present, and who makes these statements.
This provides the data backbone allowing navigation between specific features (e.g.
rivers) and observations (e.g. height, flows, water quality).
16. Acknowledgments
This research has been supported by the Brazilian Capes Foundation
(Programa de Apoio ao Ensino e à Pesquisa Científica e Tecnológica em Desastres
Naturais, Pró-Alertas).
We also thank Microsoft Research for offering free access to cloud computing resources
based on the Microsoft Azure framework for the present research project (Microsoft
Azure sponsorship for the University of São Paulo until 2016/05/01).
17. Werner Leyh
https://wiki.osgeo.org/wiki/User:WernerLeyh
Grupo de Pesquisa CNPq/USP
INFRAESTRUTURA DE DADOS ESPACIAIS (GEPIDE)
http://dgp.cnpq.br/buscaoperacional/detalhegrupo.jsp?grupo=0067107HRY8KT0
Questions?
Interested in linking Wikidata, OpenStreetMap and
scientific datasets?
Join us!
19. Introduction:
Data Vocabularies
➢ Vocabularies define the “concepts” and “relationships” (also referred to as “terms”
or “attributes”) used to describe and represent an area of interest.
➢ They are used to “classify” the terms that can be used in a “particular application”,
characterize “possible relationships”, and define “possible constraints” on using
those terms.
➢ Several near-synonyms for Vocabulary have been coined, for example, “ontology”,
“controlled Vocabulary”, “thesaurus”, “taxonomy”, “code list”, “semantic
network” (www.gtn-h.info/wp-content/uploads/2015/10/GTNH-7_Report.pdf).
➢ There is no strict division between the artifacts referred to by these names.
“Ontology tends however to denote the Vocabularies of classes and properties” that
structure the descriptions of resources in (linked) datasets.
➢ Ontologies are the “key building blocks” for inference techniques on the
“Semantic Web” (www.gtn-h.info/wp-content/uploads/2015/10/GTNH-7_Report.pdf).
20. Introduction:
Standardized Data Sharing in Hydrology
➢ The WMO Executive Council provides advice and assistance on technical aspects of
implementing the practice on the international exchange of hydrological data
and products.
➢ Hydrologic features are units of hydrologic information required to convey
the identity of real-world water objects through the data processing chain, from
observation to water information. They are identified under the umbrella of the joint
WMO-UNESCO Glossary of Hydrology.
➢ A logical model is the representation of the managed water supply system
components and relations that acts as an interface between the water manager and the
water management ontology.
➢ Any logical model has a correspondence to a physical model.
➢ A physical model is a collection of real elements that match a structure
consisting of a geographical positioning component and other associated information
(Please compare www.gtn-h.info/wp-content/uploads/2015/10/GTNH-7_Report.pdf).
21. Introduction: Linked Open Data (LOD)
➢ Many publishers and funding agencies nowadays require that scientists
make their research data available publicly: Access to this data is critical to
facilitating reproducibility of research results, enabling scientists to build
on others’ work, and providing data journalists easier access to information and its
provenance.
➢ Due to the volume of data repositories available on the Web, it can be
extremely difficult to determine not only where the dataset with the
information you are looking for is located, but also the veracity or provenance
of that information.
➢ Google recently published new guidelines to help data providers
describe their datasets in a structured way, enabling Google and others to
link this structured metadata with information describing locations, scientific
publications, or even Knowledge Graph, facilitating data discovery for others
(please compare: https://research.googleblog.com/2017/01/facilitating-discovery-of-public.html).
22. Overview, context and definitions:
The broader objective
Opening Accessible and Comprehensive Environmental Risk Data - A General
Open Data Strategy.
In 2011, the European Commission published its Open Data Strategy,
defining the following six barriers to “open public data”:
I. lack of information that certain data actually exists and is available,
II. lack of clarity of which public authority holds the data,
III. lack of clarity about the terms of re-use,
IV. data made available in formats that are difficult or expensive to use,
V. complicated licensing procedures or prohibitive fees.