Amit Sheth, 'Spatial Semantics for Better Interoperability and Analysis: Challenges and Experiences in Building Semantically Rich Applications in Web 3.0,' Keynote Talk, 3rd Annual Spatial Ontology Community of Practice Workshop: Development, Implementation and Use of Geo-Spatial Ontologies and Semantics, USGS, Reston, VA, December 03, 2010.
Spatial Semantics for Better Interoperability and Analysis: Challenges and Experiences in Building Semantically Rich Applications in Web 3.0
1. Spatial Semantics for Better Interoperability and Analysis: Challenges and Experiences in Building Semantically Rich Applications in Web 3.0
(Keynote at the 3rd Annual Spatial Ontology Community of Practice Workshop (SOCoP),
USGS Reston, VA, December 03, 2010)
Amit Sheth
LexisNexis Ohio Eminent Scholar
Ohio Center of Excellence in Knowledge-enabled Computing – Kno.e.sis
Wright State University, Dayton, OH
http://knoesis.org
Thanks: Cory Henson, Prateek Jain, and the Kno.e.sis team. Acknowledgment: NSF and other funding sources.
2. Semantics as core enabler, enhancer @ Kno.e.sis
15 faculty
45+ PhD students & post-docs
Excellent Industry collaborations
(MSFT, GOOG, IBM, Yahoo!, HP)
Well funded
Multidisciplinary
Exceptional Graduates
3. Web (and associated computing) evolving
Web of pages
- text, manually created links
- extensive navigation
[Timeline markers: 1997, 2007]
Web of databases
- dynamically generated pages
- web query interfaces
Web of resources
- data, services, mashups
- 4 billion mobile computing devices
Web of people, Sensor Web
- social networks, user-created casual content
- 40 billion sensors
Web as an oracle / assistant / partner
- “ask the Web”: using semantics to leverage
text + data + services
Computing for Human Experience
[Figure labels: Keywords, Patterns, Objects, Situations/Events; Enhanced Experience, Tech assimilated in life; Web 1.0, Web 2.0, Web 3.0]
http://bit.ly/HumanExperience
4. Variety & Growth of Data
• Variety/Heterogeneity
Many intelligent applications involve fusion and integrated analysis of a wide variety of data:
Web pages/documents, databases, sensor data, social/community/collective data (Wikipedia), real-time/mobile/device/IoT data, spatial information, background knowledge (incl. Web of Data/Linked Open Data), models/ontologies…
• Exponential growth for each type of data, e.g., mobile data
2009: 1 Exabyte (EB)
2010, US alone: 40+ EB
Estimate for 2016-17 (worldwide): 1 Zettabyte (ZB), or 1000 Exabytes
(Managing Growth & Profits in the Yottabytes Era, Chetan Sharma Consulting, 2009).
5. A large class of Web 3.0 applications…
• utilize larger amounts of historical and recent/real-time data of various types from multiple sources (much of this data has a spatial property)
• support not only search but also analysis of, or insight from, data; that is, applications are more “intelligent”
• This calls for semantics: spatial, temporal, thematic
components; background knowledge
• This talk: spatial semantics as a key component in
building many Web 3.0 applications
6. A Challenging Example Query
What schools in Ohio should now be closed due
to inclement weather?
Need domain ontologies and rules to describe the types
of inclement weather and their severity.
Integration of technologies needed to answer query
1. Spatial Aggregation
2. Semantic Sensor Web
3. Machine Perception
4. Linked Sensor Data
5. Analysis of Streaming Real-Time Data
8. Spatial Aggregation
• Utilizes partonomy in order to aggregate
spatial regions
• To query over spatial regions at different
levels of granularity
• Data represents “low-level” districts (school
in district)
• Query represents “high-level” state (school
in state)
12. Why is This Issue Relevant?
• Spatial data becoming more significant
day by day.
• Crucial for multitude of applications:
– Social Networks like Twitter, Facebook …
– GPS
– Military
– Location-aware services: Foursquare check-ins
– Weather data…
• Spatial data availability on the Web is
continuously increasing.
Twitter feeds, Facebook posts.
Naïve users also contribute and correct spatial data,
which can lead to discrepancies in data representation.
E.g., GeoNames, OpenStreetMap
13. What We Want
[Figure: Semantic Operators automatically align conceptual mismatches between the User’s Query and the Spatial Information of Interest]
14. What is the Problem?
• Existing approaches only analyze spatial information
and queries at the lexical and syntactic level.
• Mismatches are common between how a query is
expressed and how information of interest is
represented.
• Question: “Find schools in NJ”.
• Answer: Sorry, no answers found!
• Reason: In the data, only counties are directly contained in states.
• Natural language introduces much ambiguity about the
semantic relationships between entities in a query.
• E.g., “Find schools in Greene County.”
15. What Needs to be Done?
• Reduce users’ burden of having to know how
information of interest is represented and
structured, to enable access by a broad population.
• Resolve mismatches between a query and
information of interest due to differences in
granularity to improve recall of relevant
information.
• Resolve ambiguous relationships between entities
based on natural language to reduce the amount of
wrong information retrieved.
16. Existing Mechanism for Querying RDF
• SPARQL
• Regular Expression Based Querying
Approaches
17. Common Query Testing All Approaches
“Find Schools Located in the State of Ohio”
18. In a Perfect Scenario
School --parent feature--> Ohio
19. In a Not So Perfect Scenario
School --parent feature--> County --parent feature--> Ohio
20. Proposed Approach
• Define operators to ease writing of expressive queries by
implicit usage of semantic relations between query terms
and hence remove the burden of expressing named
relations in a query.
• Define transformation rules for operators based on
Winston’s taxonomy of part-whole relations.
• Rule based approach allows applicability in different
domains with appropriate modifications.
• Partonomical Relationship Based Query Rewriting System
(PARQ) implements this approach.
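The rewriting idea can be sketched as follows. This is a minimal illustration under stated assumptions, not PARQ's actual implementation: the function name `rewrite_contained_in`, the `?mid` variable names, the hop limit, and the `http://example.org/region/Ohio` URI are all hypothetical. It expands one implicit "is in" relation into explicit chains of `gn:parentFeature` patterns, one query per chain length, and omits the restriction of results to schools.

```python
# Hypothetical sketch of partonomy-based query rewriting (not the PARQ code).
PREFIX = "PREFIX gn: <http://www.geonames.org/ontology#>"

def rewrite_contained_in(feature_var, region_uri, max_hops=3,
                         relation="gn:parentFeature"):
    """Return one SPARQL query string per containment chain length 1..max_hops."""
    queries = []
    for hops in range(1, max_hops + 1):
        # Chain: ?feature -> ?mid1 -> ... -> <region>
        nodes = ([f"?{feature_var}"]
                 + [f"?mid{i}" for i in range(1, hops)]
                 + [f"<{region_uri}>"])
        patterns = [f"{s} {relation} {o} ." for s, o in zip(nodes, nodes[1:])]
        body = "\n  ".join(patterns)
        queries.append(f"{PREFIX}\nSELECT ?{feature_var} WHERE {{\n  {body}\n}}")
    return queries

# Assumed example region URI; a real dataset would use its own identifiers.
queries = rewrite_contained_in("school", "http://example.org/region/Ohio", max_hops=2)
```

The user writes only "school in Ohio"; the one-hop query covers the perfect scenario and the two-hop query covers the case where a county sits between the school and the state.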
21. Meta Rules for Winston’s Categories
Transitivity
(a φ-part of b) ∧ (b φ-part of c) ⇒ (a φ-part of c)
Example: (Dayton place-part of Ohio) ∧ (Ohio place-part of US) ⇒ (Dayton place-part of US)
Overlap
(a place-part of b) ∧ (a place-part of c) ⇒ (b overlaps c)
Example: (Sri Lanka place-part of Indian Ocean) ∧ (Sri Lanka place-part of Bay of Bengal) ⇒ (Indian Ocean overlaps with Bay of Bengal)
Spatial Inclusion
(a instance of b) ∧ (c is in a) ⇒ (c is in b)
Example: (White House instance of Building) ∧ (Barack is in the White House) ⇒ (Barack is in the building)
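The transitivity and overlap rules on this slide can be sketched in a few lines. This is an illustrative toy, assuming facts are stored as (part, whole) pairs; the function names are hypothetical and the data is limited to the slide's own examples.

```python
from itertools import combinations

def transitive_closure(part_of):
    """Apply (a part of b) and (b part of c) => (a part of c) until nothing new appears."""
    closure = set(part_of)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for b2, c in list(closure):
                if b == b2 and (a, c) not in closure:
                    closure.add((a, c))
                    changed = True
    return closure

def overlaps(part_of):
    """Apply (a part of b) and (a part of c) => (b overlaps c), for distinct b and c."""
    wholes = {}
    for a, b in part_of:
        wholes.setdefault(a, set()).add(b)
    result = set()
    for containers in wholes.values():
        for b, c in combinations(sorted(containers), 2):
            result.add(frozenset((b, c)))  # overlap is symmetric
    return result

facts = {
    ("Dayton", "Ohio"), ("Ohio", "US"),
    ("Sri Lanka", "Indian Ocean"), ("Sri Lanka", "Bay of Bengal"),
}
closed = transitive_closure(facts)
overlapping = overlaps(facts)
```

With these facts, `closed` additionally contains ("Dayton", "US"), and `overlapping` contains the pair Indian Ocean / Bay of Bengal.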
23. Where Do We Stand With All Mechanisms..
Mechanism             Ease of writing   Expressivity   Works in all scenarios   Schema agnostic
SPARQL                X                 √              X                        X
PSPARQL               √                 √              X                        √
Our approach (PARQ)   √                 √              √                        √
24. Evaluation
• Performed on publicly available datasets
(Geonames and British Ordnance Survey Ontology)
• Utilized 120 questions from National
Geographic Bee and 46 questions from trivia
related to British Administrative Geography
• Questions serialized into SPARQL queries by 4
human respondents unfamiliar with the ontology
• Performance of PARQ compared with
PSPARQL and SPARQL
25. Sample Queries
• “In which English county, also known as
"The Jurassic Coast" because of the many
fossils to be found there, will you find the
village of Beer Hackett?”
• “The Gobi Desert is the main physical
feature in the southern half of a country
also known as the homeland of Genghis
Khan. Name this country.”
27. PARQ - vs - PSPARQL
System    Precision   Recall   Execution time/query (seconds)
PARQ      100%        86.7%    0.3976
PSPARQL   6.414%      86.7%    37.59
Comparison for National Geographic Bee over Geonames

System    Precision   Recall   Execution time/query (seconds)
PARQ      100%        89.13%   0.099
PSPARQL   65.079%     89.13%   2.79
Comparison for British Admin. Trivia over Ordnance Survey Dataset
28. Spatial Aggregation Conclusion
• Query engines expect users to know the dataset
structure and pose well-formed queries
• Query engines ignore semantic relations
between query terms
• Need to exploit semantic relations between
concepts when processing queries
• Need systems that rewrite queries behind the scenes,
removing the burden of knowing the
structure of the data
29. Technology 2: Semantic Sensor Web (SSW)
• What is inclement weather?
• What sensors in Ohio are capable of detecting inclement weather?
• What sensors are near schools in Ohio?
• What observations are these sensors generating NOW?
• Are these observations providing evidence for inclement weather?
30. Semantic Sensor Web
Utilizes ontologies to represent and analyze
heterogeneous sensor data
• Sensor-observation ontology
• Spatial ontology
• Temporal ontology
• Domain ontologies (e.g., weather ontology)
Generates abstractions (that matter to human
decision making) over sensor data
• Analysis of data to detect and represent interesting
features (i.e., objects, events, situations)
31. Semantic Sensor Web
Utilizes semantic technologies to bridge the divide
between the “real-world” and the Web (critical to
Cyber-Physical systems)
[Figure labels: Physical Space (“real-world”), Information Space (Web); Environment, Sensor, Sensor Data, Observation, Perception, Event ID/Understanding, Situation Awareness]
32. Sensors are now ubiquitous,
and constantly generating
observations about our world
33. However, these systems are often stovepiped,
with a strong tie between the sensor network and the application
36. 1) How to discover, access and search the data?
Web Services
- OGC Sensor Web Enablement (SWE)
37. 2) How to integrate this data when it comes from many different sources?
Shared knowledge models, or ontologies
- syntactic models – XML (SWE)
- semantic models – OWL/RDF (W3C SSN-XG)
38. The SSN-XG Deliverables
• Ontology for semantically describing sensors
• Illustrate the relationship to OGC Sensor Web
Enablement standards
• Semantic annotation of OGC Sensor Web
Enablement standards
39. 3) How to make streaming numerical sensor data
meaningful to web applications and naïve users?
Symbols are more meaningful than numbers
- analysis and reasoning (understanding through perception)
43. Machine Perception
• Task of extracting meaning from sensor data
• Perception is the act of choosing from alternative
explanations for a set of observations (Intellego Perception)
• Perception is an active, cyclical process of explaining
observations by actively seeking – or focusing on –
additional information (Active Perception)
• Active Perception cycle is driven by prior knowledge
45. Formal Theory of Machine Perception
• Specification
• Implementation
• Evaluation
Ontology of Perception: A Semantic Web Approach to Enhance
Machine Perception (Technical Report, Sept. 2010)
46. Enable Situation Awareness on Web
Must utilize abstractions capable of
representing observations and perceptions
generated by either people or machines.
[Figure labels: observe, perceive; Web, “Real-World”]
47. Observation of Qualities
Both people and machines are capable of
observing qualities, such as redness.
Formally described in a sensor/observation ontology
Observer --observes--> Quality
48. Perception of Entities
Both people and machines are also capable
of perceiving entities, such as apples
Formally described in a perception ontology
Perceiver --perceives--> Entity
49. Background Knowledge
Ability to perceive is afforded through the use of
background knowledge. For example, knowledge that
apples are red helps to infer an apple from an observed
quality of redness.
Formally described in a domain ontology
Quality --inheres in--> Entity
50. Perception Cycle
The ability to perceive efficiently is afforded through the
cyclical exchange of information between observers and
perceivers.
Traditionally called the Perception Cycle (or Active Perception)
Observer --sends percept--> Perceiver
Perceiver --sends focus--> Observer
51. Integrated Perception Cycle
Integrated together, we have an abstract model –
capable of situation awareness – relating observers,
perceivers, and background knowledge.
Observer --observes--> Quality
Perceiver --perceives--> Entity
Quality --inheres in--> Entity
Observer --sends percept--> Perceiver
Perceiver --sends focus--> Observer
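A toy version of the cycle is sketched below. It is an illustration under loud assumptions, not the formal model from the talk: background knowledge is a plain dictionary from entity to the qualities that inhere in it, the entities and qualities are invented for the example, and a quality that is not observed is treated as evidence against any entity that has it. The perceiver picks a focus that still discriminates among candidate entities, the observer returns a percept, and the candidate set shrinks.

```python
def perception_cycle(background, observe):
    """Narrow down candidate entities by asking only for discriminating qualities.

    background: dict mapping entity -> set of qualities that inhere in it (assumed)
    observe:    callable(quality) -> bool, standing in for the observer
    Returns (remaining candidate entities, list of qualities actually observed).
    """
    candidates = set(background)
    asked = []
    while len(candidates) > 1:
        known = sorted(set().union(*(background[e] for e in candidates)))
        # Focus: an unasked quality that some, but not all, candidates have.
        focus = next(
            (q for q in known
             if q not in asked and any(q not in background[e] for e in candidates)),
            None,
        )
        if focus is None:
            break  # nothing left that could tell the candidates apart
        asked.append(focus)
        percept = observe(focus)
        candidates = {e for e in candidates if (focus in background[e]) == percept}
    return candidates, asked

# Hypothetical background knowledge for illustration only.
background = {
    "Blizzard": {"snow", "high wind", "freezing temperature"},
    "Flurry": {"snow", "freezing temperature"},
    "Rain Storm": {"rain", "high wind"},
}
world = {"snow", "high wind", "freezing temperature"}
entities, asked = perception_cycle(background, lambda q: q in world)
```

In this run the perceiver settles on a single entity after two observations, although four distinct qualities appear in the background knowledge; focus is what avoids observing the rest.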
54. Evaluation of Perception Cycle
We demonstrated up to 50% savings in resource requirements by
utilizing background knowledge within the Perception Cycle
56. Technology 4: Linked Sensor Data
• What schools are in Ohio?
• What inclement weather necessitates school closings?
• What sensors in Ohio are capable of detecting inclement weather?
• What sensors are near schools in Ohio?
• What observations are these sensors generating NOW?
57. Linked Sensor Data
• Knowledge/representations from SSW are
accessible on LOD
• LinkedSensorData
• Descriptions of ~20,000 weather stations
• Weather stations linked to features defined in
Geonames.org
• LinkedObservationData
• Description of storm related observations
• ~1.7 billion triples, ~170 million weather
observations
• Updated in real-time with current observations
and abstractions
59. What is Linked Sensor Data
Weather Sensors
Camera Sensors
Satellite Sensors
GPS Sensors
Sensor Dataset
60. Sensors Dataset (LinkedSensorData)*
• RDF descriptions of ~20,000 weather stations in the United
States.
• Observation dataset linked to sensor descriptions.
• Sensors link to nearby locations in
GeoNames (in LOD).
[Figure label: weather station]
*First initiative for exposing sensor data on LOD
61. What is Linked Sensor Data
Sensor
Dataset
Publicly Accessible
Recommended best practice for
exposing, sharing, and connecting
pieces of data, information, and
knowledge on the Web using URIs
and RDF
RDF – language for
representing data on the
Web
GeoNames
Dataset
62. Observations Dataset (LinkedObservationData) – Static Datasets
• RDF descriptions of hurricane and blizzard observations in the
United States.
• The data originated at MesoWest (University of Utah)
• Observation types:
temperature, visibility, precipitation, pressure, wind
speed, humidity, etc.
64. Sensor Discovery Application
Weather Station ID
Weather Station
Coordinates
Weather Station
Phenomena
Current Observations
from MesoWest
MesoWest – project
under the Department of
Meteorology, University
of Utah
GeoNames – geographic
dataset
65. Sensor Discovery on Linked Data Demo
• http://knoesis.org/projects/sensorweb/demos/sensor_discovery_on_lod/sample.htm
67. Analysis of Streaming Real-Time Data
• Conversion from raw data to semantically
annotated data in real-time
• Analysis of data to generate abstractions in
real-time
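The first step, turning a raw reading into semantically annotated data, might look like the sketch below. Everything named here is an assumption made for illustration: the `obs:`, `sensor:` and `ex:` prefixes are a made-up vocabulary rather than the ontology used in the talk, and `STATION1` is a placeholder station identifier. Triples are plain Python tuples so the example runs without an RDF library.

```python
def annotate(reading):
    """Turn one raw reading into subject-predicate-object triples (hypothetical vocabulary)."""
    obs = f"obs:{reading['station']}_{reading['time']}"
    return [
        (obs, "rdf:type", "ex:Observation"),
        (obs, "ex:observedBy", f"sensor:{reading['station']}"),
        (obs, "ex:observedProperty", f"ex:{reading['property']}"),
        (obs, "ex:hasValue", str(reading["value"])),
        (obs, "ex:hasUnit", reading["unit"]),
        (obs, "ex:samplingTime", reading["time"]),
    ]

def annotate_stream(readings):
    """Annotate readings one at a time, as they arrive, without buffering the stream."""
    for reading in readings:
        yield from annotate(reading)

# Placeholder reading; station id, value and unit are invented for the example.
raw = [{
    "station": "STATION1",
    "time": "2010-12-03T08:00:00Z",
    "property": "temperature",
    "value": -2.5,
    "unit": "Cel",
}]
triples = list(annotate_stream(raw))
```

Because `annotate_stream` is a generator, each reading is annotated as soon as it is pulled from the source, which is the property a real-time pipeline needs.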
68. Real Time Streaming Sensor Data
Semantic Analysis using
Ontology for Event Detection
Storing Abstractions (Events)
obtained after reasoning on
the LOD
75. The Query
What schools in Ohio should now be
closed due to inclement weather?
– needs to be divided into sub-queries
that can be answered using the
technologies previously described
76. What Schools Are in Ohio?
• Need partonomical spatial relations
• What counties are contained in Ohio?
• What districts are contained in a county?
• What schools are contained in a district?
Uses: spatial aggregation and LOD
• Geonames.org contains these partonomical
spatial relations
• Spatial aggregation executes the partonomical
inference to convert the general query into
sub-queries that can be answered
77. What is Inclement Weather?
• Need a domain ontology that describes the
characteristics of inclement weather
• Example
Icy Roads => freezing temperature &
precipitation (rain or snow)
• Uses: SSW
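The icy-roads example can be written as a simple rule over two observed values. This is only a sketch: the function name, the 0 °C threshold for "freezing", and the "any precipitation above zero" test are assumptions chosen for illustration, not thresholds taken from the talk or from a weather ontology.

```python
FREEZING_POINT_C = 0.0  # assumed threshold for "freezing temperature"

def icy_roads(temperature_c, precipitation_mm_per_h):
    """Icy Roads => freezing temperature AND precipitation (rain or snow)."""
    freezing = temperature_c <= FREEZING_POINT_C
    precipitating = precipitation_mm_per_h > 0.0
    return freezing and precipitating

cold_and_wet = icy_roads(-2.0, 1.5)
warm_and_wet = icy_roads(3.0, 1.5)
cold_and_dry = icy_roads(-2.0, 0.0)
```

In an ontology-based system the same condition would be stated declaratively so that a reasoner, rather than hand-written code, derives the abstraction.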
78. What Inclement Weather Necessitates
School Closings?
• Need school policy information on rules for
closing (e.g., for icy road conditions)
• Data.gov on LOD contains a large amount of
such policy information
• Uses: LOD
79. What Sensors in Ohio Are Capable of
Detecting Inclement Weather?
• Need ontological descriptions of sensors and
weather in order to match sensor capabilities
to weather characteristics
• Temperature sensor → freezing temperature
• Rain gauge sensor → precipitation
• LinkedSensorData has descriptions of ~20,000
weather stations on LOD
• Uses: SSW and LOD
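Matching sensor capabilities to weather characteristics can be sketched as a set-containment check. The two lookup tables and the station inventory below are hypothetical stand-ins for what the sensor and weather ontologies would provide; only the temperature and rain gauge pairings come from the slide.

```python
# Assumed mapping from sensor type to the property it measures.
SENSOR_MEASURES = {
    "temperature sensor": "temperature",
    "rain gauge": "precipitation",
    "anemometer": "wind speed",
}

# Assumed mapping from a weather condition to the properties needed to detect it.
WEATHER_NEEDS = {
    "icy roads": {"temperature", "precipitation"},
}

def capable_stations(stations, weather):
    """Return ids of stations whose sensors cover every property the weather needs."""
    needed = WEATHER_NEEDS[weather]
    return sorted(
        station_id
        for station_id, sensors in stations.items()
        if needed <= {SENSOR_MEASURES[s] for s in sensors}
    )

# Toy station inventory, invented for the example.
stations = {
    "A": ["temperature sensor", "rain gauge"],
    "B": ["temperature sensor"],
    "C": ["rain gauge", "anemometer", "temperature sensor"],
}
capable = capable_stations(stations, "icy roads")
```

Station B is excluded because it cannot observe precipitation, which the icy-roads condition requires.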
80. Sensors Near Schools in Ohio?
• Spatial analysis: match school locations (in
Ohio) to sensor locations that are nearby
• Sensor descriptions in LinkedSensorData
contain links to nearby features (such as
schools)
• Uses: SSW and LOD
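The spatial matching step can be approximated with a great-circle distance check. This sketch assumes each school and sensor is a (latitude, longitude) pair in degrees and that "nearby" means within a chosen radius; the coordinates, identifiers, and 10 km radius are illustrative, not values from LinkedSensorData.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def sensors_near(school, sensors, radius_km):
    """Return ids of sensors within radius_km of the school location."""
    lat, lon = school
    return sorted(
        sensor_id
        for sensor_id, (slat, slon) in sensors.items()
        if haversine_km(lat, lon, slat, slon) <= radius_km
    )

# Toy coordinates in Ohio, invented for the example.
school = (39.76, -84.19)
sensors = {"near": (39.80, -84.20), "far": (39.96, -83.00)}
nearby = sensors_near(school, sensors, radius_km=10.0)
```

Precomputing such links, as the sensor descriptions do, avoids repeating the distance test at query time.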
81. What Observations are These Sensors
Generating NOW?
• Need to semantically annotate raw streaming
observations in real-time
• Need to make these current/real-time
annotations accessible by placing them on
LOD (i.e., LinkedObservationData)
• Uses: SSW, LOD, Streaming Data
82. Are These Observations Providing
Evidence for Inclement Weather?
• Analysis of observation data using background
knowledge
• Generation of abstractions that are easier to
understand
• Uses: SSW, Perception
83. References
Spatial Aggregation References (http://knoesis.org/research/semweb/projects/stt/)
• Prateek Jain, Peter Z. Yeh, Kunal Verma, Cory Henson and Amit Sheth, SPARQL Query Re-writing for Spatial Datasets Using Partonomy
Based Transformation Rules, 3rd Intl. Conference on Geospatial Semantics (GeoS 2009), Mexico City, Mexico, December 3-4, 2009.
• Alkhateeb, F., Baget, J.-F., Euzenat, J.: Extending SPARQL with regular expression patterns (for querying RDF). Web Semantics 7, 2009.
Semantic Sensor Web References (http://wiki.knoesis.org/index.php/SSW)
• Cory Henson, Josh Pschorr, Amit Sheth, Krishnaprasad Thirunarayan, SemSOS: Semantic Sensor Observation Service, in Proceedings of
the 2009 International Symposium on Collaborative Technologies and Systems (CTS 2009), Baltimore, MD, May 18-22, 2009.
• Cory Henson, Holger Neuhaus, Amit Sheth, Krishnaprasad Thirunarayan, Rajkumar Buyya, An Ontological Representation of Time
Series Observations on the Semantic Sensor Web, in Proceedings of 1st International Workshop on the Semantic Sensor Web 2009.
• Michael Compton, Cory Henson, Laurent Lefort, Holger Neuhaus, A Survey of the Semantic Specification of Sensors, 2nd International
Workshop on Semantic Sensor Networks, 25-29 October 2009, Washington DC.
Machine Active Perception References
• Cory Henson, Krishnaprasad Thirunarayan, Pramod Anantharam, Amit Sheth, Making Sense of Sensor Data through a Semantics Driven
Perception Cycle, Kno.e.sis Center Technical Report, 2010.
• Krishnaprasad Thirunarayan, Cory Henson, Amit Sheth, Situation Awareness via Abductive Reasoning for Semantic Sensor Data: A
Preliminary Report, In: Proceedings of 2009 International Symposium on Collaborative Technologies and Systems (CTS 2009), pp. 111-118, May 18-22, 2009.
Linked Sensor Data References (http://wiki.knoesis.org/index.php/LinkedSensorData)
• Harshal Patni, Cory Henson, Amit Sheth, Linked Sensor Data, In: Proceedings of 2010 International Symposium on Collaborative
Technologies and Systems (CTS 2010), Chicago, IL, May 17-21, 2010.
• Harshal Patni, Satya S. Sahoo, Cory Henson and Amit Sheth, Provenance Aware Linked Sensor Data, 2nd Workshop on Trust and
Privacy on the Social and Semantic Web, Co-located with ESWC, Heraklion Greece, 30th May - 03 June 2010
• Joshua Pschorr, Cory Henson, Harshal Patni, Amit P. Sheth, Sensor Discovery on Linked Data, Kno.e.sis Center Technical Report, 2010
Knoesis center recently declared a center of excellence by Ohio governor: http://bit.ly/coe-k
A mechanism to improve access to spatial information by allowing users
Making sense of data from many sources in many forms is challenging
Observer - agent that executes an observation process.
Observation Process – process of detecting qualities from stimuli, consisting of the following steps: (1) choose an observable quality, (2) find stimuli that are causally linked to the observable quality, (3) detect the stimuli, and (4) generate observation values. Observation values (or percepts) are symbols that represent the observed qualities in the physical world.
Observation – action by an observer of executing the observation process.
Perceiver - agent that executes a perception process.
Perception Process – process of detecting entities from observed qualities (abductive inference). Note that the perception process actually infers concepts (symbols) that represent entities in the physical world.
Perception – action by a perceiver of executing the perception process.
Quality – property in the physical world that may be observed (i.e., accessible to the senses through stimuli); qualities inhere in entities.
Entity – object, event, or situation in the physical world (not directly accessible to the senses and must be inferred* from observations of qualities and background knowledge). *Actually, concepts (symbols) are inferred that represent entities in the physical world.
Background knowledge – set of relations between qualities and entities known to a perceiver.
Perception Cycle - a process that relates observers and perceivers; the observer communicates percepts to the perceiver, representing qualities that have been observed, and the perceiver communicates focus to the observer, representing qualities that should be observed.
Percept – a symbol that represents an observed quality (also sometimes called observation value).
Focus – guides the observer towards only those qualities necessary for effective perception (in the form of a quality type).
Evaluation – in our experiments, we have shown that focus can reduce the number of observations necessary for perception by up to 50%.
Integration of the three ontologies discussed above: sensor/observation ontology relating observers (or sensors) to observable qualities, perception ontology relating perceivers to perceivable (inferable) entities, and domain ontology relating qualities and entities in the physical world.
The top figure shows the results of executing the perception cycle for the 516 weather stations within a radius of 400 miles of the blizzard in Utah, and for each ordering of quality types. The horizontal axis represents the different orderings of observable qualities; p represents precipitation, w represents wind speed, and t represents temperature. The bottom figure shows the percentages of percepts generated during the perception cycle for different sets of observers (at different distances from the known blizzard) and for different orderings of quality types.
Get all sensors using well-known location names – problem to be solved
LOD Cloud is a way of sharing, exposing, and connecting pieces of data, information and knowledge on the Semantic Web using URIs and RDF. It comprises geographic and biological datasets, among others.
Linked Data explodes