Towards an Empirical Semantic Web Science: Knowledge Pattern Extraction and Usage
1. Towards an Empirical Semantic
Web Science: Knowledge Pattern
Extraction and Usage
Andrea Nuzzolese
Ph.D. Student
Università di Bologna
STLab, ISTC-CNR
2. Outline
• Empirical Semantic Web Science and Knowledge Patterns (KPs)
• A possible methodology for making KPs emerge from the Web of
Data
• The work done so far in KP extraction
• Evaluating KPs' efficacy through Exploratory Search
3. Does a Web science exist?
• A science is usually applied to clearly defined research objects
✦ The physical and biological sciences analyze the natural world and try to find
microscopic laws that, extrapolated to the macroscopic realm, would
generate the observed behavior
• The Web is an engineered space created through formally
specified languages and protocols
• Web pages with their content and links are created by humans
with a particular task governed by social conventions and laws
• A Web science exists [Berners-Lee et al., 2006] and is oriented
to:
✦ Growth of the engineered space;
✦ Human-web interaction patterns
4. What about a Web of Data science?
• Linked Data offers a huge amount of data for empirical research
5. What are the research objects of the empirical
SW science?
• The Semantic Web and Linked Data give us the chance to
study empirically which patterns are used in organizing and
representing knowledge
• The research objects of the Semantic Web as an empirical science
are Knowledge Patterns (KPs)
6. Knowledge Patterns
• KPs are small, well-connected units of meaning that are
✦ task-based
✦ well-grounded
✦ cognitively sound
• KPs find their theoretical grounding in frames
✦ “… a frame is a data-structure for representing a stereotyped
situation.” [Minsky 1975]
✦ “...the availability of global patterns of knowledge cuts down on non-determinacy
enough to offset idiosyncratic bottom-up input that might otherwise be
confusing.” [Beaugrande 1980]
8. Empirical Semantic Web and KPs
• KPs emerge from the knowledge soup deriving from the Web
• A methodology for KP extraction from the Web
9. KP extraction
• The Web is populated by heterogeneous sources
• We can classify sources in two categories
✦ Formal and semi-formal sources modeled by adopting a top-down approach
✴ e.g., foundational ontologies, frames, thesauri, etc.
✦ Non-formal sources modeled by adopting a bottom-up approach
✴ e.g., RDBs, Linked Data, Web pages, XML documents, etc.
• Our KP extraction methodology is based on two complementary
approaches
✦ A top-down approach
✦ A bottom-up approach
11. KP detection and discovery
• The top-down approach aims to extract KPs that already
exist in a formal or semi-formal structure
✦ Possible techniques: reengineering, refactoring based on association rules,
key concept identification, ontology mapping, etc.
• The bottom-up approach aims to discover or detect
KPs from data
✦ Possible techniques: inductive techniques, machine learning, data mining,
ontology mining, etc.
12. KP validation
• The top-down and the bottom-up approaches concur in the
validation of KPs
• KP extraction is a matter of understanding how the world or
specific domains have been described from different perspectives
✦ The perspective of domain experts, ontologists, etc., who try to
formalize either the world or specific domains
✦ The perspective of users, data-entry staff, etc., who actually populate and
manage the data that report facts about the world
• For example, it would be cognitively relevant if an occurrence of
a KP emerged from both the top-down and the bottom-up
approach
14. KP reengineering from FrameNet’s frames
• FrameNet is a cognitively sound lexical knowledge base, which is
grounded in a large corpus
• FrameNet consists of a set of frames, which have frame elements,
lexical units (which pair words, i.e. lexemes, to frames), and relations
to corpus elements
✦ Each frame can be interpreted as a class of situations
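A minimal sketch of how a frame could be held as a data structure before reengineering it into a KP. The `Frame` class and its `evokes` method are hypothetical names introduced only for illustration, and the `Commerce_buy` entry is an abridged example, not the full FrameNet record.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Hypothetical container for one frame, read as a class of situations."""
    name: str
    frame_elements: list[str] = field(default_factory=list)  # roles in the situation
    lexical_units: list[str] = field(default_factory=list)   # lexemes paired to the frame

    def evokes(self, lexical_unit: str) -> bool:
        """Return True if the given lexical unit is paired to this frame."""
        return lexical_unit in self.lexical_units


# Abridged, illustrative entry (a real frame has more elements and units).
commerce_buy = Frame(
    name="Commerce_buy",
    frame_elements=["Buyer", "Goods", "Seller", "Money"],
    lexical_units=["buy.v", "purchase.v"],
)

print(commerce_buy.evokes("buy.v"))
```

Keeping frame elements and lexical units as separate fields mirrors the description above: the elements describe the situation, while the lexical units link words to it.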
19. KP discovery from Wikipedia links
• Hypothesis
✦ the types of linked resources that occur most often for a certain type of
resource constitute its KP
✦ since we expect that any cognitive invariance in explaining/describing things
is reflected in the wikilink graph, discovered KPs are cognitively sound
• Contribution
✦ an Encyclopedic Knowledge Pattern (EKP) discovery procedure
✦ 184 EKPs published in OWL2
21. Path popularity
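[Figure: example wikilink graph over music-related Wikipedia resources (Jackson_5, Michael_Jackson, Jackie_Jackson, Nirvana, Dave_Grohl, Foo Fighters, Beatles, John_Lennon, Paul_McCartney, Madonna, Prince, Charlie_Parker, Keith_Jarrett)]
pathPopularity(P_{i,j}, S_i) = nSubjectRes(P_{i,j}) / nRes(S_i)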
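A rough sketch of the path popularity ratio on toy data. It assumes that nSubjectRes counts the distinct resources of the subject type that link to at least one resource of the object type, and that nRes counts all resources of the subject type; the slide does not spell these out, so both readings are assumptions. The function name `path_popularity` and the type labels ("MusicalArtist", "Band") are hypothetical.

```python
from collections import defaultdict


def path_popularity(types, links):
    """Compute popularity for every (subject type, object type) path.

    types: dict mapping resource -> type (toy labels, assumed)
    links: iterable of (subject resource, object resource) wikilinks
    """
    # nRes: how many resources carry each type
    n_res = defaultdict(int)
    for resource_type in types.values():
        n_res[resource_type] += 1

    # nSubjectRes: distinct subject resources taking part in each path
    subjects = defaultdict(set)
    for subject, obj in links:
        if subject in types and obj in types:
            subjects[(types[subject], types[obj])].add(subject)

    return {path: len(subs) / n_res[path[0]] for path, subs in subjects.items()}


types = {
    "Michael_Jackson": "MusicalArtist",
    "Dave_Grohl": "MusicalArtist",
    "Madonna": "MusicalArtist",
    "Prince": "MusicalArtist",
    "Jackson_5": "Band",
    "Nirvana": "Band",
    "Foo_Fighters": "Band",
}
links = [
    ("Michael_Jackson", "Jackson_5"),
    ("Dave_Grohl", "Nirvana"),
    ("Dave_Grohl", "Foo_Fighters"),
    ("Madonna", "Prince"),
]

popularity = path_popularity(types, links)
print(popularity[("MusicalArtist", "Band")])  # 2 of 4 artists link to a band: 0.5
```

Counting subjects as a set means that Dave_Grohl, who links to two bands, contributes only once to the artist-to-band path.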
22. Boundaries of KPs
• A KP(S_i) is a set of paths, such that
P_{i,j} ∈ KP(S_i) ⟺ pathPopularity(P_{i,j}, S_i) ≥ t
• t is a threshold, below which a path is not included in a KP
• How to get a good value for t?
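The membership condition can be sketched as a simple filter. The popularity values and the threshold `t=0.3` below are made-up illustrations, not results from the work, and `kp_boundary` is a hypothetical helper name.

```python
def kp_boundary(popularity, subject_type, t):
    """Return the paths of one subject type whose popularity is at least t."""
    return {
        path
        for path, value in popularity.items()
        if path[0] == subject_type and value >= t
    }


# Toy popularity values keyed by (subject type, object type); assumed data.
popularity = {
    ("MusicalArtist", "Band"): 0.5,
    ("MusicalArtist", "MusicalArtist"): 0.25,
    ("Band", "MusicalArtist"): 0.9,
}

kp = kp_boundary(popularity, "MusicalArtist", t=0.3)
print(kp)
```

With these toy numbers only the artist-to-band path survives; lowering `t` to 0.25 would also admit the artist-to-artist path, which is why the choice of t matters.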
23. Boundary induction
Step  Description
1     For each path, calculate the path popularity
2     For each subject type, get the 40 top-ranked path popularity values*
3     Apply multiple correlation (Pearson ρ) between the paths of all subject types by rank, and check for homogeneity of ranks across subject types
4     For each of the 40 path popularity ranks, calculate its mean across all subject types
5     Apply k-means clustering on the 40 ranks
6     Decide threshold(s) based on k-means as well as other indicators (e.g. FrameNet roles distribution)
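Steps 2, 4 and 5 can be sketched on toy data as follows. This is only an illustration under several assumptions: it uses 5 ranks instead of 40, a hand-written one-dimensional k-means with evenly spaced starting centroids, k = 2, and it reads "the lowest mean value in the top cluster" as a candidate threshold. The correlation check of step 3 and the other indicators of step 6 are left out, and all function names and numbers are hypothetical.

```python
from statistics import mean


def top_ranked(popularities, n=40):
    """Step 2: keep the n highest path popularity values, in rank order."""
    return sorted(popularities, reverse=True)[:n]


def mean_by_rank(ranked_lists):
    """Step 4: average the value found at each rank across subject types."""
    return [mean(values) for values in zip(*ranked_lists)]


def kmeans_1d(values, k, iterations=100):
    """Step 5: plain 1-D k-means (k >= 2) with evenly spaced initial centroids."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Empty clusters keep their previous centroid
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return clusters


# Toy path popularity values for three subject types (assumed data).
popularity_by_type = {
    "A": [0.3, 0.9, 0.05, 0.8, 0.1],
    "B": [0.2, 0.7, 0.6, 0.05, 0.1],
    "C": [0.8, 0.1, 0.4, 0.7, 0.05],
}

ranked = [top_ranked(values, n=5) for values in popularity_by_type.values()]
means = mean_by_rank(ranked)
clusters = kmeans_1d(means, k=2)

# The last cluster grew from the highest starting centroid; its lowest
# member is one possible candidate for the threshold t.
candidate_t = min(clusters[-1])
print(clusters, candidate_t)
```

On this toy input the two highest ranks separate clearly from the rest, so the candidate threshold falls at the mean popularity of the second rank; on real data the final choice would also weigh the other indicators mentioned in step 6.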
25. How can KPs be evaluated and used?
• The evaluation of KPs should be performed in terms of their
capability to be cognitively sound in capturing and representing
knowledge
• A scenario that can be used for evaluating the efficacy of KPs
is exploratory search combined with user studies.
26. Why exploratory search?
• Exploratory search is characterized “by uncertainty about the space
being searched and the nature of the problem that motivates the
search” [White et al., 2005]
• KPs can be used for supporting exploratory search
✦ They can be used in order to filter knowledge by drawing a meaningful
boundary around the retrieved data
✦ They make it possible to suggest exploratory paths based on cognitive criteria of
relevance
• We can investigate how KPs help users in exploratory search
tasks
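The filtering idea can be sketched as follows, assuming a KP is reduced to the set of object types it admits for a given subject type. The function `apply_lens`, the type labels and the sample links are hypothetical and only illustrate the boundary; they do not describe how any particular tool implements it.

```python
def apply_lens(kp_object_types, linked_resources):
    """Split linked resources into those inside and outside the KP boundary.

    kp_object_types: set of types admitted by the KP (assumed representation)
    linked_resources: iterable of (resource, type) pairs retrieved for an entity
    """
    inside, outside = [], []
    for resource, resource_type in linked_resources:
        if resource_type in kp_object_types:
            inside.append(resource)
        else:
            outside.append(resource)
    return inside, outside


# Toy data: resources linked from one entity, with assumed type labels.
kp_types = {"Band", "MusicalArtist"}
retrieved = [
    ("Jackson_5", "Band"),
    ("Madonna", "MusicalArtist"),
    ("Pepsi", "Company"),
]

inside, outside = apply_lens(kp_types, retrieved)
print(inside, outside)
```

Resources inside the boundary are the ones a KP-based interface would foreground; the remainder is kept rather than discarded, so that it can still be offered as a secondary exploratory path.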
27. Aemoo: KP-based exploratory search
• A Web application that supports exploratory search on the Web
based on KPs extracted from Wikipedia links
• It aggregates knowledge from Linked Data, Wikipedia, Twitter and
Google News by applying KPs as knowledge lenses over data
• It provides an effective summary of knowledge about an entity,
including explanations
30. Conclusions
• We want to contribute to the realization of the Semantic Web as
an empirical science by providing a methodology for KP
extraction
• Our methodology for extracting KPs is based on two approaches
✦ a top-down approach
✦ a bottom-up approach
• We have presented our experience in KP extraction so far
✦ KPs from FrameNet’s frames
✦ KPs from Wikipedia links
• The evaluation we have in mind should be performed by means of
exploratory search tasks
✦ Aemoo