Studying, understanding and exploiting the content of a digital library, and extracting useful information from it, require automatic techniques that can effectively support users. Concept taxonomies can play a relevant role to this end. Unfortunately, such resources are scarce, and building and maintaining them manually is costly and error-prone. This work presents ConNeKTion, a tool for conceptual graph learning and exploitation. It learns conceptual graphs from plain text and enriches them by finding concept generalizations. The resulting graph can be used for several purposes: finding relationships between concepts (if any), filtering the concepts from a particular perspective, keyword extraction and information retrieval. A suitable control panel lets the user carry out these activities comfortably.
ConNeKTion: A Tool for Exploiting Conceptual Graphs Automatically Learned from Text
1. Università degli studi di Bari “Aldo Moro”
Dipartimento di Informatica
F. Leuzzi, S. Ferilli, F. Rotella
L.A.C.A.M. {fabio.leuzzi, stefano.ferilli, fulvio.rotella}@uniba.it
http://lacam.di.uniba.it
9th Italian Research Conference on Digital Libraries
Università la Sapienza - Rome, Italy
January 31 - February 1, 2013
2. Overview
● Introduction & Objectives
● Tool overview
● Knowledge Representation Formalism
● Relevant concepts
● Information Retrieval
● Reasoning by Association
● Exploiting Tool
● Conclusions & Future Works
3. Introduction
Some repositories leave the responsibility of quality to the authors.
+
Anybody can produce and distribute documents.
=
Possible low average quality of the repository contents.
Goal: support the study, understanding and exploitation of the content of a digital library,
so that the semantic content of huge amounts of text can be explored easily.
4. Introduction
Possible solution:
● Natural Language Processing systems
● Provide the grammatical structures contained in text
● Knowledge Representation formalisms
● Semantic networks
● Graph learning techniques
● To obtain a semantic network starting from the text
● In order to satisfy the information needs, the knowledge base
can be exploited:
● To make summarizations
● To reason with it
● ...
5. Objectives
Improving the fruition of a digital library (DL)
● Use of a tool providing advanced functionalities
● Mixed strategy for relevant concept recognition
● Semantic approach to information retrieval
● Automatic inference over the acquired knowledge
● Reasoning by association
6. Tool overview
7. Knowledge representation
formalism
Only subject, verb and complement have been considered.
● Subjects and complements → concepts
● Verbs → relations between them
[Figure: a (subject, verb, complement) triple becomes an arc subject → complement labeled with the verb]
The frequency of arcs between the concepts in positive and negative
sentences has been taken into account.
● Enrich the representation formalism
● Give robustness to our solution through a statistical approach
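The representation above can be sketched as a dictionary of arcs with frequency counts. This is a minimal illustration, not the tool's actual data structure: the function name `build_graph`, the input tuple layout and the sample triples are all assumptions made for the example.

```python
from collections import defaultdict

def build_graph(triples):
    """Map each (subject, verb, complement) arc to [positive_count, negative_count].

    `triples` is an iterable of (subject, verb, complement, is_positive) tuples;
    this input layout is an assumption made for the sketch.
    """
    arcs = defaultdict(lambda: [0, 0])
    for subject, verb, complement, is_positive in triples:
        # Index 0 counts positive sentences, index 1 counts negative ones.
        arcs[(subject, verb, complement)][0 if is_positive else 1] += 1
    return dict(arcs)

# Hypothetical triples, as a parser might extract them from sentences.
triples = [
    ("dog", "chase", "cat", True),
    ("dog", "chase", "cat", True),
    ("dog", "chase", "cat", False),  # e.g. "the dog does not chase the cat"
    ("cat", "eat", "fish", True),
]
graph = build_graph(triples)
print(graph[("dog", "chase", "cat")])
```

Subjects and complements act as nodes, the verb labels the arc, and the two counters carry the statistical information mentioned above.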
8. Relevant Concepts
● Relevant nodes are sought in the graph
● Mixed strategy
● Semantic network structure
● EM clustering provided by Weka
● Keyword Extraction
● Quantitative approach based on co-occurrences
● Qualitative approach exploiting WordNet
● Psychological approach based on principles of an effective
presentation
● Components empirically weighted
9. Information Retrieval
● Word Sense Disambiguation
● One Domain per Discourse assumption: many uses of a word in a
coherent portion of text tend to share the same domain
● Prevalent domain identification
● Extraction of all synsets for each term
● Extraction of all domains for each synset
● Choice of prevalent domain synset
● Pairwise Complete Link Agglomerative Clustering
● Each synset generates a singleton cluster
● For each pair of clusters
● If the complete link property holds
● Merge the involved clusters
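The clustering steps listed above can be sketched as follows. This is one possible reading, not the tool's implementation: it assumes the complete link property means that every cross-cluster pair has similarity at or above a threshold. The similarity table, the threshold value and the function names are hypothetical.

```python
from itertools import combinations

def complete_link_clustering(items, similarity, threshold):
    """Merge clusters while some pair satisfies the complete link property."""
    clusters = [[item] for item in items]  # each item starts as a singleton
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            # Complete link (assumed reading): ALL cross pairs must be similar enough.
            if all(similarity(a, b) >= threshold
                   for a in clusters[i] for b in clusters[j]):
                clusters[i] = clusters[i] + clusters[j]
                del clusters[j]
                merged = True
                break  # indices changed, restart the pairwise scan
    return clusters

# Hypothetical similarity table standing in for a WordNet-based measure.
TABLE = {
    frozenset({"cat", "dog"}): 0.8,
    frozenset({"cat", "wolf"}): 0.6,
    frozenset({"dog", "wolf"}): 0.9,
    frozenset({"car", "bus"}): 0.7,
}

def toy_similarity(a, b):
    return TABLE.get(frozenset({a, b}), 0.0)

result = complete_link_clustering(
    ["cat", "dog", "wolf", "car", "bus"], toy_similarity, threshold=0.5)
print(result)
```

Requiring all cross pairs to pass the threshold keeps clusters tight, at the cost of a quadratic check per candidate merge.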
10. Information Retrieval
● Multi-strategy Similarity Measure on WordNet
● 3 components summed and normalized in ]0,1[
● depth (ancestors)
● breadth (direct neighbors)
● breadth (inverse neighbors)
● Document Partitioning
● For each document
● Each synset votes for a cluster
● User Query Processing
● Brute-force WSD to find the best combination of synsets
● The best combination is used to return a ranked list of clusters
● Each cluster has a list of related documents obtained by the Document
Partitioning phase
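The Document Partitioning step can be sketched as a simple vote. The slides do not say how a synset chooses its cluster, so this example assumes a synset votes for every cluster that contains it and that the document goes to the most voted cluster. The synset labels, cluster names and the function `assign_document` are hypothetical.

```python
from collections import Counter

def assign_document(doc_synsets, clusters):
    """Return the id of the cluster that collects the most votes, or None."""
    votes = Counter()
    for synset in doc_synsets:
        for cluster_id, members in clusters.items():
            if synset in members:  # assumed voting rule: membership
                votes[cluster_id] += 1
    if not votes:
        return None  # no synset of the document belongs to any cluster
    return votes.most_common(1)[0][0]

# Hypothetical clusters of synset labels.
clusters = {
    "animals": {"cat.n.01", "dog.n.01"},
    "vehicles": {"car.n.01"},
}
document = ["cat.n.01", "dog.n.01", "car.n.01"]
print(assign_document(document, clusters))
```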
11. Reasoning ‘by association’
Breadth-First Search
Given two nodes (concepts), a Breadth-First Search starts from
both nodes; each search looks for the other's frontier, until the
two frontiers meet. The path is then reconstructed by going
backward to the roots in both directions.
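A minimal sketch of such a two-sided search is given below. It assumes an undirected view of the graph and returns one connecting path, not necessarily the shortest. The helper names and the sample edges are illustrative assumptions, not part of the tool.

```python
from collections import deque

def undirected(edges):
    """Build a symmetric adjacency dict from (a, b) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def bidirectional_bfs(adjacency, start, goal):
    """Return a path from start to goal as a list of nodes, or None."""
    if start == goal:
        return [start]
    parents_fwd, parents_bwd = {start: None}, {goal: None}
    queue_fwd, queue_bwd = deque([start]), deque([goal])

    def expand(queue, parents, other_parents):
        # Expand one full level; return the node where the frontiers meet.
        for _ in range(len(queue)):
            node = queue.popleft()
            for neighbor in adjacency.get(node, ()):
                if neighbor not in parents:
                    parents[neighbor] = node
                    if neighbor in other_parents:
                        return neighbor
                    queue.append(neighbor)
        return None

    while queue_fwd and queue_bwd:
        meeting = expand(queue_fwd, parents_fwd, parents_bwd)
        if meeting is None:
            meeting = expand(queue_bwd, parents_bwd, parents_fwd)
        if meeting is not None:
            # Walk back to both roots from the meeting node.
            path, node = [], meeting
            while node is not None:
                path.append(node)
                node = parents_fwd[node]
            path.reverse()
            node = parents_bwd[meeting]
            while node is not None:
                path.append(node)
                node = parents_bwd[node]
            return path
    return None

# Hypothetical concept graph.
adjacency = undirected([
    ("young", "television"),
    ("television", "facebook"),
    ("facebook", "schoolwork"),
])
print(bidirectional_bfs(adjacency, "young", "schoolwork"))
```

Searching from both ends keeps each frontier small, which matters on large concept graphs.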
We also provide the number of positive/negative instances, and
the corresponding ratios over the total, to help understand
different gradations (permitted, prohibited, typical, rare, etc.) of
actions between two objects.
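The ratios mentioned above amount to a simple normalization of the two counters. The sketch below assumes each arc stores a positive and a negative count; the function name and the zero-total convention are choices made for the example.

```python
def polarity_ratios(positive, negative):
    """Return (positive_ratio, negative_ratio) over the total instances."""
    total = positive + negative
    if total == 0:
        return (0.0, 0.0)  # convention for an arc with no instances
    return (positive / total, negative / total)

# Hypothetical arc observed 3 times in positive and once in negative sentences.
print(polarity_ratios(3, 1))
```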
12. Reasoning ‘by association’
Breadth-First Search
The table below shows a sample of possible outcomes.
E.g., an interpretation of case 1 can be:
“young people watch television, which talks about (and criticizes) facebook,
because it typically does not help (rather, it distracts from) schoolwork”.
13. Reasoning ‘by association’
Probabilistic approach
Real world data are typically noisy and uncertain → need for strategies
that soften the classical rigid logical reasoning
Defined a formalism based on the ProbLog language: p_i :: f_i
● f_i : ground literal of the form link(subject, verb, complement)
● p_i : ratio between the sum of all examples for which f_i holds and the sum
of all possible links between subject and complement
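The probability labels can be sketched as follows. This example assumes that "all possible links between subject and complement" means all observed links between that pair, whatever the verb. The function name and the sample observations are hypothetical. The output strings follow ProbLog's probabilistic fact syntax (`p::fact.`).

```python
from collections import Counter

def problog_facts(observations):
    """Turn observed (subject, verb, complement) triples into ProbLog facts."""
    link_counts = Counter(observations)
    # Total number of observed links per (subject, complement) pair.
    pair_totals = Counter((s, c) for s, _, c in observations)
    facts = []
    for (s, v, c), count in sorted(link_counts.items()):
        probability = count / pair_totals[(s, c)]
        facts.append(f"{probability:.2f}::link({s},{v},{c}).")
    return facts

# Hypothetical observations: "chase" seen 3 times, "like" once, for the same pair.
observations = [("dog", "chase", "cat")] * 3 + [("dog", "like", "cat")]
facts = problog_facts(observations)
for fact in facts:
    print(fact)
```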
14. ConNeKTion
15. ConNeKTion
16. ConNeKTion
17. ConNeKTion
18. ConNeKTion
19. Conclusions
ConNeKTion learns conceptual graphs from plain text and
enriches them by finding concept generalizations.
The resulting graph can be used for several purposes:
● finding relationships between concepts (if any)
● filtering the concepts from a particular perspective
● relevant concept recognition and information retrieval
A suitable control panel lets the user carry out these activities
comfortably.
20. Future Works
We plan to improve the pre-processing of natural language text by using anaphora
resolution to replace pronouns, where possible, with the explicit concept
they refer to.
All functionalities have parameters set empirically. A criterion for automatically
setting suitable parameters is needed.
The presented functionalities are based on the exploitation of WordNet. A strategy
to make the operators WordNet-free would be desirable.
We also wish to extend the reasoning operators by adding an argumentation
operator that could exploit probabilistic weights, intended as a rate of reliability,
to support or attack a given statement.