1. Analysis of texts for logical contradictions
Ontologs LLC, Moscow, 2012
2. Motivation
50 years ago: the 1960s
Exponential growth of information flows in the era of the scientific and
technological revolution
People can no longer cope with processing the data
Solution: machines process the data
Challenges of our time
Exponential growth of the knowledge available to people
It is physically impossible even to read all the publications in a given field
Solution: machines must process knowledge
3. Systematization of knowledge
Knowledge extraction
Sources - both structured and unstructured data
Unstructured data - texts in natural language
Intelligent processing of texts - an extraction of knowledge from them
The purpose of knowledge extraction - its automatic systematization
Methods of knowledge systematization
Text classification
Text summarization
Copyright analysis of texts
Sentiment analysis of texts
Problems
Analysis is limited to the symbol and lexical levels of language
Dependence on the language of the text
Insufficient accuracy
Lack of context processing
4. Levels of language representation
Symbolic level
Symbol classes: alphabetic characters, spaces, punctuation, etc.
Use: the copyright analysis, determination of authorship, etc.
“Symbol classes: alphabetic characters, spaces, punctuation, etc.”
“symbolclasses” “alphabeticcharacters” “spaces” “punctuation” “etc”
Lexical level
Language dictionaries. Knowledge of inflection.
Use: the copyright analysis, text search, etc.
“Brown fox jumps…”
brown (adj.) fox (noun) jump (verb)
Syntax level
Grammars. Congruence.
Use: a selection of "correct" phrases, terminological analysis
“Brown fox jumps…”
brown (quality) fox (subject) jump (predicate)
Semantic level
Elements of ontology. Congruence of ontology’s objects.
Use: a selection of correct semantic structures
“Brown fox jumps…”
brown (color) fox (kind of mammal) jump (action, move)
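The symbolic-level example above can be reproduced with a few lines of Python. This is only a sketch: the rule used here (punctuation separates tokens, whitespace and letter case are dropped) is an assumption inferred from the slide's sample output, and the function name `symbol_level_tokens` is hypothetical.

```python
import re

def symbol_level_tokens(text: str) -> list[str]:
    """Split text on punctuation, then remove whitespace and case from each chunk."""
    # Any run of characters that is neither a word character nor whitespace
    # (i.e. the punctuation symbol class) is treated as a separator.
    chunks = re.split(r"[^\w\s]+", text)
    # Inside a chunk, the whitespace symbol class is simply dropped.
    tokens = ["".join(chunk.split()).lower() for chunk in chunks]
    # Empty chunks (for example after the final full stop) are discarded.
    return [token for token in tokens if token]

tokens = symbol_level_tokens(
    "Symbol classes: alphabetic characters, spaces, punctuation, etc."
)
print(tokens)
# ['symbolclasses', 'alphabeticcharacters', 'spaces', 'punctuation', 'etc']
```

Nothing in this processing knows what a word is or means, which is why this level suits tasks such as authorship determination but cannot detect contradictions.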
5. Analysis on semantic level
Semantic structure of texts
Ontology – a formal description of knowledge for humans and machines
The system automatically extracts knowledge from a text document and creates an ontology
Ontology of a text - a language-independent machine representation of the semantic content of
the text
Tools for ontology manipulation
Languages: RDF (Resource Description Framework) and OWL (Web Ontology Language)
Libraries: Jena - an open-source implementation of RDF and OWL with inference support
Ontology of a text - a logical theory written in OWL (RDF)
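To make the idea of a text ontology as a logical theory more concrete, here is a minimal Python sketch of the RDF data model: a set of subject-predicate-object triples plus one very small inference step. It does not use Jena or any RDF library. The `ex:` identifiers and the properties `ex:hasColor` and `ex:performs` are illustrative assumptions; only `rdf:type` and `rdfs:subClassOf` are standard RDF/RDFS terms.

```python
# A text ontology held as subject-predicate-object triples (the RDF data model).
# The triples encode the "Brown fox jumps" example; all ex: names are assumptions.
Triple = tuple[str, str, str]

triples: set[Triple] = {
    ("ex:fox1", "rdf:type", "ex:Fox"),
    ("ex:Fox", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:fox1", "ex:hasColor", "ex:Brown"),
    ("ex:fox1", "ex:performs", "ex:Jump"),
}

def objects_of(subject: str, predicate: str) -> set[str]:
    """Return every object o such that (subject, predicate, o) is stated."""
    return {o for s, p, o in triples if s == subject and p == predicate}

def types_of(subject: str) -> set[str]:
    """Direct types plus all superclasses reachable through rdfs:subClassOf."""
    found = set(objects_of(subject, "rdf:type"))
    frontier = list(found)
    while frontier:
        cls = frontier.pop()
        for parent in objects_of(cls, "rdfs:subClassOf"):
            if parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found

print(sorted(types_of("ex:fox1")))  # ['ex:Fox', 'ex:Mammal']
```

The fact that `ex:fox1` is a mammal is never stated directly; it follows from the two stated triples. A reasoner such as the one in Jena performs this kind of derivation, with far richer rules.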
6. Logical analysis of texts
Semantics and language constructs
Domain-based skeleton ontology - a conceptual framework
Conceptual scheme - a set of classes and relationships between classes
Elements of the conceptual scheme are associated with language expressions
Knowledge extraction
Input: a text and an ontology with language
expressions
The text is analyzed and ontology objects are
identified: instances of the classes and relations
of the conceptual scheme
Logical analysis
A set of logical rules that define the conditions
for the correctness of ontology elements
Logical formulas that express the correctness
conditions
A verification procedure that searches an ontology
for contradictions
7. Example – ontology of events
Ontology of events
Events - incidents, sports, meetings of government officials, etc.
The list of participants of an event contains information about
persons
Person class objects - instance “Vladimir Putin”
Language expressions for the Person class object “Vladimir Putin”
Vladimir Putin
V. Putin
Vladimir Vladimirovich Putin
President Putin
President of the Russian Federation [time context: 7 May 2000 - 7 May 2008, 7 May 2012 - present]
Attributes of the Person class
First Name: string
Second Name: string
Birth Date: date
Position: string
Location List: list of (place, date)
Spouse: string
Children: list of string
…
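The Person class and its language expressions could be modelled roughly as follows. This is a sketch under assumptions rather than the system's actual schema: the field names mirror the attribute list on the slide, while the `expressions` field, the `mentions` helper and the plain substring matching are hypothetical simplifications.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class Person:
    first_name: str
    second_name: str
    birth_date: Optional[date] = None
    position: Optional[str] = None
    location_list: list = field(default_factory=list)  # (place, date) pairs
    spouse: Optional[str] = None
    children: list = field(default_factory=list)        # list of strings
    expressions: list = field(default_factory=list)     # language expressions

putin = Person(
    first_name="Vladimir",
    second_name="Putin",
    expressions=[
        "Vladimir Putin",
        "V. Putin",
        "Vladimir Vladimirovich Putin",
        "President Putin",
        "President of the Russian Federation",
    ],
)

def mentions(text: str, person: Person) -> list:
    """Return the person's language expressions found in text, longest first."""
    hits = [expr for expr in person.expressions if expr in text]
    return sorted(hits, key=len, reverse=True)

sample = "President of Russia Vladimir Putin held talks in Berlin"
found = mentions(sample, putin)
print(found)  # ['Vladimir Putin']
```

A real extractor would also need inflection handling and the time context attached to some expressions; substring matching is used here only to keep the example short.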
8. Objects and facts extraction
White Papers
President of Russia Vladimir Putin and German Chancellor Angela Merkel
held talks in Berlin on 1 June 2012. Following the talks, Mr
Putin and Ms Merkel made press statements and answered
journalists’ questions.
The ontology of the text
Event class object – {“Working visit to Germany”, 01.06.2012, Participants: (Vladimir Putin, Angela Merkel), …}
Person class object – {“Putin”, “Vladimir”, 07.10.1952,…}
…
Putin : Person
First Name: Vladimir
Second Name: Putin
Birth Date: 7 of October, 1952
Position: President
Location List: …
Spouse: …
Children: …
9. Identification of contradictions
Identification of contradictions at the object extraction phase
Suppose the article from the example is dated 1 June 2011
The language expression “President of Russia Vladimir Putin” is then incorrect,
because its temporal context (1 June 2011) contradicts the content of the
corresponding Person class object - Russian President [time context:
7 May 2000 - 7 May 2008, 7 May 2012 - present]
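The temporal check described here can be sketched in Python as follows. The intervals are the ones given on the slide; the names `PRESIDENT_TERMS`, `holds_on` and `check_expression`, the use of `None` for "present", and the inclusive interval boundaries are all assumptions made for illustration.

```python
from datetime import date

# Time context of the expression "President of Russia", taken from the slide.
# None stands for "present"; treating both boundaries as inclusive is an assumption.
PRESIDENT_TERMS = [
    (date(2000, 5, 7), date(2008, 5, 7)),
    (date(2012, 5, 7), None),
]

def holds_on(day, intervals):
    """True if day falls inside at least one validity interval."""
    return any(
        start <= day and (end is None or day <= end)
        for start, end in intervals
    )

def check_expression(expression, article_date, intervals):
    """Return None if the expression is consistent with the date, else a message."""
    if holds_on(article_date, intervals):
        return None
    return f'"{expression}" is not valid on {article_date.isoformat()}'

ok = check_expression("President of Russia Vladimir Putin", date(2012, 6, 1), PRESIDENT_TERMS)
bad = check_expression("President of Russia Vladimir Putin", date(2011, 6, 1), PRESIDENT_TERMS)
print(ok)   # None
print(bad)  # "President of Russia Vladimir Putin" is not valid on 2011-06-01
```

The check runs while objects are being extracted, so the contradiction is reported before any event-level reasoning takes place.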
Identification of contradictions at the logical analysis phase
Suppose there is an article with the text “On 1 June 2012 Vladimir Putin
visited the Russian city of Vladivostok to inspect the bridge to the island”
An Event class object is extracted. It contradicts a fact in the Event
ontology, “Working visit to Germany”, because the two events take place at the same time
but in different places
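A logical rule of this kind might look like the sketch below: no person takes part in two events on the same day in different places. The `Event` structure, the `conflicts` function, the day-level granularity of time and the event name "Bridge inspection" are hypothetical; the slides do not specify how the rules are actually written.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations

@dataclass(frozen=True)
class Event:
    name: str
    day: date
    place: str
    participants: frozenset

def conflicts(events):
    """Find pairs of events that share a participant and a day but not a place."""
    found = []
    for a, b in combinations(events, 2):
        shared = a.participants & b.participants
        if shared and a.day == b.day and a.place != b.place:
            found.append((a.name, b.name, sorted(shared)))
    return found

events = [
    # Fact already present in the Event ontology.
    Event("Working visit to Germany", date(2012, 6, 1), "Berlin",
          frozenset({"Vladimir Putin", "Angela Merkel"})),
    # Object extracted from the new article (the name is an assumption).
    Event("Bridge inspection", date(2012, 6, 1), "Vladivostok",
          frozenset({"Vladimir Putin"})),
]

result = conflicts(events)
print(result)
# [('Working visit to Germany', 'Bridge inspection', ['Vladimir Putin'])]
```

Comparing whole days is a deliberate simplification. A finer rule would have to account for travel time between the two places.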