The 6th
IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications
15-17 September 2011, Prague, Czech Republic
Selection and Aggregation of Sentences
in the Knowledge Formation Process
M.S. Shibut, V.S. Yakovishin
The Academy of Public Administration under the aegis of the President of the Republic of Belarus,
17, Moskovskaya Str., 220007, Minsk, Republic of Belarus, m_shibut@pac.by, http://pac.by/en
Abstract—The presented method is based on the use of
a special formal language. In this language, all
sentence structures are expressed as sets of syntactic
elements (syntagmes), which allows us to reduce the semantic
identification of sentences (their selection and aggregation)
to set-theoretical inclusion. Input text sentences
are first transformed into set-theoretical form; the
resulting formal sentence structures are then selected and
united into growing knowledge representations.
The integration of the sentences that share one and the
same subject (a noun phrase contained in the user’s request)
is treated as a subject knowledge representation, and
any collection of subject knowledge representations
produced in the knowledge formation process is treated
as a user-oriented (highly tailored) description of the
subject field.
Keywords—formal language, knowledge formation,
semantics, subject field, subject-knowledge representation,
syntax
I. INTRODUCTION
Knowledge formation is presented here as the
process of selecting and aggregating input sentences. In
this process, the text sentences are first transformed into
the formal language and then integrated into the
knowledge representation [1, 2]. The integration of the
sentences that share one and the same subject is
treated as a subject knowledge representation, and any
collection of subject knowledge representations
produced in the knowledge formation process is
treated as a user-oriented (“highly tailored”)
description of the subject field.
In every text sentence, the subject (usually
characterized as “the something or someone that the
sentence is about”, “the thing being talked about”) is
expressed by a grammatically separated noun phrase. This
noun phrase represents either the absolutely independent
part of the sentence (the formal subject of the
subject-predicate division) or the general determinative
part [3], i.e. the attribute that relates to the whole
sentence (the actual subject of the theme-rheme division,
also known as topic-comment, representing the “reflection
of the speaker’s attitude towards what is said”). Both the
formal subject and the actual subject are always expressed
by special grammatical means. In many languages, the formal
subject occupies the main (first) position in the sentence;
in addition, there are special syntactically neutral word
forms for expressing the formal subject, namely the noun
form of the nominative case (“casus indefinitus”). The
actual subject (“theme” or “topic”) can either coincide
with the formal subject or be marked by extra actualization
means (such as special particles or inverted word order).
In the knowledge formation process, each required
noun phrase takes on the formative subject role. Some of
the noun phrases contained in the user’s request can be
specially actualized (and thereby elevated to the subject
rank) according to the information in the user’s request.
The knowledge formation method presented here is
based on the use of a special formal language. In this
language, input text sentences are expressed in
set-theoretical (parenthesis-free, “discrete”) form as sets
of their syntactic elements (syntagmes), which allows us
to reduce the semantic identification of sentences to the
standard set-theoretical relation of inclusion.
The set-theoretical form and the integration of input
sentences into the subject-knowledge representation are
considered below.
II. FORMAL LANGUAGE
We proceed from the assumption that the formal
language corresponds to a dibasic algebra with a set of
words and a set of sentences. The set of words represents
a free semi-group over the basic alphabet. The set of
sentences represents a ring-like algebra with one unary
operation and a pair of binary operations - coordination
and determination: the coordination (“addition”) is
commutative and associative; the determination
(“multiplication”) is non-commutative, non-associative,
and one-sided (left-hand) distributive over coordination
[3].
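Written out as identities, the properties listed above read as follows. This is only a restatement in conventional notation: the letters X, Y, Z stand for arbitrary operands and are not symbols used in the source, and the inequalities are meant "in general".

```latex
\begin{align*}
X \nabla Y &= Y \nabla X
  && \text{(coordination is commutative)} \\
(X \nabla Y) \nabla Z &= X \nabla (Y \nabla Z)
  && \text{(coordination is associative)} \\
X \Delta (Y \nabla Z) &= (X \Delta Y) \nabla (X \Delta Z)
  && \text{(determination is left-distributive)} \\
X \Delta Y &\neq Y \Delta X
  && \text{(determination is non-commutative)} \\
(X \Delta Y) \Delta Z &\neq X \Delta (Y \Delta Z)
  && \text{(determination is non-associative)}
\end{align*}
```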
We now need a set of special auxiliary symbols
to represent the given algebraic operations.
The formal language (L) is then defined in general as a
set of sentences derived over a set of words by means of
algebraic operations, i.e.
L ⊆ L(A*∪Ω),
where A* is a set of words over a basic alphabet A; Ω is
an auxiliary alphabet, i.e. a set of symbols of algebraic
operations (A∩Ω=∅).
So the formal language implies the use of both words
and operation symbols. Words and operation symbols
serve as the syntactic basis for expressing
two distinct types of semantic elements, namely lexical
and grammatical (functional) meanings. Words
express lexical meanings, and operation symbols
express grammatical
meanings: the symbol of the unary operation represents
general grammatical meanings of sentences (modality,
negation, question, exclamation, etc.), and the symbols of
the binary operations represent functional meanings of
sentence parts. Thus, two description levels are clearly
defined in the formal language: the syntactic level
represents purely abstract (algebraic) sentence structures
(in accordance with the distinctive features of the given
algebraic operations), and the semantic level represents
the sense interpretation of the algebraic structures.
The syntax corresponding to the above-mentioned
algebraic operations can be represented by the following
set of rules:
S → ◊S | (X∇X) | (XΔS) | X,
X → X∇X | (XΔS),
where S (“sentence”) is the initial symbol; X (“word”) is
the start symbol for word derivation; ◊ (“modality”), ∇
(“coordination”), Δ (“determination”) are the operation
symbols.
In the syntactic rules, the commutative and
associative coordination is expressed as the atomic
formula X∇X, in which the same symbol is used for both
members in parenthesis-free notation. The parenthesized
coordination in the rule S→(X∇X) is necessary to
express the left-distributive property of
determination over a coordinated series in strings like
XΔ(X∇X…). The non-commutative and non-associative
determination is expressed by the obligatory use of
different symbols and parentheses in the rules S→(XΔS),
X→(XΔS). The parentheses are required to
indicate the “evaluation” order of the non-associative
operation. The atomic formula (XΔS) therefore
appears twice: in the case S→(XΔS), the operation Δ is
“evaluated” from right to left, as in (XΔ(XΔS…)); in the
case X→(XΔS), it is “evaluated” from left to right, as in
((…XΔS)ΔS).
The syntagmatic structures of sentences are now
expressed in explicit form: every independent (head)
member of the generated syntagmatic structure occupies
the first position, and the dependent (defining, non-head)
member is attached by means of the determination symbol
in the second position. The simplest syntagmatic
structure, the syntagme, thus has the form of a two-word
string like (X1ΔX2), where X1 (the first word) is the
independent (head) member; ΔX2 (the second word) is the
determinative (dependent) member; X1, X2 are words
representing lexical meanings; Δ (determination) is a
symbol indicating a syntactic (functional) meaning of the
sentence part. Note that the words X1, X2 ∈ A* can take an
empty string value, and then the initial syntagme (X1ΔX2)
is shortened either to a head member X1, representing a
reduced syntagme, or to a determinative member ΔX2,
representing an elliptical syntagme.
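The three forms of a syntagme described above can be modelled with a small record type. The following is a minimal sketch, not the authors' implementation: the class name `Syntagme`, its field names, and the `kind` method are assumptions made for illustration, and an empty string stands for the empty word.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Syntagme:
    """One syntagme (X1 Δ X2); empty strings model the shortened forms."""
    head: str = ""        # X1, the independent (head) member
    relation: str = ""    # label of the determination symbol Δ
    dependent: str = ""   # X2, the dependent member

    def kind(self) -> str:
        """Classify the syntagme by which of its two words are present."""
        if self.head and self.dependent:
            return "full"        # (X1 Δ X2)
        if self.head:
            return "reduced"     # X1 alone
        if self.dependent:
            return "elliptical"  # Δ X2 alone
        return "empty"


print(Syntagme("man", "a", "young").kind())   # full
print(Syntagme("man").kind())                 # reduced
print(Syntagme("", "in", "evening").kind())   # elliptical
```

Because the dataclass is frozen, its instances are hashable and can be stored directly as elements of a Python `set`, which is what the set-theoretical form requires.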
Thus, syntagmatic structures can be expressed in the
standard algebraic form using parenthesis notation
with a fixed order. As will be shown below, it is also
possible to obtain an alternative, set-theoretical
(parenthesis-free) form of syntagmatic structures. The
knowledge representation proposed here uses the
set-theoretical form, in which every syntagmatic structure
is expressed as a set that contains all the given syntagmes
as ordinary set elements.
III. SET-THEORETICAL FORM OF KNOWLEDGE
REPRESENTATION
The set-theoretical form of knowledge representation
is sufficient for the explicit expression of all the
differences between syntagmatic structures, such as stepwise
and collateral subordination, homogeneous parts, the
absolutely independent part, and the common dependent
part (determinative); see [3]. The following identical
syntagmatic structures are given in both parenthesis and
set-theoretical notation. In both notations, all
differences in syntactic relations are expressed explicitly.
Subordination types. Stepwise subordination is
expressed as a syntagmatic structure in which the dependent
member of the previous syntagme serves as the head
member of the subsequent syntagme (as in the book of the
new author):
(X1Δ1(X2Δ2X3))={X1Δ1X2, X2Δ2X3},
where X2 is both a dependent member of syntagme X1Δ1X2
and a head member of syntagme X2Δ2X3. In the collateral
(parallel) subordination, several dependent members are
connected with their common head member (as in the new
book of the author):
((X1Δ1X2)Δ2X3) = {X1Δ1X2, X1Δ2X3},
where X1 is a head member; Δ1X2, Δ2X3 are dependent
members.
Homogeneous parts can be represented as several
dependent members that are subordinated to a single head
member. In that case the determination sign represents the
common functional meaning of the parts of the sentence,
whereas the coordination sign serves to join
the identical parts by factoring the common
functional meaning out of the brackets:
(X1Δ(X2∇X3))={X1ΔX2, X1ΔX3},
where ∇ is the symbol of coordination; ΔX2, ΔX3 are
homogeneous parts.
The absolutely independent part is a single
syntagmatically unmarked word that has no
governing member and therefore carries no
determination sign denoting the meaning of a part of the
sentence. To simplify sentence
identification, the absolutely independent part is
singled out in the set-theoretical form as a separate set
element, e.g.:
((X1Δ1X2)Δ2X3) = {X1, X1Δ1X2, X1Δ2X3},
where the absolutely independent part X1 is repeated in
the form of a reduced syntagme.
Determinative. It is also possible to express a
common dependent member, called the determinative, i.e.
the attribute that relates to the whole sentence. In the
set-theoretical form, the determinative can be singled out
as a separate set element. In contrast to the
absolutely independent part, the determinative contains a
determination symbol, i.e. it is represented as an elliptical
syntagme, e.g.:
((X1Δ1(X2Δ2X3))Δ3X4) = {Δ3X4, X1, X1Δ1X2, X2Δ2X3},
where Δ3X4 is a determinative (e.g., In the evening, he
reads a book); compare with
(X1Δ1((X2Δ2X3)Δ3X4)) = {X1, X1Δ1X2, X2Δ2X3, X2Δ3X4},
where Δ3X4 is an adverbial modifier (as in He walked in
the garden in the evening).
In formal descriptions of sentences, the lexical
meanings can be expressed by ordinary word stems, while the
meanings of the sentence parts require conventional
signs, such as a (attribute), p (predicate), pt (predicate in
the past indefinite tense), o (direct object), in (adverbial
modifier of place, “inside”), etc. For clarity, it is
convenient to separate the words and the conventional signs
by punctuation marks (the underscore character and
the dot). A real sentence, e.g., The young man reads a
book in the garden, can then be expressed by the following
set of syntagmes:
man_a.young ‘[the/a] young man’,
man_p.read ‘[the/a] man reads’,
read_o.book ‘[to] read [the/a] book’,
read_in.garden ‘[to] read in [the/a] garden’.
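The passage from parenthesis notation to the set-theoretical form can be sketched as a recursive flattening. The code below is an illustrative sketch under stated assumptions, not the authors' algorithm: the tuple encoding (`"det"`, `"coord"`), the function name `flatten`, and the particular bracketing chosen for the example sentence are all hypothetical. The rule it applies is that every head word of the left operand is linked to every head word of the right operand, which reproduces the stepwise, collateral, and homogeneous cases given earlier.

```python
# An expression is a word (str), a determination node, or a coordination node:
#   ("det", head_expr, sign, dependent_expr)  stands for (head Δsign dependent)
#   ("coord", expr1, expr2, ...)              stands for expr1 ∇ expr2 ∇ ...

def flatten(expr):
    """Return (head words, set of 'head_sign.dependent' strings) for expr."""
    if isinstance(expr, str):
        return [expr], set()
    if expr[0] == "coord":
        heads, syntagmes = [], set()
        for part in expr[1:]:
            part_heads, part_syntagmes = flatten(part)
            heads += part_heads
            syntagmes |= part_syntagmes
        return heads, syntagmes
    _, head_expr, sign, dep_expr = expr
    heads, head_syntagmes = flatten(head_expr)
    dep_heads, dep_syntagmes = flatten(dep_expr)
    syntagmes = head_syntagmes | dep_syntagmes
    # Link each head word to each head word of the dependent expression.
    for h in heads:
        for d in dep_heads:
            syntagmes.add(f"{h}_{sign}.{d}")
    return heads, syntagmes


# Assumed bracketing of "The young man reads a book in the garden":
# ((man Δa young) Δp ((read Δo book) Δin garden))
sentence = ("det", ("det", "man", "a", "young"), "p",
            ("det", ("det", "read", "o", "book"), "in", "garden"))
_, syntagmes = flatten(sentence)
print(sorted(syntagmes))
# ['man_a.young', 'man_p.read', 'read_in.garden', 'read_o.book']
```

The sketch does not add the absolutely independent part or a determinative as separate elements; those are singled out by the conventions described above and would need an extra step.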
Thus, one can suppose that any sentence can be
represented as a certain set of syntagmes. It is also
assumed that any sentence contains at least one noun
phrase (a noun, or a noun with several of its modifiers).
The integration of sentences that share one and the same
subject, i.e. a noun phrase contained in the user’s request,
can then be considered a subject knowledge
description (representation).
A subject knowledge description can thus be defined as a
set σ(N) of sentences S1, S2, … that contain the common
subject, represented by a noun phrase N, i.e.
σ(N) = {S ⊇ N | S is a sentence};
any collection of subject knowledge descriptions
produced in the knowledge formation process is then a
(user-oriented) subject field description, i.e.
σ(N1, N2, …) = {σ(N1), σ(N2), …},
where σ(N1, N2, …) is a subject field in which N1, N2, …
are noun phrases that play the role of subjects.
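Since both sentences and subjects are sets of syntagmes, the two definitions above map directly onto Python's built-in superset test. This is a minimal sketch: the function names `sigma` and `subject_field`, the sentence labels, and the toy sentences are assumptions made for illustration and do not come from the source. The subject field is returned as a dictionary keyed by a readable subject label rather than as a bare set of sets, purely for convenience.

```python
def sigma(subject, sentences):
    """sigma(N): names of the sentences S that satisfy S ⊇ N."""
    return {name for name, s in sentences.items() if s >= subject}


def subject_field(subjects, sentences):
    """sigma(N1, N2, ...): one subject knowledge description per subject."""
    return {label: sigma(n, sentences) for label, n in subjects.items()}


# Illustrative toy sentences, each given as a set of syntagmes.
sentences = {
    "A": {"man", "man_a.young", "man_p.read", "read_o.book"},
    "B": {"man", "man_pt.walk", "walk_in.garden"},
    "C": {"book", "book_a.new"},
}
# Subjects are noun phrases, themselves written as sets of syntagmes.
subjects = {"man": {"man"}, "young man": {"man", "man_a.young"}}

field = subject_field(subjects, sentences)
print(sorted(field["man"]))         # ['A', 'B']
print(sorted(field["young man"]))   # ['A']
```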
Note that a subject can be represented by a noun
phrase expressed either as the absolutely independent part,
i.e. the formal subject of the “subject-predicate” division,
or as the determinative, i.e. the actual subject of the
“theme-rheme” division. In a text sentence, the actual
subject can either coincide with the formal subject or be
marked by extra actualization means. In the knowledge
formation process, each noun phrase contained in the user’s
request can take on the formative subject role, i.e. some of
the noun phrases can be specially actualized (and thereby
elevated to the subject role) by special means (e.g., by
inverted word order) according to the information questions.
IV. RULES OF SUBJECT-KNOWLEDGE FORMATION
Subject knowledge formation is a growth process in
which two formation rules, the rules of selection
and aggregation of sentences, must be applied.
The first rule selects the more intensionally
informative sentences by eliminating every sentence
that is less informative than
another sentence. Intensional superiority is defined
by inclusion between sets: of two
sentences S1 and S2, one must be eliminated if it is a
subset of the other. The rule can be formalized as
follows:
{S1, S2} → S1, if S1 ⊇ S2 – selection rule.
The second rule integrates the already selected
sentences into a collection. If S1, S2, … are
sentences that have the same subject N (a noun phrase
contained in the user’s request), they are united in a common
subject knowledge description:
{S1, S2, …} → σ(N) – aggregation rule.
The following examples illustrate the selection
and aggregation processes. (For simplicity, the
examples are restricted to simple sentences.)
Let S1, S2, S3, S4, S5 be sentences expressed in terms of
the formal language, such as
S1 = {man, man_a.young, man_p.read, read_o.book}
‘The young man reads a book’
S2 = {man, man_a.young, man_p.read, read_o.book, read_in.library}
‘The young man reads a book in the library’
S3 = {man, man_pt.walk, walk_in.park}
‘The man walked in the park’
S4 = {library, library_pPs.situate, situate_in.street, street_a.graceful}
‘The library is situated in a graceful street’
S5 = {man, man_a.young, man_pt.kick, kick_o.ball}
‘The young man kicked the ball’
where a, in, o are signs of the secondary sentence parts; p,
pt, pPs are signs of the different predicates (for the
present, past indefinite, and present simple passive,
respectively).
According to the selection rule, the first sentence must
be eliminated because of intensional superiority of the
second sentence (S1 ⊆ S2).
The sentences S2, S3, S4, S5 can be integrated in
compliance with the aggregation rule. Let “man”, “young
man”, “library” be the subjects contained in the user’s
request. Then, as a result of integration on the given
subjects, the following three subject knowledge descriptions
are obtained:
σ({man}) = {S2, S3, S5}
σ({man, man_a.young}) = {S2, S5}
σ({library}) = {S2, S4}
Note that the noun phrase “library” contained in S2
must be actualized (by inverted word order), i.e.
sentence S2 is transformed into the actual division
form: In the library, the young man reads a book.
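The worked example above can be replayed with plain Python sets. This is a sketch under stated assumptions rather than the authors' implementation: the function names `select` and `aggregate` are hypothetical, and selection is implemented with the proper-subset test, so two identical sentences stored under different names would both be kept. The actualization step is not modelled, so strict inclusion finds only S4 for the subject “library”; S2 contains `read_in.library` but not the element `library`, and joins that description only after it has been actualized as described in the note above.

```python
# Sentences S1..S5 as given in the text, each a set of syntagmes.
S = {
    "S1": {"man", "man_a.young", "man_p.read", "read_o.book"},
    "S2": {"man", "man_a.young", "man_p.read", "read_o.book",
           "read_in.library"},
    "S3": {"man", "man_pt.walk", "walk_in.park"},
    "S4": {"library", "library_pPs.situate", "situate_in.street",
           "street_a.graceful"},
    "S5": {"man", "man_a.young", "man_pt.kick", "kick_o.ball"},
}


def select(sentences):
    """Selection rule: drop each sentence that is a proper subset of another."""
    return {name: s for name, s in sentences.items()
            if not any(s < other for other in sentences.values())}


def aggregate(subject, sentences):
    """Aggregation rule: names of the sentences that include the subject N."""
    return sorted(name for name, s in sentences.items() if s >= subject)


kept = select(S)
print(sorted(kept))                              # ['S2', 'S3', 'S4', 'S5']
print(aggregate({"man"}, kept))                  # ['S2', 'S3', 'S5']
print(aggregate({"man", "man_a.young"}, kept))   # ['S2', 'S5']
print(aggregate({"library"}, kept))              # ['S4'] (S2 not actualized)
```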
V. PROSPECTS OF APPLICATION
Implementing the subject knowledge formation
process (and building various knowledge-based
systems on it) makes it possible to address a number of
pressing problems effectively. The following applications
are particularly noteworthy.
Knowledge-based text adaptation. The subject
knowledge description produced in the formation process
can be used as a basis for automatic creation (synthesis) of
adapted (user-oriented) text materials - such as
information-analytical reviews, electronic textbooks,
individual teaching materials [1].
Knowledge-based information search. High-precision
information search can be realized as
a two-stage process (resembling ore processing): in
the first stage (data search, “ore mining”),
conventional information retrieval draws
information (as fully as possible) from a number of sources
(that contain valuable elements); in the second stage
(knowledge search, “ore dressing”), the obtained results
are processed to extract only the important information
(knowledge, valuable elements).
Knowledge-based machine translation. In the
translation of a source text from one natural language to
another, the subject knowledge base (where the lexical
compatibility is fixed) can be used as a supporting
interlingua, which plays the role of an effective filter for
screening out the misplaced meanings of polysemous
words.
REFERENCES
[1] M.S. Shibut, V.S. Yakovishin, Method for creating customized
training materials based on the processing of electronic
information resources, Proceedings of the conference “Applied
Linguistics in Science and Education”, devoted to the memory of
Professor R.G. Piotrowski, St. Petersburg, 25-26 March 2010.
St. Petersburg: “Lem”, 2010, pp. 339-345 (in Russian).
[2] M.S. Shibut, V.S. Yakovishin, Recognition of grammatical
information in the process of linguistic knowledge, Topical
Problems of Theoretical and Applied Linguistics: Proceedings of
the International Scientific Conference devoted to the memory of
Professor R.G. Piotrowski, Minsk, 15-16 June 2010, Part 2.
Minsk, 2010, pp. 143-147 (in Russian).
[3] V.S. Yakovishin, Algebraic representation of syntagmatic
structures, Web Journal of Formal, Computational & Cognitive
Linguistics, Issue 11, 2009 [Electronic resource]. Mode of
access: http://fccl.ksu.ru/issue11.

 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxNandakishor Bhaurao Deshmukh
 
Bioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptxBioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptx023NiWayanAnggiSriWa
 

Recently uploaded (20)

Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensor
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptx
 
Pests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdfPests of Bengal gram_Identification_Dr.UPR.pdf
Pests of Bengal gram_Identification_Dr.UPR.pdf
 
Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)Functional group interconversions(oxidation reduction)
Functional group interconversions(oxidation reduction)
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
 
Four Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.pptFour Spheres of the Earth Presentation.ppt
Four Spheres of the Earth Presentation.ppt
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
 
Good agricultural practices 3rd year bpharm. herbal drug technology .pptx
Good agricultural practices 3rd year bpharm. herbal drug technology .pptxGood agricultural practices 3rd year bpharm. herbal drug technology .pptx
Good agricultural practices 3rd year bpharm. herbal drug technology .pptx
 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.ppt
 
Microteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical EngineeringMicroteaching on terms used in filtration .Pharmaceutical Engineering
Microteaching on terms used in filtration .Pharmaceutical Engineering
 
Volatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -IVolatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -I
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
 
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
 
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In DubaiDubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
Bioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptxBioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptx
 

Shibut i11 168 8ee9e0ed-cr

  • 1. The 6th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, 15-17 September 2011, Prague, Czech Republic
Selection and Aggregation of Sentences in the Knowledge Formation Process
M.S. Shibut, V.S. Yakovishin
The Academy of Public Administration under the aegis of the President of the Republic of Belarus, 17, Moskovskaya Str., 220007, Minsk, Republic of Belarus, m_shibut@pac.by, http://pac.by/en
Abstract: The presented method is based on the use of a special formal language. In this language, all sentence structures are expressed as sets of syntactic elements (syntagmes), which allows us to reduce the semantic identification of sentences (their selection and aggregation) to set-theoretical inclusion. Input text sentences are first transformed into set-theoretical form; the resulting formal sentence structures are then selected and united into growing knowledge representations. The integration of the sentences that have one and the same subject (a noun phrase contained in the user’s request) is considered a subject knowledge representation, and any collection of subject knowledge representations produced in the knowledge formation process is considered a user-oriented (highly tailored) description of the subject field.
Keywords: formal language, knowledge formation, semantics, subject field, subject-knowledge representation, syntax
I. INTRODUCTION
Knowledge formation is presented here as the process of selection and aggregation of input sentences. In this process, the text sentences are first transformed into the formal language and then integrated into the knowledge representation [1, 2].
The integration of the sentences that have one and the same subject will be considered a subject knowledge representation, and any collection of subject knowledge representations produced in the knowledge formation process will be considered a user-oriented (“highly tailored”) description of the subject field.
In every text sentence, the subject (usually characterized as “the something or someone that the sentence is about”, “the thing being talked about”) is expressed by a grammatically separated noun phrase. This noun phrase represents either the absolutely independent part of the sentence (the formal subject of the subject-predicate division) or the general determinative part [3], i.e. the attribute that relates to the whole sentence (the actual subject of the theme-rheme division, also known as topic-comment, which represents the “reflection of the speaker’s attitude towards what is said”).
Both the formal subject and the actual subject are always expressed by special grammatical means. In many languages, the formal subject occupies the main (first) position in the sentence; in addition, there are special syntactically neutral word forms for expressing the formal subject, namely the nominative case form of the noun (“casus indefinitus”). The actual subject (“theme” or “topic”) can either coincide with the formal subject or be marked by extra actualization means (such as special particles or inverted word order).
In the knowledge formation process, each required noun phrase takes on the formative subject role. Some of the noun phrases contained in the user’s request can be specially actualized (and thereby elevated to the subject rank) according to the information in the user’s request.
The knowledge formation method presented here is based on the use of a special formal language.
In the formal language, input text sentences are expressed in the set-theoretical (parenthesis-free, “discrete”) form as sets of their syntactic elements (syntagmes), which allows us to reduce the semantic identification of sentences to the standard set-theoretical relation of inclusion. The set-theoretical form and the integration of input sentences into the subject-knowledge representation are considered below.
II. FORMAL LANGUAGE
We proceed from the assumption that the formal language corresponds to a dibasic algebra with a set of words and a set of sentences. The set of words is a free semigroup over the basic alphabet. The set of sentences is a ring-like algebra with one unary operation and a pair of binary operations, coordination and determination: coordination (“addition”) is commutative and associative; determination (“multiplication”) is non-commutative, non-associative, and one-sided (left) distributive over coordination [3].
A set of special auxiliary symbols is needed to represent these algebraic operations. The formal language (L) is then defined in general as a set of sentences derived over a set of words by means of the algebraic operations, i.e.
L ⊆ L(A*∪Ω),
  • 2. where A* is the set of words over a basic alphabet A, and Ω is an auxiliary alphabet, i.e. a set of symbols of algebraic operations (A∩Ω=∅). The formal language thus uses both words and operation symbols.
Words and operation symbols serve as the syntactic basis for expressing two distinct types of semantic elements, namely lexical and grammatical (functional) meanings. Words express lexical meanings, and operation symbols express grammatical meanings: the symbol of the unary operation represents general grammatical meanings of sentences (modality, negation, question, exclamation, etc.), and the symbols of the binary operations represent functional meanings of sentence parts.
Thus, two description levels are clearly defined in the formal language: the syntactic level represents purely abstract (algebraic) sentence structures (in compliance with the distinctive features of the given algebraic operations), and the semantic level represents the sense interpretation of the algebraic structures.
The syntax corresponding to the algebraic operations mentioned above can be represented by the following set of rules:
S → ◊S | (X∇X) | (XΔS) | X,
X → X∇X | (XΔS),
where S (“sentence”) is the initial symbol; X (“word”) is the start symbol for word derivation; ◊ (“modality”), ∇ (“coordination”), and Δ (“determination”) are the operation symbols.
In the syntactic rules, the commutative and associative coordination is expressed as the atomic formula X∇X, in which the same symbol is used for both members in parenthesis-free notation. The parenthesized coordination in the rule S→(X∇X) is needed to express the left-distributive property of determination over a coordinated series in strings like XΔ(X∇X…). The non-commutative and non-associative determination is expressed by the obligatory use of different symbols and parentheses in the rules S→(XΔS), X→(XΔS).
The parentheses are necessary to indicate the “evaluation” order of the non-associative operation. The atomic formula (XΔS) therefore appears twice: in the case S→(XΔS), the operation Δ is “evaluated” from right to left, as in (XΔ(XΔS…)); in the case X→(XΔS), it is “evaluated” from left to right, as in ((…XΔS)ΔS).
The syntagmatic structures of sentences are now expressed in explicit form: every independent (head) member of the generated syntagmatic structure occupies the first position, and the dependent (defining, non-head) member is attached by means of the determination symbol in the second position. So the simplest syntagmatic structure, the syntagme, has the form of a two-word string like (X1ΔX2), where X1 (the first word) is the independent (head) member; ΔX2 (the second word) is the determinative (dependent) member; X1, X2 are words representing lexical meanings; and Δ (determination) is a symbol indicating a syntactic (functional) meaning of the sentence part.
Note that the words X1, X2 ∈ A* can take the empty string as a value, in which case the initial syntagme (X1ΔX2) is shortened either to a head member X1, representing a reduced syntagme, or to a determinative member ΔX2, representing an elliptical syntagme.
Thus, syntagmatic structures can be expressed in the standard algebraic form using parenthesis notation with a fixed order. As will be shown below, it is also possible to obtain an alternative, set-theoretical (parenthesis-free) form of syntagmatic structures. The knowledge representation proposed here uses the set-theoretical form, in which every syntagmatic structure is expressed as a set that contains all the given syntagmes as ordinary set elements.
III. SET-THEORETICAL FORM OF KNOWLEDGE REPRESENTATION
The set-theoretical form of knowledge representation is sufficient for the explicit expression of all the differences between syntagmatic structures, such as stepwise and collateral subordination, homogeneous parts, the absolutely independent part, and the common dependent part (determinative); see [3]. Below, identical syntagmatic structures are given in both the parenthesis and the set-theoretical notation. In both notations, all differences in syntactic relations are expressed explicitly.
Subordination types. Stepwise subordination is expressed as a syntagmatic structure in which the dependent member of the previous syntagme serves as the head member of the subsequent syntagme (as in the book of the new author):
(X1Δ1(X2Δ2X3)) = {X1Δ1X2, X2Δ2X3},
where X2 is both the dependent member of the syntagme X1Δ1X2 and the head member of the syntagme X2Δ2X3.
In collateral (parallel) subordination, several dependent members are connected with their common head member (as in the new book of the author):
((X1Δ1X2)Δ2X3) = {X1Δ1X2, X1Δ2X3},
where X1 is the head member and Δ1X2, Δ2X3 are the dependent members.
Homogeneous parts can be represented as several dependent members that are subordinated to a single head member. In that case the determination sign represents the common functional meaning of the sentence parts, whereas the coordination sign joins the identical parts together by taking the common functional meaning outside the brackets:
  • 3. (X1Δ(X2∇X3)) = {X1ΔX2, X1ΔX3},
where ∇ is the symbol of coordination and ΔX2, ΔX3 are homogeneous parts.
Absolutely independent part. The absolutely independent part is a single syntagmatically unmarked word that has no governing member and therefore carries no determination sign denoting the meaning of a sentence part. To simplify sentence identification, the absolutely independent part is emphasized in the set-theoretical form as a separate set element, e.g.:
((X1Δ1X2)Δ2X3) = {X1, X1Δ1X2, X1Δ2X3},
where the absolutely independent part X1 is repeated in the form of a reduced syntagme.
Determinative. It is also possible to express a common dependent member, called the determinative, i.e. the attribute that relates to the whole sentence. In the set-theoretical form, the determinative can be emphasized as a separate set element. In contrast to the emphasized absolutely independent part, the determinative contains a determination symbol, i.e. it is represented as an elliptical syntagme, e.g.:
((X1Δ1(X2Δ2X3))Δ3X4) = {Δ3X4, X1, X1Δ1X2, X2Δ2X3},
where Δ3X4 is a determinative (e.g., In the evening, he reads a book); compare with
(X1Δ1((X2Δ2X3)Δ3X4)) = {X1, X1Δ1X2, X2Δ2X3, X2Δ3X4},
where Δ3X4 is an adverbial modifier (as in He walked in the garden in the evening).
In the formal descriptions of sentences, lexical meanings can be expressed by ordinary word stems, while the meanings of the sentence parts require conventional signs, such as a (attribute), p (predicate), pt (predicate in the past indefinite tense), o (direct object), in (adverbial modifier of place, “inside”), etc. For clarity, it is convenient to separate the words and the conventional signs by punctuation marks (the underscore character and the dot).
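The correspondence between the parenthesis notation and the set-theoretical form can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the class names Word, Coord, Det and the functions heads and syntagmes are hypothetical and do not come from the paper, and each syntagme is written as a string head_sign.dependent with the underscore and dot described above. The sketch reproduces the stepwise, collateral, and homogeneous identities; it does not add the absolutely independent part or the determinative as separate set elements.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Word:
    """A word stem carrying a lexical meaning (hypothetical name)."""
    stem: str


@dataclass(frozen=True)
class Coord:
    """Coordination X∇X of homogeneous members (hypothetical name)."""
    members: Tuple["Expr", ...]


@dataclass(frozen=True)
class Det:
    """Determination (head Δsign dep) (hypothetical name)."""
    head: "Expr"
    sign: str
    dep: "Expr"


Expr = Union[Word, Coord, Det]


def heads(e: Expr) -> Tuple[str, ...]:
    """Return the head word(s) that an outer determination attaches to."""
    if isinstance(e, Word):
        return (e.stem,)
    if isinstance(e, Coord):
        return tuple(h for m in e.members for h in heads(m))
    return heads(e.head)  # the head of (X Δ S) is the head of X


def syntagmes(e: Expr) -> FrozenSet[str]:
    """Flatten a parenthesized structure into a set of two-word syntagmes."""
    if isinstance(e, Word):
        return frozenset()
    if isinstance(e, Coord):
        return frozenset().union(*(syntagmes(m) for m in e.members))
    pairs = {f"{h}_{e.sign}.{d}" for h in heads(e.head) for d in heads(e.dep)}
    return frozenset(pairs) | syntagmes(e.head) | syntagmes(e.dep)


x1, x2, x3 = Word("X1"), Word("X2"), Word("X3")
stepwise = Det(x1, "1", Det(x2, "2", x3))      # (X1Δ1(X2Δ2X3))
collateral = Det(Det(x1, "1", x2), "2", x3)    # ((X1Δ1X2)Δ2X3)
homogeneous = Det(x1, "1", Coord((x2, x3)))    # (X1Δ(X2∇X3))

for name, expr in [("stepwise", stepwise),
                   ("collateral", collateral),
                   ("homogeneous", homogeneous)]:
    print(name, sorted(syntagmes(expr)))
```

The design choice here is that an outer determination always attaches to the leftmost head of its first member, which is what distinguishes collateral from stepwise subordination in the identities above.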
A real sentence, e.g., The young man reads a book in the garden, can thus be expressed by the following set of syntagmes:
man_a.young ‘[the/a] young man’,
man_p.read ‘[the/a] man reads’,
read_o.book ‘[to] read [the/a] book’,
read_in.garden ‘[to] read in [the/a] garden’.
Thus, one can suppose that any sentence can be represented as a certain set of syntagmes. It is also assumed that any sentence contains at least one noun phrase (a noun, or a noun with several of its modifiers). The integration of sentences that have one and the same subject, i.e. a noun phrase contained in the user’s request information, can then be considered a subject knowledge description (representation).
A subject knowledge description can thus be defined as a set σ(N) of sentences S1, S2, … that contain the common subject, represented by a noun phrase N, i.e.
σ(N) = {S | S is a sentence and S ⊇ N};
and any collection of subject knowledge descriptions produced in the knowledge formation process is a (user-oriented) subject field description, i.e.
σ(N1, N2, …) = {σ(N1), σ(N2), …},
where σ(N1, N2, …) is a subject field in which the noun phrases N1, N2, … play the role of subjects.
Note that a subject can be represented by a noun phrase expressed either as the absolutely independent part, i.e. the formal subject of the “subject-predicate” division, or as the determinative, i.e. the actual subject of the “theme-rheme” division. In a text sentence, the actual subject can either coincide with the formal subject or be marked by extra actualization means. In the knowledge formation process, each noun phrase contained in the user’s request can take on the formative subject role, i.e. some of the noun phrases can be specially actualized (and thereby elevated to the subject role) by special means (e.g., by inverted word order) according to the information requested.
IV. RULES OF SUBJECT-KNOWLEDGE FORMATION
Subject knowledge formation is a growth process in which two formation rules must be applied: the rule of selection and the rule of aggregation of sentences.
The first rule selects the more intensionally informative sentences by eliminating every sentence that is less informative than some other sentence. Intensional superiority is defined through inclusion between sets: of two sentences S1 and S2, one must be eliminated if it is a subset of the other. That is, the rule can be formalized as follows:
{S1, S2} → S1, if S1 ⊇ S2 – selection rule.
The second rule integrates the already selected sentences into a collection. If S1, S2, … are sentences that have the same subject N (a noun phrase contained in the user’s request), they are united in a common subject knowledge description:
  • 4. {S1, S2, …} → σ(N) – aggregation rule.
The following examples illustrate the selection and aggregation processes. (For the sake of simplicity, the examples are restricted to simple sentences.) Let S1, S2, S3, S4, S5 be sentences expressed in terms of the formal language:
S1 = {man, man_a.young, man_p.read, read_o.book} ‘The young man reads a book’
S2 = {man, man_a.young, man_p.read, read_o.book, read_in.library} ‘The young man reads a book in the library’
S3 = {man, man_pt.walk, walk_in.park} ‘The man walked in the park’
S4 = {library, library_pPs.situate, situate_in.street, street_a.graceful} ‘The library is situated in a graceful street’
S5 = {man, man_a.young, man_pt.kick, kick_o.ball} ‘The young man kicked the ball’
where a, in, o are signs of the secondary sentence parts, and p, pt, pPs are signs of the different predicates (present, past indefinite, and present simple passive, respectively).
According to the selection rule, the first sentence must be eliminated because of the intensional superiority of the second sentence (S1 ⊆ S2). The sentences S2, S3, S4, S5 can be integrated in compliance with the aggregation rule. Let “man”, “young man”, and “library” be the subjects contained in the user’s request. Then, as a result of integration on the given subjects, the following three subject knowledge descriptions are obtained:
σ({man}) = {S2, S3, S5}
σ({man, man_a.young}) = {S2, S5}
σ({library}) = {S2, S4}
Note that the noun phrase “library” contained in S2 must be actualized (by inverted word order), i.e. sentence S2 is transformed into the actual division form: In the library, the young man reads a book.
V. PROSPECTS OF APPLICATION
Realization of the subject knowledge formation process (and the creation of various knowledge-based systems) makes it possible to obtain effective solutions to a number of pressing problems. In particular, the following are noteworthy.
Knowledge-based text adaptation.
The subject knowledge description produced in the formation process can be used as a basis for the automatic creation (synthesis) of adapted (user-oriented) text materials, such as information-analytical reviews, electronic textbooks, and individual teaching materials [1].
Knowledge-based information search. High-precision information search can be realized as a two-stage process that resembles ore processing: in the first stage (data search, “ore mining”), ordinary information retrieval draws information (as complete as possible) from a number of sources that contain valuable elements; in the second stage (knowledge search, “ore dressing”), the obtained results are processed to extract only the important information (knowledge, the valuable elements).
Knowledge-based machine translation. In translating a source text from one natural language to another, the subject knowledge base (where lexical compatibility is fixed) can be used as a supporting interlingua, which acts as an effective filter for screening out the misplaced meanings of polysemous words.
REFERENCES
[1] M.S. Shibut, V.S. Yakovishin, Method for creating customized training materials based on the processing of electronic information resources, Proceedings of the conference “Applied Linguistics in Science and Education”, dedicated to the memory of Professor R.G. Piotrowski, St. Petersburg, 25-26 March 2010. St. Petersburg, “Lem”, 2010, pp. 339-345 (in Russian).
[2] M.S. Shibut, V.S. Yakovishin, Recognition of grammatical information in the process of linguistic knowledge, Topical Problems of Theoretical and Applied Linguistics: Proceedings of the International Scientific Conference dedicated to the memory of Professor R.G. Piotrowski, Minsk, 15-16 June 2010, Part 2. Minsk, 2010, pp. 143-147 (in Russian).
[3] V.S. Yakovishin, Algebraic representation of syntagmatic structures, Web Journal of Formal, Computational & Cognitive Linguistics, Issue 11, 2009 [Electronic resource]. Mode of access: http://fccl.ksu.ru/issue11.
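The selection and aggregation rules of Section IV can be sketched with ordinary set operations. The Python fragment below is a minimal illustration, not the authors' implementation: the function names select and aggregate are hypothetical, sentences are modeled as frozensets of syntagme strings, and inclusion is tested strictly on set elements. Under that strict reading, S2 does not enter σ({library}), because S2 contains read_in.library but not the separate element library; the actualization step that the paper applies to S2 is not modeled here.

```python
from typing import Dict, FrozenSet, List

Sentence = FrozenSet[str]

# Sentences S1..S5 from the worked example, as sets of syntagmes.
SENTENCES: Dict[str, Sentence] = {
    "S1": frozenset({"man", "man_a.young", "man_p.read", "read_o.book"}),
    "S2": frozenset({"man", "man_a.young", "man_p.read", "read_o.book",
                     "read_in.library"}),
    "S3": frozenset({"man", "man_pt.walk", "walk_in.park"}),
    "S4": frozenset({"library", "library_pPs.situate", "situate_in.street",
                     "street_a.graceful"}),
    "S5": frozenset({"man", "man_a.young", "man_pt.kick", "kick_o.ball"}),
}


def select(sentences: Dict[str, Sentence]) -> Dict[str, Sentence]:
    """Selection rule: drop a sentence that is included in another one."""
    kept: Dict[str, Sentence] = {}
    for name, s in sentences.items():
        # Proper subset of some other sentence: less informative, eliminate.
        dominated = any(s < other for other in sentences.values())
        # Exact duplicate of a sentence already kept: keep only one copy.
        duplicate = any(s == k for k in kept.values())
        if not dominated and not duplicate:
            kept[name] = s
    return kept


def aggregate(sentences: Dict[str, Sentence], subject: Sentence) -> List[str]:
    """Aggregation rule: names of the sentences S with S ⊇ N."""
    return sorted(name for name, s in sentences.items() if s >= subject)


selected = select(SENTENCES)
sigma_man = aggregate(selected, frozenset({"man"}))
sigma_young_man = aggregate(selected, frozenset({"man", "man_a.young"}))
sigma_library = aggregate(selected, frozenset({"library"}))

print(sorted(selected))
print(sigma_man, sigma_young_man, sigma_library)
```

Representing each sentence as a frozenset makes both rules one-line comparisons (`<` for proper inclusion, `>=` for containing the subject), which mirrors the reduction of semantic identification to set-theoretical inclusion.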