SlideShare une entreprise Scribd logo
1  sur  4
Télécharger pour lire hors ligne
International Journal of Computer Theory and Engineering, Vol. 2, No. 1 February, 2010
                                                                 1793-8201



     Context Based Meaning Extraction by Means of
                    Markov Logic
                                                             Imran Sarwar Bajwa


                                                                                models are knowledge-base model, stochastic logic,
   Abstract—Understanding meanings and semantics of a                            probabilistic relational model, and relational Markov model.
speech or natural language is a complicated problem. This                        A probabilistic model can also be represented as a Markov
problem becomes more vital and classy when meanings with
                                                                                 network including Bayesian networks, decision trees, logistic
respect to context, have to be extracted. This particular research
area has been typically point of meditation for the last few                     regression, etc [4]. First order logic (FOL) has been used as a
decades and many various techniques have been used to address                    tool of predicate logic. But first order logic deals only with
this problem. An automated system is required that may be able                   linear data. Natural languages tend to be non-linear. The
to analyze and understand a few paragraphs in English                            feature of non-linearity in natural languages cannot be
language. In this research, Markov Logic has been incorporated
                                                                                 knobbed by conventional first order logic. Markov Logics
to analyze and understand the natural language script given by
the user. The designed system formulates the standard speech                     (ML) is simple extension to first-order logic. In Markov Logic,
language rules with certain weights. These meticulous weights                    each formula has an additional weight fixed with it [5], in
for each rule ultimately support in deciding the particular                      variation of first order logic. In ML, a formula's associated
meaning of a phrase and sentence. The designed system                            weight reflects the strength of a constraint. The higher weight
provides an easy and consistent way to figure out speech
                                                                                 of a formula represents the greater the difference in log
language context and produce respective meanings of the text.
                                                                                 probability and it also satisfies the formula. A first-order KB
 Index Terms—Text Processing, Markov Logic,                                      is a set of formulas in first-order logic, constructed from
Meaning Extraction, Language Engineering, Context                                predicates using logical connectives and quantifiers.
Awareness.                                                                          In this article, the section 2 presents review of related work.
                                                                                 Section 3 presents the architecture of designed system.
                                                                                 Section 4 describes the used methodology based on
                         I. INTRODUCTION
                                                                                 compositely using rule based approach for extracting the
Grammar analysis and meanings extraction are primary steps                       required information from the given text and then employing
involved in almost every natural language processing based                       Markov Logics for further text understanding. The results and
systems. This scenario becomes more significant and critical                     analysis has been presented din the last section.
when the meanings of a piece of text have to be extracted in a
particular context. Context based meaning extraction is                                                II. RELATED WORK
important for many NLP (Natural Language Processing) based
                                                                                   In this section a short overview over existing approaches for
applications i.e. Automated generation of UML(Unified
                                                                                 processing and computing the natural language text has been
Modeling Language) diagrams, query processing, web mining,
                                                                                 presented.
web template designing, user interface designing, etc. Modern
information systems tend to base on agile and aspect oriented                    A. General Text Processing
mechanisms. The aspect oriented and component oriented                              The major research contributions in the area of
software engineering are becoming quite popular in software                      computational linguistics have brought into being for last
designing communities. These newly emerging methodologies                        many decades, Major contributors are Maron, M. E. and
can be assisted by providing natural languages’ based used                       Kuhns, J. L (1960) [6], Noam Chomsky (1965) [5], Chow, C.,
interfaces. The research area of improved and effective                          & Liu, C (1968) [7]. They presented the methods of
interfaces for modern software paradigms is one of the major                     information retrieval from natural languages. Other
and modern research interests. NLP based context aware user                      contributions were analysis and understanding of the natural
interfaces can be effective in this regard. This research                        languages, but still there was lot of effort required for better
highlights the prospects of assimilation of Markov Logics in                     understanding and analysis.
designing of various business and technical software.
                                                                                 B. Meaning Extraction
   For analyzing natural language text, many statistical and
non statistical models have been presented [3]. Some of the                         Natural language text can be helpful in multiple ways to
                                                                                 make computer applications more convenient and easy to use.
    Imran Sarwar Bajwa is with Department of Computer Science and IT, The        A CASE tool named REBUILDER UML [1] integrates a
Islamia University of Bahawalpur, Pakistan, Ph: +92 62 925 5466,                 module for translation of natural language text into an UML
imran.sarwar@iub.edu.pk
                                                                                 class diagram. This module uses an approach based on
                                                                            35
International Journal of Computer Theory and Engineering, Vol. 2, No. 1 February, 2010
                                                          1793-8201
Case-Based Reasoning and Natural Language Processing. For               presented in the Figure 1.0. First component provides support
information extraction and processing at semantic level, Pedro          for phonetics and phonology related issues, if required. This
Domingos [11] also presented a technique Markov Logics.                 component further provides recognition of variations in words
Markov logics are simple extension to FOL. There are many               and this step is formally called morphology. Afterwards,
applications of Markov logics; link prediction, collective              syntax of the text is understood which requires the knowledge
classification, entity resolution, social network analysis, etc.        needed to order and group words together.
Alchemy system [12] facilitates implement the inference and                The second component provides support for semantical
learning algorithms. Alchemy is based on a declarative                  analysis of POS (parts of speech) tagged text. At this level, the
programming language that is similar to Prolog. Markov logic            Markov logic has been used to understand meanings of the
has ability to handle uncertainty and learn form the training           sentences [12]. To have a more compound understanding of the
data. We also have presented a rule based system [13] that is           sentence, pragmatics are required, where the sentence is
able to extract desired information from the natural language           analyzed according to the context. Discourse analysis is the
text. The system understands context and then extracts                  last part of NLP, where the semantic analysis of the linguistic
respective information. This model is further enhanced in this          structure is performed beyond the sentence level [13].
research to capture the information from NL text that is further
used for semantic understating. The implementation details
have been provided in section 4.
C. Information Retrieval
   The second category of prior studies concentrates on
contexts consisting of a single word only, typically modeling
the combination of a predicate p and an argument a. Kintsch
(2001) uses vector representations of p and a to identify the set
of words that are similar to both p and a. In the nineties, major
contributions were turned out by Krovetz, R., & Croft, W. B
(1992) [10], Salton, G., & McGill, M (1995) [9], Losee, R. M
(1998) [8], These authors worked for lexical ambiguity and
information retrieval [8], probabilistic indexing [9], data bases
handling [10] and so many other related areas.
   Conventional methods for natural language processing use
rule based techniques besides Hidden Markov Method (HMM)
                                                                                Fig.1   Employing Markov Logic for Meaning Extraction
[], Neural Networks (NN) [], probabilistic theory [] and
statistical methods []. Agents are another way to address this             The third and last component is the knowledge base of the
problem [8]. Scientists are used to formerly employ rule-based/         system that consists of English word libraries; data type
statistical algorithm with a text data base i.e. WordNet to             taxonomies and parse trees. The designed system has ability to
identify the data type of a text piece and then understand and          understand English language contents after reading the text
extract the desired information from the given piece of text.           scenario provided by the user. This system is based on a
Parts of speech tagging is a typical phase in this procedure in         layered design and common layers are text input acquisition
which all basic elements of the language grammar are                    layer, parts of speech tagging layer, Markov Logic based
extracted as verbs, nouns, adjectives, etc. Detailed description        Network layer, pattern analysis layer of speech language
has been provided in experiments section.                               contents and finally meaning extraction layer. The complete
                                                                        design is shown in Figure 2.0.
           III. DESIGNED SYSTEM ARCHITECTURE
                                                                        A. Getting Text Input
    In this research paper, a newly designed Markov Logic
based system has been presented that is able to read the                   Typically, the natural language text has no formal format
English language text and extract its meanings after analyzing          and accordance. To make the take more appropriate for
and extracting related information. Various linguistic phases           processing, input text is acquired. User can provide the input
are common for processing natural language i.e. lexical and             scenario in paragraphs of the text. Major issues in this phase
semantic analysis, pragmatic analysis Text parsing in NLP can           are to read the input text in the form characters and generate
be a detailed or trivial process. Some applications e.g. text           the words by concatenating them. Another major issue is to
summarization and text generation needs detailed text                   remove the text noise and abnormalities from text. In general,
processing. In detailed process every part of each sentence is          this unit is the implementation of the lexical phase of text
analyzed. Some applications just need trivial processing e.g.           processing.
text mining and web mining. In trivial processing of text, only         B. POS Tagging of Text
certain passages or phrases within a sentence are processed               Part of speech tagging is the process of assigning a
[11]
    .                                                                   part-of-speech or other lexical speech marker to each word in a
    An architecture based on three component modules is                 corpus. Rule-based taggers are the earliest taggers and they


                                                                   36
International Journal of Computer Theory and Engineering, Vol. 2, No. 1 February, 2010
                                                            1793-8201
used hand-written disambiguation rules to assign a single                 uses MLN to identify the relations and conclude appropriate
part-of-speech to each word [12]. This module has also used a             meanings on the basis of the extractions. Distinct
rule based algorithm to categorize tokens into various classes            representations are ultimately created and assigned to various
as verbs, nouns, pronouns, adjectives, prepositions,                      linguistic inputs. The process of verifying the meanings of a
conjunctions, etc.                                                        sentence is performed by matching the input representations
                                                                          with the representation in the knowledge base [14].
C. Markov Logic based Network
   In Markov Logics [], every FOL formula has an associated                                  IV. RESULTS AND DISCUSSION
weight that represents the strength of the constraint. Markov
                                                                             To validate the presented model, four classes have defined:
logic allows contradiction between various formulas and thee
                                                                          subject class, verb class, object class, adverb class. These
contradictions are resolved by comparing the evidence weights
                                                                          classes are defined on the basis of classical division of a
of multiple constraints. A Markov Logic Network [] (MLN)
                                                                          typical English sentence.
can be viewed as a template for constructing a Markov network.
Markov logic is used in this layer to define inference rules                Student          is reading          a book         in library.
similar to the used in the Markov networks and Bayesian
networks. These inference rules are used to find most probable             Subject Part       Verb Part         Object Part     Adverb Part
meanings of the words given some evidence. MLN weights                       First of all it was observed that at which percentage the
another ability that is learning. MLN weights can learn                   words are accurately classified. For this purpose, two types of
through the maximizing of the relational database.                        data sets have been used; first data set of text for training of the
                                                                          designed system and other data set with text for testing. The
                                                                          rules of Markov Logic have been trained with the training data
                                                                          set. The weights of the rules were randomly selected and
                                                                          during training these weights were adjusted to get the optimal
                                                                          output.
                                                                             To test the accuracy of the classified text, the testing data
                                                                          set of three types, on the basis if difficulty level, are selected:
                                                                          plain, average, and compound. Plain data set is very simple
                                                                          without any phrasal and idiomatic constraints. Average data
                                                                          set is containing simple sentences but with conjunctions,
                                                                          interjections, etc. Compound data sets are reasonably complex
                                                                          than other two categories due to inclusion of phrases and even
                                                                          idioms. 10 sentences of each category were tested and
                                                                          following results were received.
                                                                                          Simple      Average        Compound       Average
                                                                           Subject          98%           96%             89%        94 %
                                                                           Verb             97%           94%             86%        92 %
                                                                           Object           95%           93%             82%        90 %
                                                                           Adverb           92%           89%             78%        86 %
          Fig.2     Markov Logic based Meaning Extraction
                                                                                          TABLEI. STATISTICS SHOWING RESULTS

D. Pattern Analysis                                                         We interpret these results as encouraging evidence for the
   In linguistic terms, verbs often specify actions, and noun             usefulness of Markov logic for judging substitutability in
phrases the objects those participate in a particular action [14].        context. Overall accuracy for a complete sentence becomes
Each noun phrase specifies that how an object participates in             90.5%. Following graph shows the results.
the action. This module typically provides support to identify
distinct objects and their respective attributes. Nouns are
symbolized as objects and their associated characteristics are
termed as attributes. In pattern analysis phase, irrelative rules
and patterns are also eliminated for efficient pattern discovery
process [13].
E. Meaning Extraction
  Prepositions can help out in developing relations among
objects identified in the previous module. This module finally

                                                                     37
International Journal of Computer Theory and Engineering, Vol. 2, No. 1 February, 2010
                                                                     1793-8201
                                                                                     [7]    Maron, M. E. & Kuhns, J. L. (1997) “On relevance, probabilistic indexing,
      100%                                                                                  and information retrieval” Journal of the ACM, 1997, volume 7, pp:
                                                                                            216–244.
        80%                                                                          [8]    Chomsky, N. (1965) “Aspects of the Theory of Syntax. MIT Press,
                                                                    Subject                 Cambridge, Mass, 1965.
        60%                                                                          [9]    Chow, C., & Liu, C. (1968) “Approximating Discrete Probability
                                                                    Verb                    Distributions with Dependence Trees”. IEEE Transactions on
        40%                                                         Object                  Information Theory, 1968, IT-14(3), pp. 462–467.
                                                                    Adverb
                                                                                     [10]   Losee, R. M. (1988) “Parameter estimation for probabilistic document
        20%                                                                                 retrieval models”. Journal of the American Society for Information
                                                                                            Science, 39(1), 1988, pp. 8–16.
         0%                                                                          [11]   Salton, G., & McGill, M. (1995) “Introduction to Modern Information
                                                                                            Retrieval” McGraw-Hill, New York., 1995
                 Simple      Average Compound Average
                                                                                     [12]   Krovetz, R., & Croft, W. B. (1992) “Lexical ambiguity and information
                                                                                            retrieval”, ACM Transactions on Information Systems, 1992, pp.
                                                                                            115–141
                     Fig.3       Graph showing results                               [13]   Pustejovsky J, Castañ J., Zhang J, Kotecki M, Cochran B. (2002)
                                                                                                                      o
                                                                                            “Robust relational parsing over biomedical literature: Extracting inhibit
                                                                                            relations”. In proc. of Pacific Symposium of Bio-computing, pp. 362-373
                                                                                     [14]   Nerbonne, John (2003): "Natural language processing in
                             V. CONCLUSION                                                  computer-assisted language learning". In: Mitkov, R. (ed.): The Oxford
   The designed system for speech language context                                          Handbook of Computational Linguistics. Oxford, pp. 670-698.
                                                                                     [15]   D. McCarthy, R. Navigli. (2007), “SemEval-2007 Task 10: English
understanding using a rule based algorithm is a robust                                      Lexical Substitution Task” In Proceedings of SemEval, pp. 48–53.
framework for inferring the appropriate meanings from a given                        [16]   Voutilainen, A. (1995), “Constraint Grammar: A language Independent
text. The accomplished research is related to the                                           system for parsing Unrestricted Text”, Morphological disambiguation, pp.
                                                                                            165-284
understanding of the human languages. Human being needs                              [17]   Jurafsky, Daniel, and James H. Martin. (2000). “Speech and Language
specific linguistic knowledge generating and understanding                                  Processing: An Introduction to Natural Language Processing, Speech
                                                                                            Recognition, and Computational Linguistics”. Prentice-Hall, First
speech language contents. It is difficult for computers to                                  Edition.
perform this task. The speech language context understanding                         [18]   WangBin, LiuZhijing, (2003), “Web Mining Research”, Proceedings of
using a rule based framework has ability to read user provided                              Fifth International Conference on Computational Intelligence and
                                                                                            Multimedia Applications, 2003. CHINA. pp. 84 - 89
text, extract related information and ultimately give meanings
to the extracted contents. The designed system very effective
and have high accuracy up to 90 %. The designed system
depicts the meanings of the given sentence or paragraph
efficiently. An elegant graphical user interface has also been
provided to the user for entering the Input scenario in a proper
way and generating speech language contents.

                          ACKNOWLEDGMENT
   This research is part of the research work funded by HEC,
Pakistan and was conducted in the department of Computer Science
& IT, The Islamia University of Bahawalpur, Pakistan. I want to
thank Prof. Dr. Irfan Hyder (PAF-KIET) and Prof. Dr. Abass
Choudhary (CEME-NUST) for their support and guidance during
this research.

                              REFERENCES
[1]   Katrin Erk, Sebastian Pad’o (2008) “A Structured Vector Space Model
      for Word Meaning in Context”, In Proceedings of EMNLP (2008)
[2]   Imran Sarwar Bajwa, M. Abbas Choudhary, (2006) “A Rule Based
      System for Speech Language Context Understanding” Journal of
      Donghua University, (English Edition) 23 (6), pp. 39-42.
[3]   J. Henderson, (2003) “Neural network probability estimation for broad
      coverage parsing,” in Proc. of 10th conference of the European Chapter
      of the Association for Computational Linguistics (EACL 2003), pp.
      131–138.
[4]   S. Kok and P. Domingos, “Learning the structure of Markov Logic
      Networks” In Proc. ICML-05, Bonn, Germany, 2005. ACM Press, pp.
      441–448.
[5]   S. Kok, P. Singla, M. Richardson, and P. Domingos “The Alchemy
      system for statistical relational AI”, Technical report, Department of
      Computer       Science        and     Engineering,      WA,      2005.
      http://www.-cs.washington.edu/ai/alchemy .
[6]   Pedro Domingos, Stanley Kok, Hoifung Poon, Matthew Richardson,
      Parag Singla, “Unifying Logical and Statistical AI”, Proceedings of the
      Twenty-First National Conference on Artificial Intelligence (2006), pp.
      2-7




                                                                                38

Contenu connexe

Tendances

EXTRACTIVE SUMMARIZATION WITH VERY DEEP PRETRAINED LANGUAGE MODEL
EXTRACTIVE SUMMARIZATION WITH VERY DEEP PRETRAINED LANGUAGE MODELEXTRACTIVE SUMMARIZATION WITH VERY DEEP PRETRAINED LANGUAGE MODEL
EXTRACTIVE SUMMARIZATION WITH VERY DEEP PRETRAINED LANGUAGE MODELijaia
 
NLP_A Chat-Bot_answering_queries_of_UT-Dallas_Students
NLP_A Chat-Bot_answering_queries_of_UT-Dallas_StudentsNLP_A Chat-Bot_answering_queries_of_UT-Dallas_Students
NLP_A Chat-Bot_answering_queries_of_UT-Dallas_StudentsHimanshu kandwal
 
A Dialogue System for Telugu, a Resource-Poor Language
A Dialogue System for Telugu, a Resource-Poor LanguageA Dialogue System for Telugu, a Resource-Poor Language
A Dialogue System for Telugu, a Resource-Poor LanguageSravanthi Mullapudi
 
Extractive Summarization with Very Deep Pretrained Language Model
Extractive Summarization with Very Deep Pretrained Language ModelExtractive Summarization with Very Deep Pretrained Language Model
Extractive Summarization with Very Deep Pretrained Language Modelgerogepatton
 
SYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGE
SYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGESYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGE
SYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGEijnlc
 
Santosh Sahu_MTech_CSE
Santosh Sahu_MTech_CSESantosh Sahu_MTech_CSE
Santosh Sahu_MTech_CSESantosh Sahu
 
IRJET- Short-Text Semantic Similarity using Glove Word Embedding
IRJET- Short-Text Semantic Similarity using Glove Word EmbeddingIRJET- Short-Text Semantic Similarity using Glove Word Embedding
IRJET- Short-Text Semantic Similarity using Glove Word EmbeddingIRJET Journal
 
Arabic named entity recognition using deep learning approach
Arabic named entity recognition using deep learning approachArabic named entity recognition using deep learning approach
Arabic named entity recognition using deep learning approachIJECEIAES
 
ENSEMBLE MODEL FOR CHUNKING
ENSEMBLE MODEL FOR CHUNKINGENSEMBLE MODEL FOR CHUNKING
ENSEMBLE MODEL FOR CHUNKINGijasuc
 
IRJET - Pseudocode to Python Translation using Machine Learning
IRJET - Pseudocode to Python Translation using Machine LearningIRJET - Pseudocode to Python Translation using Machine Learning
IRJET - Pseudocode to Python Translation using Machine LearningIRJET Journal
 
AUTOMATED SQL QUERY GENERATOR BY UNDERSTANDING A NATURAL LANGUAGE STATEMENT
AUTOMATED SQL QUERY GENERATOR BY UNDERSTANDING A NATURAL LANGUAGE STATEMENTAUTOMATED SQL QUERY GENERATOR BY UNDERSTANDING A NATURAL LANGUAGE STATEMENT
AUTOMATED SQL QUERY GENERATOR BY UNDERSTANDING A NATURAL LANGUAGE STATEMENTijnlc
 
My Curriculum Vitae
My Curriculum VitaeMy Curriculum Vitae
My Curriculum Vitaeadil raja
 
Language Identifier for Languages of Pakistan Including Arabic and Persian
Language Identifier for Languages of Pakistan Including Arabic and PersianLanguage Identifier for Languages of Pakistan Including Arabic and Persian
Language Identifier for Languages of Pakistan Including Arabic and PersianWaqas Tariq
 

Tendances (16)

EXTRACTIVE SUMMARIZATION WITH VERY DEEP PRETRAINED LANGUAGE MODEL
EXTRACTIVE SUMMARIZATION WITH VERY DEEP PRETRAINED LANGUAGE MODELEXTRACTIVE SUMMARIZATION WITH VERY DEEP PRETRAINED LANGUAGE MODEL
EXTRACTIVE SUMMARIZATION WITH VERY DEEP PRETRAINED LANGUAGE MODEL
 
NLP_A Chat-Bot_answering_queries_of_UT-Dallas_Students
NLP_A Chat-Bot_answering_queries_of_UT-Dallas_StudentsNLP_A Chat-Bot_answering_queries_of_UT-Dallas_Students
NLP_A Chat-Bot_answering_queries_of_UT-Dallas_Students
 
A Dialogue System for Telugu, a Resource-Poor Language
A Dialogue System for Telugu, a Resource-Poor LanguageA Dialogue System for Telugu, a Resource-Poor Language
A Dialogue System for Telugu, a Resource-Poor Language
 
Extractive Summarization with Very Deep Pretrained Language Model
Extractive Summarization with Very Deep Pretrained Language ModelExtractive Summarization with Very Deep Pretrained Language Model
Extractive Summarization with Very Deep Pretrained Language Model
 
SYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGE
SYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGESYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGE
SYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGE
 
Santosh Sahu_MTech_CSE
Santosh Sahu_MTech_CSESantosh Sahu_MTech_CSE
Santosh Sahu_MTech_CSE
 
IRJET- Short-Text Semantic Similarity using Glove Word Embedding
IRJET- Short-Text Semantic Similarity using Glove Word EmbeddingIRJET- Short-Text Semantic Similarity using Glove Word Embedding
IRJET- Short-Text Semantic Similarity using Glove Word Embedding
 
40120130406014 2
40120130406014 240120130406014 2
40120130406014 2
 
Arabic named entity recognition using deep learning approach
Arabic named entity recognition using deep learning approachArabic named entity recognition using deep learning approach
Arabic named entity recognition using deep learning approach
 
ENSEMBLE MODEL FOR CHUNKING
ENSEMBLE MODEL FOR CHUNKINGENSEMBLE MODEL FOR CHUNKING
ENSEMBLE MODEL FOR CHUNKING
 
IRJET - Pseudocode to Python Translation using Machine Learning
IRJET - Pseudocode to Python Translation using Machine LearningIRJET - Pseudocode to Python Translation using Machine Learning
IRJET - Pseudocode to Python Translation using Machine Learning
 
AUTOMATED SQL QUERY GENERATOR BY UNDERSTANDING A NATURAL LANGUAGE STATEMENT
AUTOMATED SQL QUERY GENERATOR BY UNDERSTANDING A NATURAL LANGUAGE STATEMENTAUTOMATED SQL QUERY GENERATOR BY UNDERSTANDING A NATURAL LANGUAGE STATEMENT
AUTOMATED SQL QUERY GENERATOR BY UNDERSTANDING A NATURAL LANGUAGE STATEMENT
 
Resume_com
Resume_comResume_com
Resume_com
 
[IJET-V2I1P13] Authors:Shilpa More, Gagandeep .S. Dhir , Deepak Daiwadney and...
[IJET-V2I1P13] Authors:Shilpa More, Gagandeep .S. Dhir , Deepak Daiwadney and...[IJET-V2I1P13] Authors:Shilpa More, Gagandeep .S. Dhir , Deepak Daiwadney and...
[IJET-V2I1P13] Authors:Shilpa More, Gagandeep .S. Dhir , Deepak Daiwadney and...
 
My Curriculum Vitae
My Curriculum VitaeMy Curriculum Vitae
My Curriculum Vitae
 
Language Identifier for Languages of Pakistan Including Arabic and Persian
Language Identifier for Languages of Pakistan Including Arabic and PersianLanguage Identifier for Languages of Pakistan Including Arabic and Persian
Language Identifier for Languages of Pakistan Including Arabic and Persian
 

Similaire à Meaning Extraction - IJCTE 2(1)

NL Context Understanding 23(6)
NL Context Understanding 23(6)NL Context Understanding 23(6)
NL Context Understanding 23(6)IT Industry
 
Requirement Analysis - ijcee 2(3)
Requirement Analysis - ijcee 2(3)Requirement Analysis - ijcee 2(3)
Requirement Analysis - ijcee 2(3)IT Industry
 
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUECOMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUEJournal For Research
 
Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...ijnlc
 
SENTIMENT ANALYSIS IN MYANMAR LANGUAGE USING CONVOLUTIONAL LSTM NEURAL NETWORK
SENTIMENT ANALYSIS IN MYANMAR LANGUAGE USING CONVOLUTIONAL LSTM NEURAL NETWORKSENTIMENT ANALYSIS IN MYANMAR LANGUAGE USING CONVOLUTIONAL LSTM NEURAL NETWORK
SENTIMENT ANALYSIS IN MYANMAR LANGUAGE USING CONVOLUTIONAL LSTM NEURAL NETWORKijnlc
 
Turkish language modeling using BERT
Turkish language modeling using BERTTurkish language modeling using BERT
Turkish language modeling using BERTAbdurrahimDerric
 
A Combined Approach to Part-of-Speech Tagging Using Features Extraction and H...
A Combined Approach to Part-of-Speech Tagging Using Features Extraction and H...A Combined Approach to Part-of-Speech Tagging Using Features Extraction and H...
A Combined Approach to Part-of-Speech Tagging Using Features Extraction and H...Editor IJARCET
 
Possibility of interdisciplinary research software engineering andnatural lan...
Possibility of interdisciplinary research software engineering andnatural lan...Possibility of interdisciplinary research software engineering andnatural lan...
Possibility of interdisciplinary research software engineering andnatural lan...Nakul Sharma
 
Association Rule Mining Based Extraction of Semantic Relations Using Markov L...
Association Rule Mining Based Extraction of Semantic Relations Using Markov L...Association Rule Mining Based Extraction of Semantic Relations Using Markov L...
Association Rule Mining Based Extraction of Semantic Relations Using Markov L...IJwest
 
Association Rule Mining Based Extraction of Semantic Relations Using Markov ...
Association Rule Mining Based Extraction of  Semantic Relations Using Markov ...Association Rule Mining Based Extraction of  Semantic Relations Using Markov ...
Association Rule Mining Based Extraction of Semantic Relations Using Markov ...dannyijwest
 
IRJET- Survey on Generating Suggestions for Erroneous Part in a Sentence
IRJET- Survey on Generating Suggestions for Erroneous Part in a SentenceIRJET- Survey on Generating Suggestions for Erroneous Part in a Sentence
IRJET- Survey on Generating Suggestions for Erroneous Part in a SentenceIRJET Journal
 
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING cscpconf
 
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categor...
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categor...A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categor...
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categor...Hiroshi Ono
 
Domain Specific Terminology Extraction (ICICT 2006)
Domain Specific Terminology Extraction (ICICT 2006)Domain Specific Terminology Extraction (ICICT 2006)
Domain Specific Terminology Extraction (ICICT 2006)IT Industry
 
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...IRJET Journal
 
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANSCONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANSijseajournal
 
Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C...
Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C...Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C...
Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C...Waqas Tariq
 
French machine reading for question answering
French machine reading for question answeringFrench machine reading for question answering
French machine reading for question answeringAli Kabbadj
 

Similaire à Meaning Extraction - IJCTE 2(1) (20)

NL Context Understanding 23(6)
NL Context Understanding 23(6)NL Context Understanding 23(6)
NL Context Understanding 23(6)
 
Requirement Analysis - ijcee 2(3)
Requirement Analysis - ijcee 2(3)Requirement Analysis - ijcee 2(3)
Requirement Analysis - ijcee 2(3)
 
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUECOMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE
 
Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...Taxonomy extraction from automotive natural language requirements using unsup...
Taxonomy extraction from automotive natural language requirements using unsup...
 
SENTIMENT ANALYSIS IN MYANMAR LANGUAGE USING CONVOLUTIONAL LSTM NEURAL NETWORK
SENTIMENT ANALYSIS IN MYANMAR LANGUAGE USING CONVOLUTIONAL LSTM NEURAL NETWORKSENTIMENT ANALYSIS IN MYANMAR LANGUAGE USING CONVOLUTIONAL LSTM NEURAL NETWORK
SENTIMENT ANALYSIS IN MYANMAR LANGUAGE USING CONVOLUTIONAL LSTM NEURAL NETWORK
 
Turkish language modeling using BERT
Turkish language modeling using BERTTurkish language modeling using BERT
Turkish language modeling using BERT
 
A Combined Approach to Part-of-Speech Tagging Using Features Extraction and H...
A Combined Approach to Part-of-Speech Tagging Using Features Extraction and H...A Combined Approach to Part-of-Speech Tagging Using Features Extraction and H...
A Combined Approach to Part-of-Speech Tagging Using Features Extraction and H...
 
Possibility of interdisciplinary research software engineering andnatural lan...
Possibility of interdisciplinary research software engineering andnatural lan...Possibility of interdisciplinary research software engineering andnatural lan...
Possibility of interdisciplinary research software engineering andnatural lan...
 
Association Rule Mining Based Extraction of Semantic Relations Using Markov L...
Association Rule Mining Based Extraction of Semantic Relations Using Markov L...Association Rule Mining Based Extraction of Semantic Relations Using Markov L...
Association Rule Mining Based Extraction of Semantic Relations Using Markov L...
 
10.1.1.35.8376
10.1.1.35.837610.1.1.35.8376
10.1.1.35.8376
 
Association Rule Mining Based Extraction of Semantic Relations Using Markov ...
Association Rule Mining Based Extraction of  Semantic Relations Using Markov ...Association Rule Mining Based Extraction of  Semantic Relations Using Markov ...
Association Rule Mining Based Extraction of Semantic Relations Using Markov ...
 
IRJET- Survey on Generating Suggestions for Erroneous Part in a Sentence
IRJET- Survey on Generating Suggestions for Erroneous Part in a SentenceIRJET- Survey on Generating Suggestions for Erroneous Part in a Sentence
IRJET- Survey on Generating Suggestions for Erroneous Part in a Sentence
 
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING
 
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categor...
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categor...A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categor...
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categor...
 
Domain Specific Terminology Extraction (ICICT 2006)
Domain Specific Terminology Extraction (ICICT 2006)Domain Specific Terminology Extraction (ICICT 2006)
Domain Specific Terminology Extraction (ICICT 2006)
 
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
 
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANSCONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS
 
Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C...
Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C...Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C...
Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C...
 
Modest Formalization of Software Design Patterns
Modest Formalization of Software Design PatternsModest Formalization of Software Design Patterns
Modest Formalization of Software Design Patterns
 
French machine reading for question answering
French machine reading for question answeringFrench machine reading for question answering
French machine reading for question answering
 

Plus de IT Industry

The News Today 24 (https://thenewstoday24.com/)
The News Today 24 (https://thenewstoday24.com/)The News Today 24 (https://thenewstoday24.com/)
The News Today 24 (https://thenewstoday24.com/)IT Industry
 
NL Interface for Database - EJSR 20(4)
NL Interface for Database - EJSR 20(4)NL Interface for Database - EJSR 20(4)
NL Interface for Database - EJSR 20(4)IT Industry
 
Virtual Telemedicine (IJITWE 5(1))
Virtual Telemedicine (IJITWE 5(1))Virtual Telemedicine (IJITWE 5(1))
Virtual Telemedicine (IJITWE 5(1))IT Industry
 
NL to OCL Transformation (EDOC 2010)
NL to OCL Transformation (EDOC 2010)NL to OCL Transformation (EDOC 2010)
NL to OCL Transformation (EDOC 2010)IT Industry
 
BPM & SOA for Small Business Enterprises (ICIME 2009)
BPM & SOA for Small Business Enterprises (ICIME 2009)BPM & SOA for Small Business Enterprises (ICIME 2009)
BPM & SOA for Small Business Enterprises (ICIME 2009)IT Industry
 
Web Layout Mining - JECS 29(2)
Web Layout Mining - JECS 29(2)Web Layout Mining - JECS 29(2)
Web Layout Mining - JECS 29(2)IT Industry
 
UCD Generator (ICIET 2007)
UCD Generator (ICIET 2007)UCD Generator (ICIET 2007)
UCD Generator (ICIET 2007)IT Industry
 
Web User Forms (ICOMMS 2006)
Web User Forms (ICOMMS 2006)Web User Forms (ICOMMS 2006)
Web User Forms (ICOMMS 2006)IT Industry
 
UML Generator (NCC18)
UML Generator (NCC18)UML Generator (NCC18)
UML Generator (NCC18)IT Industry
 
Image Classification (icast 2006)
Image Classification  (icast 2006)Image Classification  (icast 2006)
Image Classification (icast 2006)IT Industry
 
Reuse Software Components (IMS 2006)
Reuse Software Components (IMS 2006)Reuse Software Components (IMS 2006)
Reuse Software Components (IMS 2006)IT Industry
 
GIS for Quetta (ICAST 2006)
GIS for Quetta (ICAST 2006)GIS for Quetta (ICAST 2006)
GIS for Quetta (ICAST 2006)IT Industry
 
Automated Java Code Generation (ICDIM 2006)
Automated Java Code Generation (ICDIM 2006)Automated Java Code Generation (ICDIM 2006)
Automated Java Code Generation (ICDIM 2006)IT Industry
 
Web Layout Generation (IC-SCCE 2006)
Web Layout Generation (IC-SCCE 2006)Web Layout Generation (IC-SCCE 2006)
Web Layout Generation (IC-SCCE 2006)IT Industry
 
PCA Clouds (ICET 2005)
PCA Clouds (ICET 2005)PCA Clouds (ICET 2005)
PCA Clouds (ICET 2005)IT Industry
 
Feature Based Image Classification by using Principal Component Analysis
Feature Based Image Classification by using Principal Component AnalysisFeature Based Image Classification by using Principal Component Analysis
Feature Based Image Classification by using Principal Component AnalysisIT Industry
 

Plus de IT Industry (16)

The News Today 24 (https://thenewstoday24.com/)
The News Today 24 (https://thenewstoday24.com/)The News Today 24 (https://thenewstoday24.com/)
The News Today 24 (https://thenewstoday24.com/)
 
NL Interface for Database - EJSR 20(4)
NL Interface for Database - EJSR 20(4)NL Interface for Database - EJSR 20(4)
NL Interface for Database - EJSR 20(4)
 
Virtual Telemedicine (IJITWE 5(1))
Virtual Telemedicine (IJITWE 5(1))Virtual Telemedicine (IJITWE 5(1))
Virtual Telemedicine (IJITWE 5(1))
 
NL to OCL Transformation (EDOC 2010)
NL to OCL Transformation (EDOC 2010)NL to OCL Transformation (EDOC 2010)
NL to OCL Transformation (EDOC 2010)
 
BPM & SOA for Small Business Enterprises (ICIME 2009)
BPM & SOA for Small Business Enterprises (ICIME 2009)BPM & SOA for Small Business Enterprises (ICIME 2009)
BPM & SOA for Small Business Enterprises (ICIME 2009)
 
Web Layout Mining - JECS 29(2)
Web Layout Mining - JECS 29(2)Web Layout Mining - JECS 29(2)
Web Layout Mining - JECS 29(2)
 
UCD Generator (ICIET 2007)
UCD Generator (ICIET 2007)UCD Generator (ICIET 2007)
UCD Generator (ICIET 2007)
 
Web User Forms (ICOMMS 2006)
Web User Forms (ICOMMS 2006)Web User Forms (ICOMMS 2006)
Web User Forms (ICOMMS 2006)
 
UML Generator (NCC18)
UML Generator (NCC18)UML Generator (NCC18)
UML Generator (NCC18)
 
Image Classification (icast 2006)
Image Classification  (icast 2006)Image Classification  (icast 2006)
Image Classification (icast 2006)
 
Reuse Software Components (IMS 2006)
Reuse Software Components (IMS 2006)Reuse Software Components (IMS 2006)
Reuse Software Components (IMS 2006)
 
GIS for Quetta (ICAST 2006)
GIS for Quetta (ICAST 2006)GIS for Quetta (ICAST 2006)
GIS for Quetta (ICAST 2006)
 
Automated Java Code Generation (ICDIM 2006)
Automated Java Code Generation (ICDIM 2006)Automated Java Code Generation (ICDIM 2006)
Automated Java Code Generation (ICDIM 2006)
 
Web Layout Generation (IC-SCCE 2006)
Web Layout Generation (IC-SCCE 2006)Web Layout Generation (IC-SCCE 2006)
Web Layout Generation (IC-SCCE 2006)
 
PCA Clouds (ICET 2005)
PCA Clouds (ICET 2005)PCA Clouds (ICET 2005)
PCA Clouds (ICET 2005)
 
Feature Based Image Classification by using Principal Component Analysis
Feature Based Image Classification by using Principal Component AnalysisFeature Based Image Classification by using Principal Component Analysis
Feature Based Image Classification by using Principal Component Analysis
 

Meaning Extraction - IJCTE 2(1)

  • 1. International Journal of Computer Theory and Engineering, Vol. 2, No. 1 February, 2010 1793-8201 Context Based Meaning Extraction by Means of Markov Logic Imran Sarwar Bajwa  models are knowledge-base model, stochastic logic, Abstract—Understanding meanings and semantics of a probabilistic relational model, and relational Markov model. speech or natural language is a complicated problem. This A probabilistic model can also be represented as a Markov problem becomes more vital and classy when meanings with network including Bayesian networks, decision trees, logistic respect to context, have to be extracted. This particular research area has been typically point of meditation for the last few regression, etc [4]. First order logic (FOL) has been used as a decades and many various techniques have been used to address tool of predicate logic. But first order logic deals only with this problem. An automated system is required that may be able linear data. Natural languages tend to be non-linear. The to analyze and understand a few paragraphs in English feature of non-linearity in natural languages cannot be language. In this research, Markov Logic has been incorporated knobbed by conventional first order logic. Markov Logics to analyze and understand the natural language script given by the user. The designed system formulates the standard speech (ML) is simple extension to first-order logic. In Markov Logic, language rules with certain weights. These meticulous weights each formula has an additional weight fixed with it [5], in for each rule ultimately support in deciding the particular variation of first order logic. In ML, a formula's associated meaning of a phrase and sentence. The designed system weight reflects the strength of a constraint. The higher weight provides an easy and consistent way to figure out speech of a formula represents the greater the difference in log language context and produce respective meanings of the text. probability and it also satisfies the formula. A first-order KB Index Terms—Text Processing, Markov Logic, is a set of formulas in first-order logic, constructed from Meaning Extraction, Language Engineering, Context predicates using logical connectives and quantifiers. Awareness. In this article, the section 2 presents review of related work. Section 3 presents the architecture of designed system. Section 4 describes the used methodology based on I. INTRODUCTION compositely using rule based approach for extracting the Grammar analysis and meanings extraction are primary steps required information from the given text and then employing involved in almost every natural language processing based Markov Logics for further text understanding. The results and systems. This scenario becomes more significant and critical analysis has been presented din the last section. when the meanings of a piece of text have to be extracted in a particular context. Context based meaning extraction is II. RELATED WORK important for many NLP (Natural Language Processing) based In this section a short overview over existing approaches for applications i.e. Automated generation of UML(Unified processing and computing the natural language text has been Modeling Language) diagrams, query processing, web mining, presented. web template designing, user interface designing, etc. Modern information systems tend to base on agile and aspect oriented A. General Text Processing mechanisms. The aspect oriented and component oriented The major research contributions in the area of software engineering are becoming quite popular in software computational linguistics have brought into being for last designing communities. These newly emerging methodologies many decades, Major contributors are Maron, M. E. and can be assisted by providing natural languages’ based used Kuhns, J. L (1960) [6], Noam Chomsky (1965) [5], Chow, C., interfaces. The research area of improved and effective & Liu, C (1968) [7]. They presented the methods of interfaces for modern software paradigms is one of the major information retrieval from natural languages. Other and modern research interests. NLP based context aware user contributions were analysis and understanding of the natural interfaces can be effective in this regard. This research languages, but still there was lot of effort required for better highlights the prospects of assimilation of Markov Logics in understanding and analysis. designing of various business and technical software. B. Meaning Extraction For analyzing natural language text, many statistical and non statistical models have been presented [3]. Some of the Natural language text can be helpful in multiple ways to make computer applications more convenient and easy to use. Imran Sarwar Bajwa is with Department of Computer Science and IT, The A CASE tool named REBUILDER UML [1] integrates a Islamia University of Bahawalpur, Pakistan, Ph: +92 62 925 5466, module for translation of natural language text into an UML imran.sarwar@iub.edu.pk class diagram. This module uses an approach based on 35
  • 2. International Journal of Computer Theory and Engineering, Vol. 2, No. 1 February, 2010 1793-8201 Case-Based Reasoning and Natural Language Processing. For presented in the Figure 1.0. First component provides support information extraction and processing at semantic level, Pedro for phonetics and phonology related issues, if required. This Domingos [11] also presented a technique Markov Logics. component further provides recognition of variations in words Markov logics are simple extension to FOL. There are many and this step is formally called morphology. Afterwards, applications of Markov logics; link prediction, collective syntax of the text is understood which requires the knowledge classification, entity resolution, social network analysis, etc. needed to order and group words together. Alchemy system [12] facilitates implement the inference and The second component provides support for semantical learning algorithms. Alchemy is based on a declarative analysis of POS (parts of speech) tagged text. At this level, the programming language that is similar to Prolog. Markov logic Markov logic has been used to understand meanings of the has ability to handle uncertainty and learn form the training sentences [12]. To have a more compound understanding of the data. We also have presented a rule based system [13] that is sentence, pragmatics are required, where the sentence is able to extract desired information from the natural language analyzed according to the context. Discourse analysis is the text. The system understands context and then extracts last part of NLP, where the semantic analysis of the linguistic respective information. This model is further enhanced in this structure is performed beyond the sentence level [13]. research to capture the information from NL text that is further used for semantic understating. The implementation details have been provided in section 4. C. Information Retrieval The second category of prior studies concentrates on contexts consisting of a single word only, typically modeling the combination of a predicate p and an argument a. Kintsch (2001) uses vector representations of p and a to identify the set of words that are similar to both p and a. In the nineties, major contributions were turned out by Krovetz, R., & Croft, W. B (1992) [10], Salton, G., & McGill, M (1995) [9], Losee, R. M (1998) [8], These authors worked for lexical ambiguity and information retrieval [8], probabilistic indexing [9], data bases handling [10] and so many other related areas. Conventional methods for natural language processing use rule based techniques besides Hidden Markov Method (HMM) Fig.1 Employing Markov Logic for Meaning Extraction [], Neural Networks (NN) [], probabilistic theory [] and statistical methods []. Agents are another way to address this The third and last component is the knowledge base of the problem [8]. Scientists are used to formerly employ rule-based/ system that consists of English word libraries; data type statistical algorithm with a text data base i.e. WordNet to taxonomies and parse trees. The designed system has ability to identify the data type of a text piece and then understand and understand English language contents after reading the text extract the desired information from the given piece of text. scenario provided by the user. This system is based on a Parts of speech tagging is a typical phase in this procedure in layered design and common layers are text input acquisition which all basic elements of the language grammar are layer, parts of speech tagging layer, Markov Logic based extracted as verbs, nouns, adjectives, etc. Detailed description Network layer, pattern analysis layer of speech language has been provided in experiments section. contents and finally meaning extraction layer. The complete design is shown in Figure 2.0. III. DESIGNED SYSTEM ARCHITECTURE A. Getting Text Input In this research paper, a newly designed Markov Logic based system has been presented that is able to read the Typically, the natural language text has no formal format English language text and extract its meanings after analyzing and accordance. To make the take more appropriate for and extracting related information. Various linguistic phases processing, input text is acquired. User can provide the input are common for processing natural language i.e. lexical and scenario in paragraphs of the text. Major issues in this phase semantic analysis, pragmatic analysis Text parsing in NLP can are to read the input text in the form characters and generate be a detailed or trivial process. Some applications e.g. text the words by concatenating them. Another major issue is to summarization and text generation needs detailed text remove the text noise and abnormalities from text. In general, processing. In detailed process every part of each sentence is this unit is the implementation of the lexical phase of text analyzed. Some applications just need trivial processing e.g. processing. text mining and web mining. In trivial processing of text, only B. POS Tagging of Text certain passages or phrases within a sentence are processed Part of speech tagging is the process of assigning a [11] . part-of-speech or other lexical speech marker to each word in a An architecture based on three component modules is corpus. Rule-based taggers are the earliest taggers and they 36
  • 3. International Journal of Computer Theory and Engineering, Vol. 2, No. 1 February, 2010 1793-8201 used hand-written disambiguation rules to assign a single uses MLN to identify the relations and conclude appropriate part-of-speech to each word [12]. This module has also used a meanings on the basis of the extractions. Distinct rule based algorithm to categorize tokens into various classes representations are ultimately created and assigned to various as verbs, nouns, pronouns, adjectives, prepositions, linguistic inputs. The process of verifying the meanings of a conjunctions, etc. sentence is performed by matching the input representations with the representation in the knowledge base [14]. C. Markov Logic based Network In Markov Logics [], every FOL formula has an associated IV. RESULTS AND DISCUSSION weight that represents the strength of the constraint. Markov To validate the presented model, four classes have defined: logic allows contradiction between various formulas and thee subject class, verb class, object class, adverb class. These contradictions are resolved by comparing the evidence weights classes are defined on the basis of classical division of a of multiple constraints. A Markov Logic Network [] (MLN) typical English sentence. can be viewed as a template for constructing a Markov network. Markov logic is used in this layer to define inference rules Student is reading a book in library. similar to the used in the Markov networks and Bayesian networks. These inference rules are used to find most probable Subject Part Verb Part Object Part Adverb Part meanings of the words given some evidence. MLN weights First of all it was observed that at which percentage the another ability that is learning. MLN weights can learn words are accurately classified. For this purpose, two types of through the maximizing of the relational database. data sets have been used; first data set of text for training of the designed system and other data set with text for testing. The rules of Markov Logic have been trained with the training data set. The weights of the rules were randomly selected and during training these weights were adjusted to get the optimal output. To test the accuracy of the classified text, the testing data set of three types, on the basis if difficulty level, are selected: plain, average, and compound. Plain data set is very simple without any phrasal and idiomatic constraints. Average data set is containing simple sentences but with conjunctions, interjections, etc. Compound data sets are reasonably complex than other two categories due to inclusion of phrases and even idioms. 10 sentences of each category were tested and following results were received. Simple Average Compound Average Subject 98% 96% 89% 94 % Verb 97% 94% 86% 92 % Object 95% 93% 82% 90 % Adverb 92% 89% 78% 86 % Fig.2 Markov Logic based Meaning Extraction TABLEI. STATISTICS SHOWING RESULTS D. Pattern Analysis We interpret these results as encouraging evidence for the In linguistic terms, verbs often specify actions, and noun usefulness of Markov logic for judging substitutability in phrases the objects those participate in a particular action [14]. context. Overall accuracy for a complete sentence becomes Each noun phrase specifies that how an object participates in 90.5%. Following graph shows the results. the action. This module typically provides support to identify distinct objects and their respective attributes. Nouns are symbolized as objects and their associated characteristics are termed as attributes. In pattern analysis phase, irrelative rules and patterns are also eliminated for efficient pattern discovery process [13]. E. Meaning Extraction Prepositions can help out in developing relations among objects identified in the previous module. This module finally 37
  • 4. International Journal of Computer Theory and Engineering, Vol. 2, No. 1 February, 2010 1793-8201 [7] Maron, M. E. & Kuhns, J. L. (1997) “On relevance, probabilistic indexing, 100% and information retrieval” Journal of the ACM, 1997, volume 7, pp: 216–244. 80% [8] Chomsky, N. (1965) “Aspects of the Theory of Syntax. MIT Press, Subject Cambridge, Mass, 1965. 60% [9] Chow, C., & Liu, C. (1968) “Approximating Discrete Probability Verb Distributions with Dependence Trees”. IEEE Transactions on 40% Object Information Theory, 1968, IT-14(3), pp. 462–467. Adverb [10] Losee, R. M. (1988) “Parameter estimation for probabilistic document 20% retrieval models”. Journal of the American Society for Information Science, 39(1), 1988, pp. 8–16. 0% [11] Salton, G., & McGill, M. (1995) “Introduction to Modern Information Retrieval” McGraw-Hill, New York., 1995 Simple Average Compound Average [12] Krovetz, R., & Croft, W. B. (1992) “Lexical ambiguity and information retrieval”, ACM Transactions on Information Systems, 1992, pp. 115–141 Fig.3 Graph showing results [13] Pustejovsky J, Castañ J., Zhang J, Kotecki M, Cochran B. (2002) o “Robust relational parsing over biomedical literature: Extracting inhibit relations”. In proc. of Pacific Symposium of Bio-computing, pp. 362-373 [14] Nerbonne, John (2003): "Natural language processing in V. CONCLUSION computer-assisted language learning". In: Mitkov, R. (ed.): The Oxford The designed system for speech language context Handbook of Computational Linguistics. Oxford, pp. 670-698. [15] D. McCarthy, R. Navigli. (2007), “SemEval-2007 Task 10: English understanding using a rule based algorithm is a robust Lexical Substitution Task” In Proceedings of SemEval, pp. 48–53. framework for inferring the appropriate meanings from a given [16] Voutilainen, A. (1995), “Constraint Grammar: A language Independent text. The accomplished research is related to the system for parsing Unrestricted Text”, Morphological disambiguation, pp. 165-284 understanding of the human languages. Human being needs [17] Jurafsky, Daniel, and James H. Martin. (2000). “Speech and Language specific linguistic knowledge generating and understanding Processing: An Introduction to Natural Language Processing, Speech Recognition, and Computational Linguistics”. Prentice-Hall, First speech language contents. It is difficult for computers to Edition. perform this task. The speech language context understanding [18] WangBin, LiuZhijing, (2003), “Web Mining Research”, Proceedings of using a rule based framework has ability to read user provided Fifth International Conference on Computational Intelligence and Multimedia Applications, 2003. CHINA. pp. 84 - 89 text, extract related information and ultimately give meanings to the extracted contents. The designed system very effective and have high accuracy up to 90 %. The designed system depicts the meanings of the given sentence or paragraph efficiently. An elegant graphical user interface has also been provided to the user for entering the Input scenario in a proper way and generating speech language contents. ACKNOWLEDGMENT This research is part of the research work funded by HEC, Pakistan and was conducted in the department of Computer Science & IT, The Islamia University of Bahawalpur, Pakistan. I want to thank Prof. Dr. Irfan Hyder (PAF-KIET) and Prof. Dr. Abass Choudhary (CEME-NUST) for their support and guidance during this research. REFERENCES [1] Katrin Erk, Sebastian Pad’o (2008) “A Structured Vector Space Model for Word Meaning in Context”, In Proceedings of EMNLP (2008) [2] Imran Sarwar Bajwa, M. Abbas Choudhary, (2006) “A Rule Based System for Speech Language Context Understanding” Journal of Donghua University, (English Edition) 23 (6), pp. 39-42. [3] J. Henderson, (2003) “Neural network probability estimation for broad coverage parsing,” in Proc. of 10th conference of the European Chapter of the Association for Computational Linguistics (EACL 2003), pp. 131–138. [4] S. Kok and P. Domingos, “Learning the structure of Markov Logic Networks” In Proc. ICML-05, Bonn, Germany, 2005. ACM Press, pp. 441–448. [5] S. Kok, P. Singla, M. Richardson, and P. Domingos “The Alchemy system for statistical relational AI”, Technical report, Department of Computer Science and Engineering, WA, 2005. http://www.-cs.washington.edu/ai/alchemy . [6] Pedro Domingos, Stanley Kok, Hoifung Poon, Matthew Richardson, Parag Singla, “Unifying Logical and Statistical AI”, Proceedings of the Twenty-First National Conference on Artificial Intelligence (2006), pp. 2-7 38