1. NATURAL LANGUAGE PROCESSING (NLP): TAMIL-HINDI CONVERSION. Authors: R. MUTHU KUMARAN (II-CSE), R. PANNEER SELVAM (II-ECE), ARASU ENGINEERING COLLEGE
2. INTRODUCTION Nowadays, most information is available electronically. Indeed, there has been an explosion of text and multimedia content on the World Wide Web. For many people, a large and growing fraction of work and leisure time is spent navigating and accessing this universe of information. The presence of so much text in electronic form is a huge challenge to NLP. The Universal Networking Language (UNL) is an electronic language in the form of a semantic network that acts as an intermediate representation to express and exchange every kind of information. The UNL represents information, i.e. meaning, sentence by sentence. Sentence information is represented as a hyper-graph having Universal Words (UWs) as nodes and relations as arcs.
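The hyper-graph description above can be made concrete with a minimal sketch. It assumes a sentence is stored as a set of UW nodes plus a list of labeled arcs. The class names `Relation` and `UNLSentence` are hypothetical, the UWs are shown as bare headwords for brevity, and `agt` (agent) and `obj` (object) are used as example relation labels. It does not model nested hyper-nodes, only the node-and-arc core.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    """One labeled arc between two Universal Words (hypothetical type)."""
    label: str   # relation label, e.g. "agt" or "obj"
    source: str  # UW the arc starts from
    target: str  # UW the arc points to


@dataclass
class UNLSentence:
    """One sentence: UWs as nodes, relations as arcs (hypothetical type)."""
    nodes: set = field(default_factory=set)
    arcs: list = field(default_factory=list)

    def relate(self, label, source, target):
        # Adding an arc also registers both endpoints as nodes.
        self.nodes.update((source, target))
        self.arcs.append(Relation(label, source, target))

    def targets(self, source, label):
        # All UWs reachable from `source` through an arc with this label.
        return [a.target for a in self.arcs
                if a.source == source and a.label == label]


# Illustrative sentence: "John eats an apple."
sentence = UNLSentence()
sentence.relate("agt", "eat", "John")
sentence.relate("obj", "eat", "apple")
```

Because the same graph is the input to every deconverter, a Tamil sentence enconverted into this form could in principle be deconverted into Hindi without a Tamil-specific rule on the Hindi side.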
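4. [Diagram: UNL at the center, linked to TAMIL, HINDI, FRENCH, and RUSSIAN. ENCONVERSION converts a natural language into UNL; DECONVERSION converts UNL into a natural language.]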
7. ASPECTS MODEL (STANDARD THEORY) In Aspects of the Theory of Syntax, nouns are chosen on the basis of context-free rules; verbs are then chosen on the basis of context-sensitive rules, which are stated in terms of lexical features. Since nouns are the first words to be chosen, they are identified by lexical features only. Verbs and adjectives require additional features to indicate the environments in which they can appear. In the Aspects model, grammar was organized into three major components.
8. EXTENDED STANDARD THEORY Ray Jackendoff offered a substantial criticism of the Standard Theory and showed that surface structure plays a much more important role in semantic interpretation than deep structure. Here, a partial representation of meaning is determined by grammatical structure. The derivation of logical form proceeds step by step, determined by a derivational process analogous to those of syntax and phonology.
10. TAMIL-HINDI SYSTEM Tamil-Hindi was chosen for MAT because both are free word-order languages, unlike English, which is a positional language. Ultimately, our aim is to build a Human-Aided Machine Translation system for Hindi-Tamil. An MT system basically has three major components. [Diagram, "Tamil to Hindi Translation", with the labels: Tamil Word, MA, Mapping Unit, Generator, Hindi Word]
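A minimal word-level sketch of the three components, assuming that "MA" stands for a morphological analyzer and that the stages run in the order analyzer, mapping unit, generator. Everything inside it is an illustrative assumption and not the system described in these slides: the two-entry romanized lexicon, the single plural rule on each side, and all function names are hypothetical.

```python
# Toy romanized root lexicon (assumption): Tamil root -> Hindi root.
BILINGUAL_ROOTS = {
    "paiyan": "ladka",  # boy
    "naay": "kutta",    # dog
}

TAMIL_PLURAL_SUFFIX = "kaL"  # the only suffix this toy analyzer knows


def analyze(tamil_word):
    """MA stage: split a Tamil word into (root, features)."""
    n = len(TAMIL_PLURAL_SUFFIX)
    if tamil_word.endswith(TAMIL_PLURAL_SUFFIX) and len(tamil_word) > n:
        return tamil_word[:-n], {"number": "plural"}
    return tamil_word, {"number": "singular"}


def map_root(tamil_root):
    """Mapping unit: look the root up in the bilingual lexicon."""
    return BILINGUAL_ROOTS.get(tamil_root)


def generate(hindi_root, features):
    """Generator stage: inflect the Hindi root from the features.

    Toy rule (assumption): nouns ending in -a form the plural in -e.
    """
    if features["number"] == "plural" and hindi_root.endswith("a"):
        return hindi_root[:-1] + "e"
    return hindi_root


def translate_word(tamil_word):
    """Run the three stages; None means 'leave it to the human editor'."""
    root, features = analyze(tamil_word)
    hindi_root = map_root(root)
    if hindi_root is None:
        return None
    return generate(hindi_root, features)
```

Returning `None` for unknown roots reflects the human-aided design goal: the system defers to a person instead of guessing.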
19. CONCLUSION In this paper, the development of a Tamil-Hindi translation system is described. In Tamil, most of the information needed to generate a sentence from the UNL structure is handled at the morphological and syntactic levels. This modest system could potentially alleviate some of the most pressing issues in NLP. The applications of NLP are as vast as an ocean, and we have seen only a small drop of it. In the future, NLP will help us communicate comfortably with computers.
28. Component: NP Chunker. Performance of these techniques in other languages: FnTBL 98%. Tamil, present performance: 94+%; 1st year: 96%; 2nd year: 98-99%. Language pair: Tamil-Hindi. Domain for which the performance will be optimized: Crime/Tourism. Other evaluation metrics in addition to the domain: Precision and Recall.
30. Component: Word Generator and Local Language Splitter for Target Language. Present performance: 50%; 1st year: 90%; 2nd year: 95% and above. Language pair: Tamil-Hindi. Other evaluation metrics in addition to the domain: Precision, Recall and F-measure.
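For reference, the three metrics named above can be computed as in the following sketch. It assumes the component's output and the gold standard are both available as collections of comparable items (for example generated word forms or chunk spans), and it uses the balanced F-measure (F1). The function name and the sample data are hypothetical.

```python
def precision_recall_f(predicted, gold):
    """Return (precision, recall, F1) for predicted items against gold items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: 4 items produced, 3 expected, 2 in common.
p, r, f = precision_recall_f({"a", "b", "c", "d"}, {"a", "b", "e"})
```

Here precision is 2/4, recall is 2/3, and F1 is their harmonic mean, 4/7.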
31. Lexical resource: Hindi-Tamil Bilingual Dictionary. Final size of the lexical resource: 30,000 root words. Average size of such a resource in other languages: 20,000 root words. 1st year: 15,000 root words; 2nd year: 15,000 root words. Language pair: Tamil-Hindi.