1. Automatic input editing
2. Automatic segmentation
3. Syntactical analysis
4. Transformation with output editing
   Characteristics of Japanese
    › No spaces between words
    › Written in kana and kanji
   Thus, the input requires
    › Automatic cutting into components
   However, to keep the dictionary from growing too large
    › Regulations can be set on the input:
       Kana texts, in which no kanji are used
       Kana-kanji texts, in which kanji are used wherever
        possible according to the official directives on the
        use of kana and kanji
    › This is “pre-editing”
   Each kana will be Romanized
    › To preserve
       a one-to-one correspondence between kana and
        their corresponding Roman letters
    › Text is better analyzed in Roman letters than in kana
       Fewer varieties of suffixes
       Fewer rules of permissible combinations with
        canonical stems
       Fewer possibilities of homographic verbal stems
   Each kanji will be replaced with an irreducible unit
    token
    › No kanji will contain more than one
      “morpheme”
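
The pre-editing step above can be sketched as follows. This is a minimal illustration, not the original system: the kana table is a small hypothetical excerpt, and the kanji token `<K:write>` is an invented placeholder for "one irreducible unit per kanji".

```python
# Hypothetical, partial kana table; a real system would cover the full syllabary.
KANA_TO_ROMAN = {
    "か": "ka", "き": "ki", "く": "ku",
    "な": "na", "い": "i", "ま": "ma", "す": "su",
}

# Hypothetical kanji tokens: each kanji maps to exactly one irreducible unit.
KANJI_TO_TOKEN = {"書": "<K:write>"}


def pre_edit(text):
    """Romanize each kana and replace each kanji with its unit token."""
    out = []
    for ch in text:
        if ch in KANA_TO_ROMAN:
            out.append(KANA_TO_ROMAN[ch])
        elif ch in KANJI_TO_TOKEN:
            out.append(KANJI_TO_TOKEN[ch])
        else:
            raise KeyError(f"no table entry for {ch!r}")
    return out


# Three forms of the same verb ("write").
for form in ["書く", "書かない", "書きます"]:
    print(pre_edit(form))
```

One way to read the "fewer varieties of suffixes" point: in kana the three forms continue with く, か and き, while in Roman letters they all continue with the consonant k, so the endings -u, -anai and -imasu can be listed once for consonant stems.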
   Segmentation of a continuous run of
    tokens
    › Based on the following expectations:
       Auxiliary items will be shorter in length and
        fewer in number
       No problem will be caused by:
         assuming every “phrase” in a sentence begins with a
          dictionary item
         including “prefixes” in the category of dictionary items
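
A minimal sketch of this kind of segmentation, assuming a greedy longest-match strategy (the slide does not say which matching strategy is used). The two word lists and the function names are hypothetical. Each phrase is taken to be one dictionary item followed by zero or more auxiliary items.

```python
DICTIONARY = {"neko", "sakana", "tabe"}  # hypothetical dictionary items
AUXILIARIES = {"ga", "o", "ru"}          # hypothetical auxiliary items


def longest_match(text, start, entries):
    """Return the longest entry found in text at position start, or None."""
    best = None
    for entry in entries:
        if text.startswith(entry, start) and (best is None or len(entry) > len(best)):
            best = entry
    return best


def segment(text):
    """Split text into phrases: one dictionary item plus trailing auxiliaries."""
    phrases, pos = [], 0
    while pos < len(text):
        # Every phrase is assumed to begin with a dictionary item.
        head = longest_match(text, pos, DICTIONARY)
        if head is None:
            raise ValueError(f"no dictionary item at position {pos}")
        phrase, pos = [head], pos + len(head)
        # Attach auxiliary items until none matches.
        while (aux := longest_match(text, pos, AUXILIARIES)) is not None:
            phrase.append(aux)
            pos += len(aux)
        phrases.append(phrase)
    return phrases


print(segment("nekogasakanaotaberu"))
```

The sketch does no backtracking, so an ambiguous run would need a more careful search than shown here.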
   Predictive analysis:
    › Originally proposed by Rhodes
   A peculiarity of Japanese:
    › It is more convenient to start from the end of the sentence:
       Words that can occupy the final position in a sentence are
        limited
       Particles which show case, prepositional or
        conjunctional relationships always follow the words,
        phrases or clauses to which they are attached
       Attributive words, phrases and clauses always
        stand before the substantives which they modify
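
A toy sketch of why a right-to-left scan is convenient. It illustrates only the particle property listed above, not the full predictive analysis. The particle list, the romanized sample and the function name are assumptions made for this illustration.

```python
PARTICLES = {"ga", "o", "wa"}  # hypothetical, romanized particles


def scan_from_end(words):
    """Scan right to left; return {word index: index of the particle that predicted it}."""
    links = {}
    pending = None  # index of a particle still waiting for its host word
    for i in range(len(words) - 1, -1, -1):
        if pending is not None:
            # The word immediately to the left fulfils the particle's prediction.
            links[i] = pending
            pending = None
        if words[i] in PARTICLES:
            # A particle always follows the item it is attached to,
            # so on reaching it we can predict that item to its left.
            pending = i
    return links


words = ["nezumi", "ga", "neko", "o", "korosita"]
print(scan_from_end(words))
```

Because the particle is met before its host word in this direction, the prediction is already in place when the host word is reached.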
   Each word in a sentence will be assigned
    › An essence which has been fulfilled by it
    › A linkage number which shows by which word it
      has been predicted
    › A group number which shows to which clause in
      the sentence it belongs
   Another peculiarity of Japanese:
    › The subject of a sentence is very often omitted
   Hence, in this analysis:
    › Prediction of the subject marker and the relative
      subject marker is essential
   Example: ネズミがネコを殺した話は私を驚かせた。
    › "The story that the mouse killed the cat surprised me."
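
The three attributes assigned to each word can be pictured as a small record. The annotation of the example sentence below is an illustrative guess, not the analysis from the original paper: the role labels are borrowed from the transformation stage, the linkage numbers assume each word is predicted by a word to its right, and `AnalyzedWord` is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class AnalyzedWord:
    surface: str
    essence: str  # the essence (role) fulfilled by the word
    linkage: int  # 1-based position of the predicting word; 0 = end of sentence
    group: int    # clause the word belongs to


# Hypothetical annotation: group 1 = main clause, group 2 = relative clause.
SENTENCE = [
    AnalyzedWord("ネズミ", "relative subject master", 2, 2),
    AnalyzedWord("が", "relative subject marker", 5, 2),
    AnalyzedWord("ネコ", "object master", 4, 2),
    AnalyzedWord("を", "object marker", 5, 2),
    AnalyzedWord("殺した", "relative predicate head", 6, 2),
    AnalyzedWord("話", "subject master", 7, 1),
    AnalyzedWord("は", "subject marker", 10, 1),
    AnalyzedWord("私", "object master", 9, 1),
    AnalyzedWord("を", "object marker", 10, 1),
    AnalyzedWord("驚かせた", "predicate head", 0, 1),
]


def clause(words, group):
    """Collect the surface forms that share a group number."""
    return [w.surface for w in words if w.group == group]


print(clause(SENTENCE, 2))  # relative clause
print(clause(SENTENCE, 1))  # main clause
```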
   This stage deals with the synthesis of the target language (TL)
   Brief explanation:
    › Words with the same group number are gathered
    › Transformation of word order is performed
   Concretely:
    › Subject marker, object marker & relative subject
      marker are omitted
    › Subject master or relative subject master comes
      first within each group
    › followed by predicate head or relative
      predicate head
    › and then by object master
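
The reordering rules above can be sketched as follows. This is a minimal sketch under assumptions: the (group, role, gloss) annotation of the example sentence is illustrative, the English glosses stand in for dictionary lookup, and joining the groups into one sentence is left out because the slide does not describe it.

```python
# Markers are dropped; the remaining roles are ranked in target order.
MARKERS = {"subject marker", "object marker", "relative subject marker"}
ORDER = {
    "subject master": 0, "relative subject master": 0,
    "predicate head": 1, "relative predicate head": 1,
    "object master": 2,
}

# Hypothetical analysis output: (group number, role, English gloss).
WORDS = [
    (2, "relative subject master", "mouse"),
    (2, "relative subject marker", "ga"),
    (2, "object master", "cat"),
    (2, "object marker", "o"),
    (2, "relative predicate head", "killed"),
    (1, "subject master", "story"),
    (1, "subject marker", "wa"),
    (1, "object master", "me"),
    (1, "object marker", "o"),
    (1, "predicate head", "surprised"),
]


def transform(words):
    """Gather words by group, omit markers, and reorder each group."""
    groups = {}
    for group, role, gloss in words:
        if role in MARKERS:
            continue
        groups.setdefault(group, []).append((ORDER[role], gloss))
    return {
        group: [gloss for _, gloss in sorted(items, key=lambda item: item[0])]
        for group, items in sorted(groups.items())
    }


print(transform(WORDS))
```

Each group comes out in subject, predicate, object order, which is the word order the slide describes for the target language.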
   Readings in Machine Translation
    › Edited by Sergei Nirenburg, Harold Somers,
      and Yorick Wilks
    › The MIT Press

Approach to Japanese-English automatic translation by Susumu Kuno
