The Abstract that Machine Learns to do
Languages Translation
Sai Htaung Kham
2019/02/15
@Tech Meetup For MM Engineers in JP #2
What is Language Translation?
- I decided to visit Myanmar so that I could meet and understand the locals in
person.
- Although there are many technologies out there, I do believe that the AI will be
one of the future technologies.
- I decided to visit Myanmar so that I could meet and understand the locals in
person.
- ဒေသခံဒ ွေန ဲ့ပုဂ္ဂိုလ်ဒရေးအရ ဒ ွေွေ့ဆံုနဂုင်ဒအောင် မြန်ြောမပည်ကဂု သွေောေးဖဂု ဲ့ကျွန်ဒ ော် ဆံုေးမဖ ်
လဂုက် ယ်။
- Although there are many technologies out there, I do believe that the AI will be
one of the future technologies.
- အေီြော နည်ေးပညောဒ ွေအြ ောေးကကီေး ရဂဒပြယဲ့် AI ဟော အနောဂ ် နည်ေးပညော စ်ခု
မဖစ်လဂြဲ့်ြယ်လဂု ဲ့ကျွန်ဒ ော် ထင် ယ်။
Translated by Google Translate
What are the building blocks of Language Translation
in Computer Science?
NLP (Natural Language Processing)
What is NLP?
Natural Language Processing is a range of technologies that enable machines to
understand, analyze and respond appropriately to natural language.
By Cedric Wagrez (Gengo)
This allows interacting with computers with natural language instead of
computer code.
Common Tasks in NLP
- Named Entity Recognition
- Sentiment Analysis
- Information Extraction
- Text Generation
- Machine Translation and so on.
Named Entity Recognition
Sentiment Analysis
A bit about Machine Translation
- Rule-based machine translation.
- Example-based machine translation.
- Statistical machine translation.
- Neural machine translation.
Example of Rule-based Machine Translation
Input: I am a student .
Intermediate layer (direct translation): ကျွန်ုပ် မဖစ်သည် စ်ဒယောက် ဒက ောင်ေးသောေး ။
Output (filtered or rephrased with defined rules): ကျွန်ုပ် ဒက ောင်ေးသောေး စ်ဒယောက် မဖစ်သည် ။
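The two layers on this slide can be sketched roughly as follows. This is a minimal illustration, not a real translation system: the lexicon, the romanized target words, the coarse tags, and the two reordering rules are all assumptions made up for this one sentence.

```python
# Hypothetical word-for-word lexicon: English word -> (target word, coarse tag).
# The target words are romanized placeholders, not real dictionary data.
LEXICON = {
    "i": ("kyunoke", "PRON"),
    "am": ("phyit-thi", "VERB"),
    "a": ("ta-yauk", "DET"),
    "student": ("kyaung-thar", "NOUN"),
}


def direct_translate(sentence):
    """Intermediate layer: translate word by word, keeping English order."""
    tokens = sentence.lower().rstrip(" .").split()
    return [LEXICON[t] for t in tokens]


def apply_rules(tagged):
    """Output layer: reorder the words with two hand-written rules."""
    out = list(tagged)
    # Rule 1: a determiner/classifier follows its noun (DET NOUN -> NOUN DET).
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "DET" and out[i + 1][1] == "NOUN":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2
        else:
            i += 1
    # Rule 2: verbs move to the end of the sentence (subject-object-verb order).
    non_verbs = [w for w in out if w[1] != "VERB"]
    verbs = [w for w in out if w[1] == "VERB"]
    return non_verbs + verbs


intermediate = direct_translate("I am a student .")
output = apply_rules(intermediate)
print(" ".join(word for word, _ in intermediate))
print(" ".join(word for word, _ in output))
```

The second printed line follows the same word order as the slide's output row: pronoun, noun, classifier, verb.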
Example-based Machine Translation
English: How much is that red umbrella?
Japanese: Ano akai kasa wa ikura desu ka.
English: How much is that small camera?
Japanese: Ano chiisai kamera wa ikura desu ka.
https://en.wikipedia.org/wiki/Example-based_machine_translation
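One way to read the two example pairs: they share a sentence frame, and only the adjective and noun differ. A rough sketch of reusing that frame for a new sentence is below. The template regex and the fragment table are assumptions derived by hand from the two pairs above. A real example-based system would learn them from a large bilingual corpus.

```python
import re

# Sentence frame shared by both example pairs (hand-derived assumption).
TEMPLATE_EN = re.compile(r"How much is that (\w+) (\w+)\?")

# Sub-sentential fragments read off the two examples by comparing them.
FRAGMENTS = {
    "red": "akai",
    "umbrella": "kasa",
    "small": "chiisai",
    "camera": "kamera",
}


def translate_by_example(sentence):
    """Translate by analogy with the stored examples, or return None."""
    match = TEMPLATE_EN.fullmatch(sentence)
    if match is None:
        return None  # no stored example matches this sentence shape
    adjective, noun = match.groups()
    if adjective not in FRAGMENTS or noun not in FRAGMENTS:
        return None  # a fragment was never seen in the examples
    return f"Ano {FRAGMENTS[adjective]} {FRAGMENTS[noun]} wa ikura desu ka."


# A sentence that is in neither example, built from seen fragments.
print(translate_by_example("How much is that red camera?"))
```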
Statistical Machine Translation - 1
http://people.csail.mit.edu/koehn/publications/tutorial2003.pdf
Statistical Machine Translation - 2
http://people.csail.mit.edu/koehn/publications/tutorial2003.pdf
Statistical Machine Translation - 3
https://thefrenchturtle.wordpress.com/2013/01/16/the-effectiveness-of-online-translators/
Statistical Machine Translation - 4
Neural Machine Translation
- Machine Learning, Deep Learning Based Approach.
- End to End Translation.
Some old but gold techniques
- 2014 - Sequence to Sequence
- 2014, 2015 - Attention Mechanism
- 2017 June - Transformer Architecture - Google AI
- 2018 May - Universal Language Model Fine Tuning - fast.ai
- 2018 November - BERT - Google AI
- 2019 February - GPT-2 - OpenAI
- 2019 July - ERNIE - Baidu Research
- 2019 July - RoBERTa - Facebook Research
- 2019 August - Megatron-LM - NVIDIA
- 2020 February - Turing Natural Language Generation (T-NLG) - Microsoft
State of the art Models with parameter counts
https://www.microsoft.com/en-us/research/blog/turing-nlg-a-17-billion-parameter-language-model-by-microsoft/?OCID=msr_blog_turing_hero
Part of Speech (POS) Tagging
ြဒကွေေး ခရဂုင် သည် အဒနောက် ဘက် ဧရောဝ ီ မြစ်ကြ်ေး စ်ဒလ ောက် ွေင် နံုေး င် ဒမြနုရပ်ကွေက် မဖစ် သည် ။
Tags, in token order: Noun, Noun, Post-positional Marker, Noun, Noun, Noun, Noun, Noun, Post-positional Marker, Noun, Noun, Verb, Post-positional Marker, Punctuation
POS Tagging - 1
'(?:(?<!္)([က-ဪဿ၊-၏]|[၀-၉]+|[^က-၏]+)(?![္္ ]?[္္်္ ဲ့]))'
https://github.com/ye-kyaw-thu/sylbreak
https://github.com/swanhtet1992/ReSegment
- ဝဏ္ဏ (Syllable) Splitting
၂၀၁၈ခုနစ်အောရအောေးကစောေးပပဂိုင်ပွေ ွေင် အောေးကစောေးနည်ေးအဒရအ ွေက် ဂုေးမြငဲ့်လောခဲ့
'၂၀၁၈', 'ခု', 'နစ်', 'အော', 'ရ', 'အောေး', 'က', 'စောေး', 'ပပဂိုင်', 'ပွေ', ' ွေင်', ' ', 'အောေး', 'က', 'စောေး', 'နည်ေး', 'အ', 'ဒရ', 'အ', ' ွေက်', ' ', ' ဂုေး', 'မြငဲ့်', 'လော', 'ခဲ့'
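A simplified sketch of regex-based syllable splitting, in the spirit of the tools linked above. It is not the pattern used by either repository: it only handles the basic consonant range U+1000 to U+1021, and it ignores digits, independent vowels, punctuation, and other special cases. The `|` separator is an arbitrary choice and assumes the input text does not contain that character.

```python
import re

# Simplified break rule: start a new syllable before a consonant
# (U+1000-U+1021) unless it is stacked below a virama (U+1039), or is itself
# closed by asat (U+103A) or followed by a virama.
BREAK = re.compile("(?<!\u1039)([\u1000-\u1021])(?![\u103A\u1039])")


def split_syllables(text):
    """Insert a separator before each syllable-initial consonant, then split."""
    marked = BREAK.sub(lambda m: "|" + m.group(1), text)
    return [syllable for syllable in marked.split("|") if syllable]


# "Myanmar" written in Burmese script, spelled with code points so the
# example does not depend on font or extraction quality.
word = "\u1019\u103C\u1014\u103A\u1019\u102C"
print(split_syllables(word))
```

The word splits into two syllables because the third consonant is closed by asat and therefore stays attached to the first syllable.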
POS Tagging - 2
https://github.com/ye-kyaw-thu/sylbreak
https://github.com/swanhtet1992/ReSegment
- Training the Model
Model
ြ ဒကွေေး
Noun Noun
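The slide does not say which model is trained at this step. As a stand-in, the sketch below uses a most-frequent-tag baseline, which is the simplest tagger that can be "trained" on tagged tokens. It is not the neural model shown on the later slides. The romanized tokens and the tiny training set are hypothetical placeholders.

```python
from collections import Counter, defaultdict


def train(tagged_sentences):
    """Count tags per token and keep the most frequent tag for each token."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for token, tag in sentence:
            counts[token][tag] += 1
    return {token: c.most_common(1)[0][0] for token, c in counts.items()}


def tag(model, tokens, default="Noun"):
    """Tag each token with its learned tag, or a default for unseen tokens."""
    return [(token, model.get(token, default)) for token in tokens]


# Hypothetical training data: romanized placeholder tokens with the tag
# names used on the slides.
training_data = [
    [("ma", "Noun"), ("kway", "Noun"), ("thi", "Post-positional Marker")],
    [("phyit", "Verb"), ("thi", "Post-positional Marker")],
]

model = train(training_data)
print(tag(model, ["ma", "kway", "phyit", "thi"]))
```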
POS Tagging - 3
https://arxiv.org/pdf/1409.0473.pdf
Bahdanau et al., 2015
POS Tagging - 4
https://arxiv.org/pdf/1409.0473.pdf
20% 30% 10% 40%
This is an apple
ေါကပန်ေးသီေး
POS Tagging - 5
https://arxiv.org/pdf/1409.0473.pdf
I decided to visit Myanmar so that I could meet and understand the
locals in person.
ဒေသခံဒ ွေန ဲ့ပုဂ္ဂိုလ်ဒရေးအရ ဒ ွေွေ့ဆံုနဂုင်ဒအောင် မြန်ြောမပည် ကဂု သွေောေးဖဂု ဲ့
ကျွန်ဒ ော် ဆံုေးမဖ ် လဂုက် ယ်။
POS Tagging - 6
https://arxiv.org/pdf/1409.0473.pdf
Bahdanau et al., 2015
This is an apple
ေါကပန်ေးသီေး
20% 30% 10% 40%
Encoder
Decoder
Attention Layer
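The percentages on this slide can be read as attention weights over the four source words. A minimal sketch of how such weights combine encoder states into one context vector for the decoder is below. The alignment scores and the 2-dimensional encoder states are made-up numbers. The scores are chosen only so that the softmax reproduces the slide's 20% / 30% / 10% / 40%.

```python
import math


def softmax(scores):
    """Turn raw alignment scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(weights, encoder_states):
    """Weighted sum of the encoder states, one weight per source word."""
    dim = len(encoder_states[0])
    return [
        sum(w * state[d] for w, state in zip(weights, encoder_states))
        for d in range(dim)
    ]


source_words = ["This", "is", "an", "apple"]

# Hypothetical alignment scores, picked to give the weights on the slide.
scores = [math.log(p) for p in (0.2, 0.3, 0.1, 0.4)]

# Hypothetical 2-dimensional encoder states, one per source word.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]

weights = softmax(scores)
context = context_vector(weights, encoder_states)

for word, weight in zip(source_words, weights):
    print(f"{word}: {weight:.0%}")
print(context)
```

The decoder would use this context vector, together with its own state, to produce the next output word.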
POS Tagging - 7
https://arxiv.org/pdf/1409.0473.pdf
Bahdanau et al., 2015
20% 30% 10% 40%
Encoder
Decoder
Attention Layer
နံုေး င် ဒမြနုရပ်ကွေက် မဖစ် သည်
Tags, in token order: Noun, Noun, Verb, Post-positional Marker
POS Tagging - 8
POS Tagging - 9