TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE


Moses: The Trusted Translations
Experience
15:30-15:45
Monday 4 June


Gustavo Lucardi
Trusted Translations
Moses:
   The Trusted Translations Experience



TAUS Open Source Machine Translation
Pre Conference Workshops
Localization World Paris
June 4, 2012

Gustavo Lucardi
COO Trusted Translations, Inc.
@glucardi
Moses: The Trusted Translations Experience



  Moses from the Point of View of an LSP
   • What Exactly are We Doing?
   • Unexpected Experiences and Issues
   • Evaluating Complementary Tools
   • Our Next Steps with Moses
   • Our Conclusions about Moses up to Now
   • Looking Back, What Would We Do Differently?
   • What Other LSPs Could Learn from Our Experience

                             TAUS Open Source Machine Translation
                                  Pre Conference Workshops
                                 Localization World Paris 2012



  What exactly are we doing?
   • The real answer is LEARNING MT ASAP
   • If we are asked to choose, we prefer Open Source
   • We built two Moses engines:
      ▫ Legal English to Spanish (many clients) and Technical English to Spanish (1 client)

      ▫ We are re-training those engines with our post-editors' feedback

      ▫ We are also re-building those engines with different Moses customizations

   • Evaluating Complementary Tools
      ▫ Like SymEval among others

   • Testing other MT solutions
      ▫ Pangea MT (Pangeanic), DoMY (Precision Translation Tools), Smart MATE (Applied
        Languages), Language Studio (Asia Online), Tauyou, LetsMT, and others




  Unexpected Experiences and Issues
   • The lack of information and documented procedures on how post-editing
     should be done (there is a very helpful TAUS paper on the subject, and
     we also attended the TAUS Pre Conference session this morning)
   • The difficulty of measuring human post-editing quality and
     productivity (SymEval was helpful for that)
   • The difficulty of measuring the quality of an engine with BLEU (for
     example, an engine with a BLEU of 70 can produce worse output than
     engines with a BLEU of 40)
   • The difficulty of aligning existing parallel data without human
     intervention
   • Translators' fear of being displaced by MT
   • The focus of MT is improving quality and time, not costs
   • MT + PE is not enough: we are doing MT + PE + E and/or P
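
As a rough illustration of why BLEU scores from different engines are hard to compare, the sketch below computes a simplified sentence-level BLEU (uniform 1- to 4-gram weights, brevity penalty, no smoothing) in plain Python. The function names and the toy sentences are assumptions made for illustration, and this is not necessarily the scoring script used for the engines above. The point is only that the same output can score very differently depending on the reference it is compared with.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(hypothesis, reference, max_n=4):
    """Simplified sentence-level BLEU on a 0-100 scale (no smoothing)."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        total = sum(hyp_counts.values())
        # Clipped matches: an n-gram counts at most as often as it occurs in the reference.
        matched = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
        if total == 0 or matched == 0:
            return 0.0
        log_precisions.append(math.log(matched / total))
    # The brevity penalty lowers the score of hypotheses shorter than the reference.
    penalty = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return 100.0 * penalty * math.exp(sum(log_precisions) / max_n)


# Toy data (hypothetical): one MT output scored against two acceptable references.
mt_output = "the contract shall be governed by the laws of Spain"
reference_a = "the contract shall be governed by the laws of Spain"
reference_b = "this agreement is subject to Spanish law"

score_a = bleu(mt_output, reference_a)
score_b = bleu(mt_output, reference_b)
```

Here `score_a` and `score_b` sit at opposite ends of the scale although the MT output is the same and both references are plausible translations. A BLEU figure therefore says as much about the test set and references as about the engine.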




  Evaluating Complementary Tools
   • We want to mention SymEval. It has 3 components: Eval, Extract and Diff.
   • Eval takes TMX files and produces an XML file highlighting the
     differences between the two versions.
   • The Project Score can be read as the percentage of the machine translation
     that was used (two identical files would have a score of 100).
   • We noticed that it can help judge either the quality of the machine translation
     or the quality of the post-editing. In other words:
      ▫   If you have a good engine, you can use Eval to test your post-editor

      ▫   If you have a good post-editor, you can use Eval to test your engines

   • SymEval is not very user-friendly, and its support documentation is
     thin. For example, the help page says that the Eval tool only
     supports XLIFF files, but it actually supports TMX as well.
   • We found a more user-friendly tool in a Memory Server that we were testing, but
     we still need to make the Memory Server talk to Moses.
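
The idea behind the Project Score can be sketched with Python's standard `difflib` module. This is a hypothetical approximation, not SymEval's actual algorithm: the function name `reuse_score` and the token-level matching are assumptions chosen for illustration. It compares raw MT output with the post-edited version, segment by segment, and reports the share of MT tokens that survived post-editing.

```python
from difflib import SequenceMatcher


def reuse_score(mt_segments, post_edited_segments):
    """Percentage (0-100) of MT tokens that the post-editor kept unchanged."""
    matched = total = 0
    for mt, post_edited in zip(mt_segments, post_edited_segments):
        mt_tokens, pe_tokens = mt.split(), post_edited.split()
        matcher = SequenceMatcher(None, mt_tokens, pe_tokens, autojunk=False)
        # Sum the sizes of all token runs that appear in both versions.
        matched += sum(block.size for block in matcher.get_matching_blocks())
        total += len(mt_tokens)
    # With no MT tokens at all there is nothing to edit, so report full reuse.
    return 100.0 * matched / total if total else 100.0


# Hypothetical example: the post-editor fixed one word in the second segment.
mt = ["El contrato se rige por las leyes de España .",
      "La garantía dura dos año ."]
post_edited = ["El contrato se rige por las leyes de España .",
               "La garantía dura dos años ."]

score = reuse_score(mt, post_edited)
identical = reuse_score(mt, mt)
```

As on the slide, two identical files give a score of 100. A low score points either at the engine or at the post-editor, which is why one of the two has to be trusted before the other can be tested.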




  Our Next Steps with Moses
   • Scripts to Train (automate through scripts the pre-processing of data to train Moses
     engines: take out placeables like numbers, acronyms, XML tags, e-mail addresses and
     URLs / remove short segments and long segments / spell-check source and target
     / remove segments where target and source have different structures)

   • Scripts to Translate (automate through scripts the pre- and post-processing for
     translations with Moses engines: go from TMX to TXT and from TXT to TMX)

   • Plug-ins for Moses interaction with CAT tools (create or find plug-ins to use Moses
     directly from CAT tools, like the ones for Okapi or GlobalSight. This would let us
     avoid going back and forth between TMX and TXT. We want plug-ins for Moses like the
     Trados plug-ins that connect with FreeTranslation, Systran and others)

   • Moses GUI (create or find a user interface that lets less tech-savvy users interact
     directly with Moses)

   • Interaction of Moses with Cloud Memory Server and Workbench Online

   • Let the post-editor choose a different MT engine for each segment
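
A minimal sketch of the first two items, in Python with only the standard library. Everything in it is an assumption made for illustration: the placeholder tokens, the length limits, the length-ratio check used as a crude stand-in for "different structures", and the function names. Acronym handling and spell checking are left out. It only indicates the shape such scripts could take.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical placeholder tokens; order matters (URLs and e-mails before numbers).
PLACEABLES = [
    (re.compile(r"<[^>]+>"), "__TAG__"),
    (re.compile(r"https?://\S+"), "__URL__"),
    (re.compile(r"\S+@\S+\.\S+"), "__EMAIL__"),
    (re.compile(r"\d+(?:[.,]\d+)*"), "__NUM__"),
]


def mask_placeables(text):
    """Replace tags, URLs, e-mail addresses and numbers with placeholder tokens."""
    for pattern, token in PLACEABLES:
        text = pattern.sub(f" {token} ", text)
    return " ".join(text.split())  # normalize whitespace


def keep_pair(source, target, min_len=3, max_len=80, max_ratio=2.0):
    """Reject very short or very long segments and pairs with mismatched lengths."""
    s_len, t_len = len(source.split()), len(target.split())
    if not (min_len <= s_len <= max_len and min_len <= t_len <= max_len):
        return False
    return max(s_len, t_len) / max(1, min(s_len, t_len)) <= max_ratio


def tmx_to_pairs(tmx_text, src_lang="en", tgt_lang="es"):
    """Extract (source, target) segment pairs from a TMX document string."""
    xml_lang = "{http://www.w3.org/XML/1998/namespace}lang"
    pairs = []
    for tu in ET.fromstring(tmx_text).iter("tu"):
        segs = {}
        for tuv in tu.iter("tuv"):
            lang = (tuv.get(xml_lang) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() flattens inline markup inside the segment.
                segs[lang[:2]] = "".join(seg.itertext())
        if src_lang in segs and tgt_lang in segs:
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs


# Hypothetical two-unit TMX sample.
SAMPLE_TMX = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en-US"><seg>Visit http://example.com before 31 May 2012 .</seg></tuv>
<tuv xml:lang="es-ES"><seg>Visite http://example.com antes del 31 de mayo de 2012 .</seg></tuv></tu>
<tu><tuv xml:lang="en-US"><seg>OK</seg></tuv>
<tuv xml:lang="es-ES"><seg>Aceptar</seg></tuv></tu>
</body></tmx>"""

pairs = tmx_to_pairs(SAMPLE_TMX)
cleaned = []
for source, target in pairs:
    source, target = mask_placeables(source), mask_placeables(target)
    if keep_pair(source, target):
        cleaned.append((source, target))
```

Writing the source and target sides of `cleaned` to two line-aligned text files gives the kind of parallel TXT input that Moses training works from. Going back from TXT to TMX would be the reverse step and is not shown.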




  Our Conclusions about Moses up to Now
   • Moses has great potential, but it is not yet optimized for LSPs
   • Its interface is not user-friendly (in fact there is no GUI yet)
   • There are no reliable sources of information for certain niches (a
     lot of work is still needed to obtain a reliable corpus in some
     niches with the tools that exist at the moment)
   • We still have to wait for the development of the new features
     that LSPs require (glossaries, blacklist words, labels)
   • It is necessary to optimize the internal processes of the LSPs to
     adapt to the needs of Moses (Management of Memories, product
     names and trademarks, languages)
   • It's better to work with an engine for a specific client than with
     one for a specific industry or area of specialization




  Looking Back, What Would We Do Differently?
   • Start using a Memory Server earlier
   • A Memory Server is the best way to make sure that your
     company always manages translation memories properly
   • Memories are:
       ▫ The basic input for an LSP to customize an engine
       ▫ The added value created by an LSP for a certain client or domain

   • Start testing Moses one year earlier (we started in 2010
     when we should have started in 2009, but it's never too late!)
   • Dedicate more resources to testing all the other, more mature
     implementations
   • Build engines for clients and not for domains (???)




  What Other LSPs Could Learn from Our Experience
   • LSPs can only profit from running their own MT engines if they
     have a recurrent customer with enough data volume to build an
     engine and to perfect it on a specific domain over time
   • General (non-specific) MT engines (e.g., Google Translate, Microsoft)
     will provide better results for one-time jobs or for recurrent
     clients with low data volume
   • The effort needed to overcome the problems found during the
     deployment stage of integrated MT customizations was pretty
     similar to the effort needed to set up Moses
   • The proprietary MT customizations that we tested still have
     some distance to travel before they offer a turnkey solution
     for an LSP (we prefer Linux to Windows, and for now we prefer
     Ubuntu to RedHat)

Thank you!




Gustavo Lucardi
@glucardi
glucardi@trustedtranslations.com
http://www.trustedtranslations.com
http://translation-blog.trustedtranslations.com

TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Paris, Gustavo Lucardi, Trusted Translations, 4 June 2012

  • 1. TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE Moses: The Trusted Translations Experience 15:30-15:45 Monday 4 June Gustavo Lucardi Trusted Translations
  • 2. Moses: The Trusted Translations Experience TAUS Open Source Machine Translation Pre Conference Workshops Localization World Paris June 4, 2012 Gustavo Lucardi COO Trusted Translations, Inc. @glucardi
  • 3. Moses from the Point of View of an LSP
      • What Exactly are We Doing?
      • Unexpected Experiences and Issues
      • Evaluating Complementary Tools
      • Our Next Steps with Moses
      • Our Conclusions about Moses up to Now
      • Looking Back What Would We Do Differently
      • What other LSPs could learn from Our Experience
  • 4. What exactly are we doing?
      • The real answer is LEARNING MT ASAP
      • If we are asked to choose, we prefer Open Source
      • We built two Moses engines:
         ▫ Legal English to Spanish (many clients) and Technical English to Spanish (1 client)
         ▫ We are re-training those engines with our post-editors' feedback
         ▫ We are also re-building those engines with different Moses customizations
      • Evaluating Complementary Tools
         ▫ Like SymEval, among others
      • Testing other MT solutions
         ▫ Pangea MT (Pangeanic), DoMY (Translation Precision Tools), Smart MATE (Applied Languages), Language Studio (Asia Online), Tauyou, LetsMT, and others
  • 5. Unexpected Experiences and Issues
      • The lack of information and documented procedures on how post-editing should be done (there is a very helpful TAUS paper about that, and we also attended the TAUS Pre Conference this morning)
      • The difficulty of measuring human post-editing quality and productivity (SymEval was helpful for that)
      • The difficulty of measuring the quality of an engine using BLEU (for example, an engine with a BLEU of 70 can produce worse output than an engine with a BLEU of 40)
      • The difficulty of aligning existing parallel data without human intervention
      • Translators' fear of being displaced by MT
      • The focus of MT is improving Quality and Time, not Costs
      • MT + PE is not enough: we are doing MT + PE + E and/or P
  • 6. Evaluating Complementary Tools
      • We want to mention SymEval. It has 3 components: Eval, Extract and Diff.
      • Eval is used with TMX files and produces an XML file highlighting the differences between the two versions.
      • The Project Score can be read as the percentage of the machine translation that was used (two identical files would have a score of 100)
      • We noticed that it can help judge either the quality of the machine translation or the quality of the post-editing. In other words:
         ▫ If you have a good Engine, you could use Eval to test your Post-Editor
         ▫ If you have a good Post-Editor, you could use Eval to test your Engines
      • SymEval is not very user-friendly, and there is not enough support documentation. For example, the help page says that the Eval tool only supports XLIFF files, but it actually supports TMX as well
      • We found a more user-friendly tool in a Memory Server that we were testing, but we still need to make the Memory Server talk to Moses
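
As an illustration of the Project Score idea on slide 6 (how much of the machine translation survives post-editing), here is a minimal Python sketch. It is not SymEval's actual algorithm, which the slides do not describe; it assumes a word-level comparison using the standard library's difflib, and the function names are hypothetical. Like the score described on the slide, two identical files give 100.

```python
from difflib import SequenceMatcher


def matched_tokens(a, b):
    """Number of tokens that two token lists share in matching blocks."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


def project_score(pairs):
    """Word-level similarity (0-100) over (mt_output, post_edited) pairs.

    Assumption: the score is 2 * matched / total tokens, pooled over all
    segments, so long segments weigh more than short ones.
    """
    matched = 0
    total = 0
    for mt_output, post_edited in pairs:
        mt_tokens = mt_output.split()
        pe_tokens = post_edited.split()
        matched += 2 * matched_tokens(mt_tokens, pe_tokens)
        total += len(mt_tokens) + len(pe_tokens)
    # Two empty projects are treated as identical.
    return 100.0 if total == 0 else 100.0 * matched / total
```

Read the same way as on the slide: with a trusted engine, a low score points at heavy post-editing; with a trusted post-editor, a low score points at a weak engine.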
  • 7. Our Next Steps with Moses
      • Scripts to Train: automate through scripts the pre-processing of data to train Moses engines
         ▫ Take out placeables such as numbers, acronyms, XML tags, e-mail addresses and URLs
         ▫ Remove short segments and long segments
         ▫ Spell check on source and target
         ▫ Remove segments where target and source have different structures
      • Scripts to Translate: automate through scripts the pre- and post-processing for translations with Moses engines (go from TMX to TXT and from TXT to TMX)
      • Plug-in for Moses interaction with CAT tools: create or find plug-ins to use Moses directly through CAT tools, like the ones for Okapi or Globalsight. This could allow us to avoid going back and forth between TMX and TXT. We want plug-ins for Moses like the Trados plug-ins that connect with FreeTranslation, Systran and others
      • Moses GUI: create or find a user interface that allows less tech-savvy users to interact directly with Moses
      • Interaction of Moses with Cloud Memory Server and Workbench Online
      • Add the possibility for the Post-Editor to choose a different MT Engine for each segment
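
The "Scripts to Train" steps on slide 7 could be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not the scripts Trusted Translations wrote: the placeholder tokens (`__NUM__` and so on), the length thresholds, and the use of matching placeholder sets as a proxy for "same structure" are all assumptions, and spell checking is left out. TMX is read with the standard library's ElementTree; every function name is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Order matters: tags, URLs and e-mails are masked before acronyms and numbers.
PLACEABLES = [
    (re.compile(r"<[^>]+>"), "__TAG__"),
    (re.compile(r"https?://\S+|www\.\S+"), "__URL__"),
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "__EMAIL__"),
    (re.compile(r"\b[A-Z]{2,}\b"), "__ACR__"),
    (re.compile(r"\b\d+(?:[.,]\d+)*\b"), "__NUM__"),
]
PLACEHOLDER = re.compile(r"__[A-Z]+__")


def tmx_to_pairs(tmx_text, src_lang="en", tgt_lang="es"):
    """Extract (source, target) segment pairs from a TMX document string."""
    pairs = []
    for tu in ET.fromstring(tmx_text).iter("tu"):
        segs = {}
        for tuv in tu.iter("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # Keep only the two-letter language code, e.g. "en-US" -> "en".
                segs[lang[:2]] = "".join(seg.itertext())
        if src_lang in segs and tgt_lang in segs:
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs


def mask_placeables(text):
    """Replace placeables with placeholder tokens and normalize whitespace."""
    for pattern, token in PLACEABLES:
        text = pattern.sub(f" {token} ", text)
    return " ".join(text.split())


def keep_pair(src, tgt, min_len=3, max_len=80, max_ratio=2.0):
    """Decide whether a masked pair is kept (thresholds are assumptions)."""
    n_src, n_tgt = len(src.split()), len(tgt.split())
    if not (min_len <= n_src <= max_len and min_len <= n_tgt <= max_len):
        return False  # too short or too long
    if max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio:
        return False  # lengths too different
    # Structure proxy: both sides must carry the same placeholders.
    return sorted(PLACEHOLDER.findall(src)) == sorted(PLACEHOLDER.findall(tgt))


def clean_pairs(pairs):
    """Mask placeables on both sides, then drop pairs that fail the filters."""
    masked = [(mask_placeables(s), mask_placeables(t)) for s, t in pairs]
    return [(s, t) for s, t in masked if keep_pair(s, t)]
```

The cleaned pairs can then be written one segment per line to two parallel TXT files, which is the plain-text form the TMX-to-TXT step on the slide refers to.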
  • 8. Our Conclusions about Moses up to Now
      • Moses has great potential, but it is not yet optimized for LSPs
      • Its interface is not user-friendly (in fact, there is no GUI yet)
      • There are no reliable sources of information for certain niches (with the tools that exist at the moment, a lot of work is still needed to obtain a reliable corpus in some niches)
      • We still have to wait for the development of the new features that LSPs require (glossaries, blacklist words, labels)
      • LSPs need to optimize their internal processes to adapt to the needs of Moses (management of memories, product names and trademarks, languages)
      • It is better to work with an engine for a specific client than with one for a specific industry or area of specialization
  • 9. Looking Back What Would We Do Differently
      • Start using a Memory Server earlier
      • A Memory Server is the best way to make sure that your company always manages translation memories properly
      • Memories are:
         ▫ The basic input for an LSP to customize an Engine
         ▫ The added value created by an LSP for a certain client or a domain
      • Start testing with Moses one year earlier (we started in 2010 when we should have started in 2009, but it's never too late!)
      • Dedicate more resources to testing all the other, more mature implementations
      • Build Engines for Clients and not for Domains (???)
  • 10. What other LSPs Could Learn from Our Experience
      • LSPs can only profit from running their own MT engines if they have a recurrent customer with enough data volume to build an engine and refine it for a specific domain over time
      • General (non-specific) MT engines (e.g., Google Translate, Microsoft) will provide better results for one-time jobs or for recurrent clients with low data volume
      • The effort required to overcome the problems found during the deployment stage of integrated MT customizations was pretty similar to the effort needed to set up Moses
      • The proprietary MT customizations that we tested still have some way to go before they offer a turnkey solution for an LSP (we prefer Linux to Windows, and for now we prefer Ubuntu to RedHat)