TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE


A Moses MT engine for legal
translation

By Joël Sigling
Joël Sigling, Director
Modern technology in a traditional sector
Monte Carlo, 25 March 2012
AVB Translations background

•   Amstelveens Vertaalburo: founded 1972 – traditional, high-quality agency

•   Translation World: founded 2002, tech-savvy all-round player

•   Merger in 2010 >> AVB Translations: premium brand with strong tech focus

•   Top 5 player in The Netherlands, 2011 turnover € 4.6 million

•   Core business: general translations – legal, financial, technical, …
    NO software localization (yet!)
History of MT interest

•   Member of TAUS since 2008, 1st round table Amsterdam

•   Attended TAUS User Conferences in the US since 2009

•   Sense of urgency developed, but the 2010 merger was a distraction

•   Action in 2011 after merger

•   2011: chose a Dutch <> English legal (not IT-related!) domain engine

•   Why SMT, why Moses? Quicker, cheaper, similar quality (research shows)
Why legal domain MT engine?

•   Legal translations are approx. 40% of AVB business, 80% Dutch <> English

•   Not the obvious choice: people said MT wouldn’t work for legal because
    sentences are too long and the material too intricate

•   Statistical MT is suited to non-stylistic materials, e.g. legal

•   If this works, we can make MT happen for all other domains
MT engine objectives

•   Increased productivity, no BLEU % target, but tangible, practical results.
    How much extra can a translator do when compared to HT?

•   Tool to offer usable quality with very quick turnarounds for high volume
    (typical “Friday afternoon lawyer requests”)

•   Becoming an MT front runner in the non-localization sector for Dutch
    (5th language in Europe after FIGS)
Developing the Moses engine

•   Choice between in-house and external development
     • In-house: control, developing expertise, lower long-term cost
     • External: lower initial cost, much more expertise > best for now

•   Our prerequisites for the development option
     • ownership of and free access to the engine
     • assurance that data will not be used or copied by the builder
     • acceptable costs for development & usage
     • skilled partner > AsiaOnline, CrossLang, Pangeanic, LetsMT,
        SmartMate??

•   CrossLang > all of the above, closest to our office, independent
What we needed
•   Large quantities of high-quality translation data

     •   Aligning existing high-quality legal translations (took longest to prepare)
     •   Existing legal TMs
     •   Going forward: company-/industry-specific terminology

•   Ways to measure gains

     •   Not just automated evaluation % increase, but also tangible
         improvements > we are entrepreneurs, not scientists
     •   CrossLang automated assessment tool (TER, BLEU, NIST, METEOR)
     •   Manual assessment: e.g. how many hours for post-editing 10,000 words?
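
The manual assessment above comes down to simple arithmetic. A minimal sketch in Python, assuming the same word count is timed once for human translation (HT) and once for post-editing (PE). The function name and the sample hours are hypothetical and are not AVB measurements:

```python
def productivity_gain_pct(ht_hours: float, pe_hours: float) -> float:
    """Percentage throughput gain of post-editing (PE) over human
    translation (HT), measured on the same number of words."""
    if ht_hours <= 0 or pe_hours <= 0:
        raise ValueError("hours must be positive")
    # Same word count in both runs, so the ratio of throughputs
    # equals the inverse ratio of the hours spent.
    return (ht_hours / pe_hours - 1.0) * 100.0


# Hypothetical timings for a 10,000-word sample (illustration only).
gain = productivity_gain_pct(ht_hours=10.0, pe_hours=8.0)
print(f"PE is {gain:.0f}% more productive than HT")
```

A positive result means post-editing is faster; a negative one means the MT output slowed the translator down.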
Input data

•   Highest-quality AVB Dutch <> English legal translations: approx.
    700k words per language. Predominantly civil law.

•   Not fully reviewed AVB TM, still high-quality: approx. 10 million
    words per language. Predominantly civil law.

•   Legal translations harvested by CrossLang, more diverse legal
    material: 7 million words per language
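
To put these volumes in proportion, here is a small Python sketch that totals the three sets and prints each one's share. It assumes all three sets are combined into a single training corpus. The figures are the approximate ones given above, and the dictionary labels are shorthand introduced here:

```python
# Approximate words per language, as reported on this slide.
corpora = {
    "AVB highest-quality translations": 700_000,
    "AVB TM (not fully reviewed)": 10_000_000,
    "CrossLang harvested legal material": 7_000_000,
}

total = sum(corpora.values())

for name, words in corpora.items():
    # Share of each set in the combined corpus.
    print(f"{name}: {words:,} words ({words / total:.1%})")

print(f"Total: {total:,} words per language")
```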
CrossLang automated test results

•   Best results from AVB + harvested data, AVB data weighted extra

•   Results particularly good in civil law domain (bulk of AVB input
    data)

•   Results improved dramatically for other legal domains by adding
    harvested data
AVB results in practice

•   Test done in CrossLang production assessment tool: productivity 5%
    higher for post-editing than for human translation (human output in
    this case already very high at >1,000 words per hour, PE even higher)
AVB results in practice

•   Live rush translations done in past two weeks:

     •   1,500-word trial done for a law firm needing high volume in
         a very short time. Post-edited in 75 minutes. Customer happy
         with quality/price ratio.
     •   25,000 words in two days with moderate PE effort by two
         post-editors. Quality estimate 80-90% of human translation.
     •   4,500 words in 3 hours with almost full PE effort by one
         post-editor. Quality estimate >90% of human translation
     •   15,000 words in one day, done by two post-editors. Quality
         estimate 80-90% of human translation
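
Two of these jobs state an explicit duration, so their throughput can be worked out directly. A minimal Python sketch follows; the helper name is hypothetical. The 25,000-word and 15,000-word jobs are left out because the working hours per day are not stated, and the 1,500-word figure is per job, since the number of post-editors is not given:

```python
def words_per_hour(words: int, minutes: float) -> float:
    """Throughput of a job in words per hour."""
    return words / minutes * 60.0


# (label, word count, duration in minutes), taken from the list above.
jobs = [
    ("1,500-word trial", 1_500, 75),
    ("4,500-word job", 4_500, 180),
]

for label, words, minutes in jobs:
    print(f"{label}: {words_per_hour(words, minutes):.0f} words per hour")
```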
AVB results in practice

•   Test and live projects show great potential in two areas:

     •   Producing usable translations very quickly and at 50-60% of
         normal translation cost. Margins are similar to normal
         translation, but likely to improve!

     •   Higher productivity, i.e. lower production cost and
         increased margins.
CrossLang Gateway benefits
•   Standard Moses engine offers no high-level functions
     • Only plain text files, always sentence by sentence, experimental
        recasing, experimental tag handling

•   CrossLang Gateway offers Java service layer (not wrapper scripts)
     • Most common file formats: Word, XML, XLIFF,
     • Adjustable text segmentation
     • Hardened, alignment-based tag handling
     • Advanced recasing tool based on alignment data
     • Named entity recognition & (re)tokenization
     • Terminology checking and replacement

Gateway features are crucial to processing our material properly
Conclusions

•   Developing a good engine is not an “out of the box” task

•   Sufficient high-quality data is necessary for good results

•   Results are very promising, our objectives can be achieved

•   Working with a value added partner is recommended

•   The need to integrate the MT solution into the translation workflow
    is apparent
Phone:     +31 20 645.66.10
Mobile:    +31 625.025.475
E-mail:    joel.sigling@avb.nl
Twitter:   @JoelAVB
Address:   Ouderkerkerlaan 50
           1185 AD Amstelveen
           The Netherlands
Website:   www.avb.nl
