No Hardware. No Software. No Hassle MT.
New Breakthroughs in Machine Translation Technology
in association with #KantanWebinar
KantanMT.Com
Tony O’Dowd
Founder & Chief Architect
What do we aim to cover today?
What is KantanMT.com?
Challenges of the L10N Industry
 Making the right Project Management decisions
 Going beyond the baseline of MT quality
Conclusions
15 minutes
What is KantanMT.com?
Statistical MT System
 Cloud-based =
 Highly scalable
 Inexpensive to operate
 Quick to deploy
Our Vision
 To put Machine Translation:
 Customization
 Improvement
 Deployment
 …into your hands
Active KantanMT Engines: 6,191
Training Words Uploaded: 28,243,234,615
Member Words Translated: 427,526,741
Fully Operational: 15 months
Initial Steps of any project are:
 Determine Scope
 How long will it take?
 How much will it cost?
 What is my margin?
 Determine resources
 How many Translators will I need?
Introducing KantanAnalytics™
 …think Fuzzy-Match report and you’ve got it in one!
Challenge #1
How can Project Managers ‘manage’ Post-Editing Projects?
KantanAnalytics™
Kantan TotalRecall – Advanced TM
% of TM hits in this job
KantanMT – automated translations
% of automated translations for this job
Range of QE Scores
QE range defined to match existing fuzzy match ranges used by
L10N industry
Quality Estimation Scores
Segment level QE scores – akin to fuzzy match scores
Word Counts – Project Stats
Can be used to develop a project timeline and a tiered pricing model for Post-Editing Projects
Placeholder & Tag Counts
Used by PMs for complexity surcharges
KantanAnalytics embeds QE scores into:
 TRADOS Studio
 MemoQ
 XLIFF
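As a rough illustration of how segment-level QE scores and word counts could feed a tiered pricing model, here is a minimal Python sketch. The band boundaries and per-word rates are assumptions invented for this example (they mimic common fuzzy-match ranges); the slides do not state the actual KantanAnalytics ranges or any prices, and `band_for` and `price_job` are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical bands modelled on common fuzzy-match ranges; the slides do not
# list the actual KantanAnalytics ranges or any rates.
BANDS = [(95, "95-100"), (85, "85-94"), (75, "75-84"), (0, "0-74")]
RATE_PER_WORD = {"95-100": 0.02, "85-94": 0.04, "75-84": 0.06, "0-74": 0.10}


def band_for(score):
    """Return the label of the first band whose lower bound the score meets."""
    for lower, label in BANDS:
        if score >= lower:
            return label
    raise ValueError(f"score out of range: {score}")


def price_job(segments):
    """segments: iterable of (word_count, qe_score) pairs.

    Returns (words per band, total cost rounded to 2 decimals)."""
    words = defaultdict(int)
    for word_count, score in segments:
        words[band_for(score)] += word_count
    total = sum(RATE_PER_WORD[band] * count for band, count in words.items())
    return dict(words), round(total, 2)


# Invented sample job: (word count, QE score) per segment.
job = [(120, 97), (300, 88), (250, 80), (330, 40)]
words_per_band, cost = price_job(job)
print(words_per_band, cost)
```

The per-band word totals are the same kind of breakdown a fuzzy-match report gives, which is why a PM could reuse an existing rate card with them.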
KantanAnalytics™
Helping PMs make the right business
decisions!
KantanAnalytics™ - Helping PMs make the right decisions
Challenge #2: Going beyond the baseline and developing production-ready MT!
Easy to build 1st baseline engine
 Aggregate Training Data – TM, Mono, Stock, Terminology
 Use Cloud-based platform, like KantanMT.com
Real Challenge:
 How do these platforms go beyond the baseline engine and achieve higher levels of production quality?
Introducing Kantan BuildAnalytics
 Data analytics and visualisation providing insights into the
customisation of SMT engines.
Kantan BuildAnalytics™
Rapidly develop production-ready engines
 Summary Report
 Training Rejects Reports
 F-Measure Analysis
 BLEU Analysis
 TER Analysis
 GAP Analysis
 Timeline Report
 Deep Tuning
Kantan BuildAnalytics™
F-Measure Score
Measures word recall & precision of KantanMT engines
Distributions
Provides distribution of F-Measure scores across all reference
translations
Kantan Insight™
Holistic analysis of score and advice on how to improve this for
KantanMT engines
Detailed Analysis
Segment level F-Measure analysis to help SMT Developers
improve training material
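To make "word recall & precision" concrete, the following Python sketch computes a word-level F-Measure for one segment against one reference. It is a generic bag-of-words formulation with whitespace tokenisation, assumed here for illustration; the slides do not say how KantanMT tokenises or aggregates scores, and the example sentences are invented.

```python
from collections import Counter


def f_measure(candidate, reference):
    """Word-level precision, recall and F1 using clipped bag-of-words overlap."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())  # share of MT words found in the reference
    recall = overlap / sum(ref.values())      # share of reference words the MT produced
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = f_measure(
    "the valve must be closed",       # MT output (invented)
    "the valve has to be closed",     # reference translation (invented)
)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Running this per segment and collecting the F1 values would give the kind of score distribution the slide mentions.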
Kantan BuildAnalytics™
Detailed Reports for: F-Measure, BLEU and TER
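For readers unfamiliar with TER, the sketch below shows the core idea in Python: count the word-level edits needed to turn the MT output into the reference and divide by the reference length. This is a simplification assumed for illustration. Full TER also allows block shifts of word sequences, which this version omits, and nothing here reflects how KantanMT computes its reports.

```python
def word_edit_distance(hyp, ref):
    """Levenshtein distance over word lists (insert, delete, substitute)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def ter_without_shifts(hypothesis, reference):
    """Edits per reference word; lower is better. Shift edits are not modelled."""
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must not be empty")
    return word_edit_distance(hyp, ref) / len(ref)


score = ter_without_shifts("the valve must be closed",
                           "the valve has to be closed")
print(round(score, 3))
```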
Kantan BuildAnalytics™
Gap Analysis – quickest way of improving fluency
Kantan BuildAnalytics™
Training Rejects Report – Improve training data rapidly
Kantan BuildAnalytics™
Timeline – Tracks history of KantanMT engines
Kantan BuildAnalytics™ - Rapid MT Customisation
bmmt GmbH and KantanMT:
The Real-World Use
of Machine Translation
Maxim Khalilov
Technical Lead
bmmt GmbH
maxim.khalilov@machine-translation.eu
KantanMT webinar
April 10, 2014
MT in industry: context and rationale
The combination of these two technologies, well-established TM and cutting-edge MT, plus
post-editing allows the creation of a high-quality translation that reads just as well as a
“classically” produced translation.
MT in industry: what about cost?
The cost structure changes when machine translation is integrated into the translation pipeline.
When machine translation is adopted, the data preparation and quality assurance (editing) costs rise
whereas translation costs fall to as low as zero. Most importantly, the total cost of translation is
reduced dramatically as illustrated.
MT case study
 Customer: big German machine manufacturer
 Project: 51,000 words, technical documentation. English into German. Approach: hybrid MT/TM.
 Settings: the files were processed through Trados Studio 2011.
 Implementation: KantanMT
 Description: Roughly 7,000 words came from TM as high matches. The remainder went through
MT-based pretranslation, followed by a post-editing cycle, with the overall goal to produce the
same level of quality as in an all-human translation.
 Training material: Our customer had not worked in this language combination before, so there was
no TM to go on. But we knew that the English authors based their work on material that the
customer had previously translated from German into English. Thus we reversed the language
direction of the TM and trained a customer-specific engine with this TM.
 Results: As a result, 44,000 words were post-edited to a final quality level that the customer was
very happy with.
 Cost savings > 30%.
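The TM reversal described above can be pictured with a small Python sketch. The `TranslationUnit` type, the `reverse_direction` helper and the sample segment are all invented for this illustration; a real TM would be read from a format such as TMX, and the slides do not describe the tooling bmmt used. The final lines only restate the word counts from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranslationUnit:
    source_lang: str
    target_lang: str
    source: str
    target: str


def reverse_direction(units):
    """Swap source and target so a de->en memory can train an en->de engine."""
    return [
        TranslationUnit(u.target_lang, u.source_lang, u.target, u.source)
        for u in units
    ]


# Invented example segment, not taken from the customer's TM.
tm_de_en = [TranslationUnit("de", "en", "Ventil schließen", "Close the valve")]
tm_en_de = reverse_direction(tm_de_en)
print(tm_en_de[0])

# Volumes reported on the slide: total project size and high TM matches.
total_words, tm_words = 51_000, 7_000
post_edited_words = total_words - tm_words
print(post_edited_words)
```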
MT: benefits of KantanMT solution
 Fully automated system training
 One-click system customization
 Automatic data pre-processing
 Fully automated translation
 Automatic pre- and post-processing
 Quality assessment
 KantanWatch
 Gap Analysis
 Reject Report
 No worry about maintenance and infrastructure
MT: benefits of KantanMT solution
 Transparent file format conversion
 Training material conversion: TM conversion, monolingual material
 Documents to translate: TMS format into MTable format
 SDLXLIFF
 Smooth terminology integration
 Consistent terminology
 Tag handling and mark-up transfer
Source: <x id="16480"/>SWord1 SWord2 SWord3 SWord4 <g id="16481">Number</g><g id="16480">SWord 8 SWord 9</g>
Target: <x id="16480"/>TWord1 TWord2 TWord3 TWord4 <g id="16480">TWord 8 TWord 9</g><g id="16481">Number</g>
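In the example above the two `<g>` elements change order between source and target, but every tag id survives. A simple way to check that property is sketched below in Python. The regular expression and the `inline_tags` and `tags_preserved` helpers are assumptions made for this illustration and only cover the `<x/>` and `<g>` tags shown on the slide; this is a sanity check, not KantanMT's tag-handling logic.

```python
import re
from collections import Counter

# Matches opening <g id="..."> and standalone <x id="..."/> tags.
# Closing </g> tags start with "</" and are therefore not matched.
TAG = re.compile(r'<(x|g)\s+id="(\d+)"')


def inline_tags(segment):
    """Count (tag name, id) pairs found in a segment."""
    return Counter(TAG.findall(segment))


def tags_preserved(source, target):
    """True if both segments carry the same tags, regardless of order."""
    return inline_tags(source) == inline_tags(target)


source = ('<x id="16480"/>SWord1 SWord2 SWord3 SWord4 '
          '<g id="16481">Number</g><g id="16480">SWord 8 SWord 9</g>')
target = ('<x id="16480"/>TWord1 TWord2 TWord3 TWord4 '
          '<g id="16480">TWord 8 TWord 9</g><g id="16481">Number</g>')

print(tags_preserved(source, target))
```

Comparing counts rather than positions is deliberate: word order differs between languages, so tags are expected to move.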
bmmt GmbH
 Founded in 2013 by a group of language industry experts who wanted to offer innovative translation technology
solutions
 Three operations centers in Germany: Munich, Berlin and Stuttgart
 bmmt GmbH has relied heavily on KantanMT services since 2013
 Primary industries: Automotive and Trucks, Machine Engineering, Telecommunications, Construction, IT
 Types of documents: workshop texts, product catalogues & other highly repetitive information documents
 Primary source language: German
 Integration: SDL Trados, SDL WorldServer and others
 Find out more: www.machine-translation.eu
Berlin
Alt-Moabit 92
10559 Berlin
Phone: +49 30-3117505-15
Fax: +49 30-3117505-20
Munich
Bernhard-Wicki-Straße 5
80636 Munich
Phone: +49 89 2000037-17
Fax: +49 89 2000037-11
Stuttgart
Ruppmannstraße 33b
70565 Stuttgart
Phone: +49 711 16646-66
Fax: +49 711 16646-50
bmmt GmbH
info@machine-translation.eu
Thank you
Speakers
Tony O’Dowd, tonyod@kantanmt.com
Maxim Khalilov, maxim.khalilov@machine-translation.eu
