SlideShare une entreprise Scribd logo
1  sur  10
No Hardware. No Software. No Hassle MT.
Machine Translation & Quality
What we aim to cover?
 The MT & Quality Relationship
 What is quality?
 Possible ways of measuring it
 Automated evaluation methods
 Who needs to measure quality
 Localisation stakeholders
 Conclusion

Machine Translation & Quality
The Quality & MT Relationship

Machine Translation & Quality
Attributes of Quality
 Language Attributes
 Adequacy



Accuracy of generated texts
Based on word recall & precision

 Fluency




Comprehensibility of texts
Readability, understandability
Based on phrase reuse and
assembly

 Task-oriented Attributes
 Productivity


Post-editing speed

 Acceptability



Fit-for-purpose measurement
Usable translations within the
context of the end user

Machine Translation & Quality
Automated Evaluations
 Many difference techniques available



All compute similarity of generated texts to reference texts
The smaller the difference => the better the quality!

NIST

Fluency

Usability

GTM

F-Measure

Productivity

TER

Adequacy

BLEU

Acceptability
METEOR

Language

Task
Machine Translation & Quality
Who needs to measure Quality?
 The Localisation Stakeholder Dilemma
 Developers of MT Engines




Automated BLEU, METEOR, F-MEASURE, TER ideal and practical
No individual measurement has absolute meaning


but points quality curve in the right direction within a domain

Machine Translation & Quality
Who needs to measure Quality?
 The Localisation Stakeholder Dilemma
 Production Teams (PMs, LEs and QEs)


Need segment measurements on quality and PE efforts



Determine tiered segment post-edit rate
Distribution of post-editing tasks based on segment quality

 Localisation Managers


Need productivity measurements to predict budget and schedule



Aka Project Segment Reports
MT Measurements need to ‘fit’ business planning and charge models

 Translators


Unfortunately, don’t get a fair deal


No segment information, just top level project
Machine Translation & Quality
F-Measure

TER

BLEU

GTM

METEOR

NIST

MT Developers

Production

The Quality & MT Relationship

Machine Translation & Quality
Conclusions
 There are many automated MT quality measurements




Mostly suitable for MT developers
Not optimal for production teams
Of no use to translators

 All rely on reference texts to compute measurements

 What’s needed?
 Segment level measurements



Drive project schedule and charge model
High correlation to human effort

 Do not rely on reference texts to compute measurements

Machine Translation & Quality

Contenu connexe

Plus de kantanmt

KantanFest: Tony O'Dowd
KantanFest: Tony O'DowdKantanFest: Tony O'Dowd
KantanFest: Tony O'Dowdkantanmt
 
Get Started with KantanNeural
Get Started with KantanNeuralGet Started with KantanNeural
Get Started with KantanNeuralkantanmt
 
You Asked, We Will Answer
You Asked, We Will AnswerYou Asked, We Will Answer
You Asked, We Will Answerkantanmt
 
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT SystemsATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systemskantanmt
 
Cross Border Selling: Breaking the Language Barrier with Automated Translation
Cross Border Selling: Breaking the Language Barrier with Automated TranslationCross Border Selling: Breaking the Language Barrier with Automated Translation
Cross Border Selling: Breaking the Language Barrier with Automated Translationkantanmt
 
Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...kantanmt
 
Webinar automotive and engineering content 16.06.16
Webinar   automotive and engineering content 16.06.16Webinar   automotive and engineering content 16.06.16
Webinar automotive and engineering content 16.06.16kantanmt
 
IC4 Cloud Security Workshop 2016
IC4 Cloud Security Workshop 2016IC4 Cloud Security Workshop 2016
IC4 Cloud Security Workshop 2016kantanmt
 
New Ways to Engage Clients with Custom Machine Translation
New Ways to Engage Clients with Custom Machine TranslationNew Ways to Engage Clients with Custom Machine Translation
New Ways to Engage Clients with Custom Machine Translationkantanmt
 
Improving your Bottom Line with Custom Machine Translation
Improving your Bottom Line with Custom Machine TranslationImproving your Bottom Line with Custom Machine Translation
Improving your Bottom Line with Custom Machine Translationkantanmt
 
How to Achieve Agile Localization for High-Volume Content with Machine Transl...
How to Achieve Agile Localization for High-Volume Content with Machine Transl...How to Achieve Agile Localization for High-Volume Content with Machine Transl...
How to Achieve Agile Localization for High-Volume Content with Machine Transl...kantanmt
 
How to Improve Translation Productivity
How to Improve Translation ProductivityHow to Improve Translation Productivity
How to Improve Translation Productivitykantanmt
 
How to save 16 million euro for your start up business
How to save 16 million euro for your start up businessHow to save 16 million euro for your start up business
How to save 16 million euro for your start up businesskantanmt
 
What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?kantanmt
 
Tips for Preparing Training Data for High Quality Machine Translation
Tips for Preparing Training Data for High Quality Machine TranslationTips for Preparing Training Data for High Quality Machine Translation
Tips for Preparing Training Data for High Quality Machine Translationkantanmt
 
EAMT Workshop 2015 - KantanMT
EAMT Workshop 2015 - KantanMTEAMT Workshop 2015 - KantanMT
EAMT Workshop 2015 - KantanMTkantanmt
 
Breaking Language Barriers: Machine Translation for eCommerce
Breaking Language Barriers: Machine Translation for eCommerceBreaking Language Barriers: Machine Translation for eCommerce
Breaking Language Barriers: Machine Translation for eCommercekantanmt
 
Cloud Computing: IC4 Cloud On-Boarding Clinic, DCU
Cloud Computing: IC4 Cloud On-Boarding Clinic, DCUCloud Computing: IC4 Cloud On-Boarding Clinic, DCU
Cloud Computing: IC4 Cloud On-Boarding Clinic, DCUkantanmt
 
How to set up a high tech business in the Cloud for 2,000 EUR
How to set up a high tech business in the Cloud for 2,000 EURHow to set up a high tech business in the Cloud for 2,000 EUR
How to set up a high tech business in the Cloud for 2,000 EURkantanmt
 
How Does Your MT System Measure Up? tekom/tcworld 2014
How Does Your MT System Measure Up? tekom/tcworld 2014 How Does Your MT System Measure Up? tekom/tcworld 2014
How Does Your MT System Measure Up? tekom/tcworld 2014 kantanmt
 

Plus de kantanmt (20)

KantanFest: Tony O'Dowd
KantanFest: Tony O'DowdKantanFest: Tony O'Dowd
KantanFest: Tony O'Dowd
 
Get Started with KantanNeural
Get Started with KantanNeuralGet Started with KantanNeural
Get Started with KantanNeural
 
You Asked, We Will Answer
You Asked, We Will AnswerYou Asked, We Will Answer
You Asked, We Will Answer
 
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT SystemsATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
ATC Summit 2016: The 7th Habit of 7 Habits of Effective MT Systems
 
Cross Border Selling: Breaking the Language Barrier with Automated Translation
Cross Border Selling: Breaking the Language Barrier with Automated TranslationCross Border Selling: Breaking the Language Barrier with Automated Translation
Cross Border Selling: Breaking the Language Barrier with Automated Translation
 
Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...Go global with this Winning Combination – Content strategy and Machine Transl...
Go global with this Winning Combination – Content strategy and Machine Transl...
 
Webinar automotive and engineering content 16.06.16
Webinar   automotive and engineering content 16.06.16Webinar   automotive and engineering content 16.06.16
Webinar automotive and engineering content 16.06.16
 
IC4 Cloud Security Workshop 2016
IC4 Cloud Security Workshop 2016IC4 Cloud Security Workshop 2016
IC4 Cloud Security Workshop 2016
 
New Ways to Engage Clients with Custom Machine Translation
New Ways to Engage Clients with Custom Machine TranslationNew Ways to Engage Clients with Custom Machine Translation
New Ways to Engage Clients with Custom Machine Translation
 
Improving your Bottom Line with Custom Machine Translation
Improving your Bottom Line with Custom Machine TranslationImproving your Bottom Line with Custom Machine Translation
Improving your Bottom Line with Custom Machine Translation
 
How to Achieve Agile Localization for High-Volume Content with Machine Transl...
How to Achieve Agile Localization for High-Volume Content with Machine Transl...How to Achieve Agile Localization for High-Volume Content with Machine Transl...
How to Achieve Agile Localization for High-Volume Content with Machine Transl...
 
How to Improve Translation Productivity
How to Improve Translation ProductivityHow to Improve Translation Productivity
How to Improve Translation Productivity
 
How to save 16 million euro for your start up business
How to save 16 million euro for your start up businessHow to save 16 million euro for your start up business
How to save 16 million euro for your start up business
 
What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?
 
Tips for Preparing Training Data for High Quality Machine Translation
Tips for Preparing Training Data for High Quality Machine TranslationTips for Preparing Training Data for High Quality Machine Translation
Tips for Preparing Training Data for High Quality Machine Translation
 
EAMT Workshop 2015 - KantanMT
EAMT Workshop 2015 - KantanMTEAMT Workshop 2015 - KantanMT
EAMT Workshop 2015 - KantanMT
 
Breaking Language Barriers: Machine Translation for eCommerce
Breaking Language Barriers: Machine Translation for eCommerceBreaking Language Barriers: Machine Translation for eCommerce
Breaking Language Barriers: Machine Translation for eCommerce
 
Cloud Computing: IC4 Cloud On-Boarding Clinic, DCU
Cloud Computing: IC4 Cloud On-Boarding Clinic, DCUCloud Computing: IC4 Cloud On-Boarding Clinic, DCU
Cloud Computing: IC4 Cloud On-Boarding Clinic, DCU
 
How to set up a high tech business in the Cloud for 2,000 EUR
How to set up a high tech business in the Cloud for 2,000 EURHow to set up a high tech business in the Cloud for 2,000 EUR
How to set up a high tech business in the Cloud for 2,000 EUR
 
How Does Your MT System Measure Up? tekom/tcworld 2014
How Does Your MT System Measure Up? tekom/tcworld 2014 How Does Your MT System Measure Up? tekom/tcworld 2014
How Does Your MT System Measure Up? tekom/tcworld 2014
 

Dernier

Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAnitaRaj43
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard37
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 

Dernier (20)

Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by Anitaraj
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 

What is Quality? A Machine Translation Perspective

  • 1. No Hardware. No Software. No Hassle MT.
  • 3. What we aim to cover?  The MT & Quality Relationship  What is quality?  Possible ways of measuring it  Automated evaluation methods  Who needs to measure quality  Localisation stakeholders  Conclusion Machine Translation & Quality
  • 4. The Quality & MT Relationship Machine Translation & Quality
  • 5. Attributes of Quality  Language Attributes  Adequacy   Accuracy of generated texts Based on word recall & precision  Fluency    Comprehensibility of texts Readability, understandability Based on phrase reuse and assembly  Task-oriented Attributes  Productivity  Post-editing speed  Acceptability   Fit-for-purpose measurement Usable translations within the context of the end user Machine Translation & Quality
  • 6. Automated Evaluations  Many difference techniques available   All compute similarity of generated texts to reference texts The smaller the difference => the better the quality! NIST Fluency Usability GTM F-Measure Productivity TER Adequacy BLEU Acceptability METEOR Language Task Machine Translation & Quality
  • 7. Who needs to measure Quality?  The Localisation Stakeholder Dilemma  Developers of MT Engines   Automated BLEU, METEOR, F-MEASURE, TER ideal and practical No individual measurement has absolute meaning  but points quality curve in the right direction within a domain Machine Translation & Quality
  • 8. Who needs to measure Quality?  The Localisation Stakeholder Dilemma  Production Teams (PMs, LEs and QEs)  Need segment measurements on quality and PE efforts   Determine tiered segment post-edit rate Distribution of post-editing tasks based on segment quality  Localisation Managers  Need productivity measurements to predict budget and schedule   Aka Project Segment Reports MT Measurements need to ‘fit’ business planning and charge models  Translators  Unfortunately, don’t get a fair deal  No segment information, just top level project Machine Translation & Quality
  • 9. F-Measure TER BLEU GTM METEOR NIST MT Developers Production The Quality & MT Relationship Machine Translation & Quality
  • 10. Conclusions  There are many automated MT quality measurements    Mostly suitable for MT developers Not optimal for production teams Of no use to translators  All rely on reference texts to compute measurements  What’s needed?  Segment level measurements   Drive project schedule and charge model High correlation to human effort  Do not rely on reference texts to compute measurements Machine Translation & Quality