Machine translation evaluation
Hermes Traducciones y Servicios Lingüísticos
MT at Hermes
• Pure RBMT engines with pre- and post-processing macros.
• Texts from technical domains.
• The applied-technology department has been working on MT engines for over a year.
• Over 250,000 words post-edited with internal engines in the last year.
• Average new word count for projects post-edited with internal engines: 9,000 words.

Our purpose with MT evals
Automated metrics might help us:
• predict PE time and productivity gains;
• negotiate reasonable discounts;
• evaluate the quality of engines;
• measure the performance of the applied-technology department;
• avoid depending on human-reported data.

What we hoped to find
• We hoped some metric would correlate with the productivity gain data provided by post-editors.
• We gathered BLEU, F-Measure, METEOR and TER values.
• Ideally, we would end up relying on automated metrics rather than on the time and productivity measurements reported by post-editors.
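
One way to check whether a metric tracks reported productivity is a correlation coefficient. The sketch below computes Pearson's r in plain Python. The function name `pearson` and the sample values are hypothetical and invented purely for illustration; they are not data from these slides, and the slides do not say which statistic, if any, was used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-project values, invented for illustration only.
bleu_scores = [45.0, 52.0, 60.0, 68.0, 71.0]
productivity_gain_pct = [30.0, 80.0, 25.0, 110.0, 40.0]

if __name__ == "__main__":
    r = pearson(bleu_scores, productivity_gain_pct)
    print(f"Pearson r between BLEU and productivity gain: {r:.2f}")
```

A value of r near 1 or -1 would support using the metric as a stand-in for reported productivity; a value near 0 would not.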
[Figures, slides 5 and 6: two charts titled "What we hoped to find". Horizontal axis: "Productivity gain %", 0.00 to 120.00. Vertical axis: 0.00 to 120.00. The plotted data did not survive extraction.]
What we actually found: No correlation
[Figures, slides 7 and 8: two charts with the series BLEU, F-Measure, TER and METEOR. Horizontal axis: "Productivity gain %", 0.00 to 160.00. Vertical axis: 0.00 to 100.00. The plotted data did not survive extraction.]
Reasons for the variability
• Different CAT environments (Trados Studio, memoQ, Idiom, TagEditor, etc.).
• Different engines (per domain, per client, etc.).
• Different clients, different needs.
• Different post-editors.
  ◦ Or, if the same post-editor, different post-editing skills over time.
• Different word volumes.
• Specific productivity- or consistency-enhancement processing can affect metrics negatively.
Productivity-enhancement example
• Source: Add events as described in Adding Events to a Model.
• PE: Agregue los eventos como se describe en Adición de eventos a un modelo.
• Raw 1: Agregue los eventos como se describe en la adición de los eventos a un modelo.
• Raw 2: Agregue los eventos como se describe en Adding Events to a Model.
• Scores:

        Raw 1   Raw 2
BLEU    68.59   53.33
TER     17.65   29.41

Raw 1 scores significantly better on both metrics (higher BLEU, lower TER), but Raw 2 is faster to post-edit thanks to automatic terminology insertion tools (such as Xbench).
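
To make the example concrete, the sketch below counts word-level edits between each raw output and the post-edited sentence. It is a simplified edit rate (insertions, deletions and substitutions over whitespace tokens, with no block shifts and no special tokenization), so it is closer to a word error rate than to full TER, and its percentages will not match the TER figures on the slide. The function names are hypothetical.

```python
def word_edit_distance(hyp, ref):
    """Minimum number of word insertions, deletions and substitutions."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def edit_rate(hyp_text, ref_text):
    """Word edits as a percentage of the reference length (no shifts)."""
    hyp, ref = hyp_text.split(), ref_text.split()
    return 100.0 * word_edit_distance(hyp, ref) / len(ref)


PE = "Agregue los eventos como se describe en Adición de eventos a un modelo."
RAW_1 = "Agregue los eventos como se describe en la adición de los eventos a un modelo."
RAW_2 = "Agregue los eventos como se describe en Adding Events to a Model."

if __name__ == "__main__":
    for name, raw in (("Raw 1", RAW_1), ("Raw 2", RAW_2)):
        edits = word_edit_distance(raw.split(), PE.split())
        print(f"{name}: {edits} word edits, edit rate {edit_rate(raw, PE):.2f}")
```

With whitespace tokenization this counts 3 word edits for Raw 1 and 5 for Raw 2, so an edit-based metric ranks Raw 1 higher. It cannot see that the untranslated title in Raw 2 is one that a terminology tool can fill in automatically.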
Human evaluation
• Adequacy: How much of the meaning expressed in the gold-standard translation or the source is also expressed in the target translation?
    4. Everything
    3. Most
    2. Little
    1. None

• Fluency: To what extent is a target-side translation grammatically well formed, without spelling errors and experienced as using natural/intuitive language by a native speaker?
    4. Flawless
    3. Good
    2. Disfluent
    1. Incomprehensible
Source: TAUS MT evaluation guidelines
https://evaluation.taus.net/resources/adequacy-fluency-guidelines
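
As a rough illustration of how ratings on these two 4-point scales might be recorded and summarized, the sketch below averages a list of scores and reports the nearest scale label. Averaging is only one possible aggregation and is an assumption here, not something the slides or the TAUS guidelines prescribe; the names `ADEQUACY`, `FLUENCY` and `summarize` are hypothetical.

```python
from statistics import mean

# Scale labels as listed on the slide.
ADEQUACY = {4: "Everything", 3: "Most", 2: "Little", 1: "None"}
FLUENCY = {4: "Flawless", 3: "Good", 2: "Disfluent", 1: "Incomprehensible"}


def summarize(ratings, scale):
    """Average a list of 1-4 ratings and return (average, nearest label)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if rating not in scale:
            raise ValueError(f"rating {rating!r} is not on the 1-4 scale")
    avg = mean(ratings)
    # Pick the scale level closest to the average score.
    nearest = min(scale, key=lambda level: abs(level - avg))
    return avg, scale[nearest]


if __name__ == "__main__":
    # Hypothetical ratings from three evaluators for one segment.
    print(summarize([4, 3, 4], ADEQUACY))
    print(summarize([3, 3, 2], FLUENCY))
```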
Conclusions
• We combine automated metrics with the time/productivity data reported by post-editors for the final evaluation of internal MT performance.
• Poor post-editing skills or any project-specific contingency can be counterbalanced by good automated metrics.
• We look for qualitative information in automated metrics, not quantitative.
• BLEU values of 65 and 70 for two different engines tell us that both are good engines, not that one will render 5% better results than the other.

Contenu connexe

Tendances

Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)Arle Lommel
 
Defining Translation Quality in ASTM
Defining Translation Quality in ASTMDefining Translation Quality in ASTM
Defining Translation Quality in ASTMSerge Gladkoff
 
TAUS Quality Dashboard and the integration of DQF in translation technologies...
TAUS Quality Dashboard and the integration of DQF in translation technologies...TAUS Quality Dashboard and the integration of DQF in translation technologies...
TAUS Quality Dashboard and the integration of DQF in translation technologies...TAUS - The Language Data Network
 
High Volume, Rapid Turn Around Localization: Lessons Learned
High Volume, Rapid Turn Around Localization: Lessons LearnedHigh Volume, Rapid Turn Around Localization: Lessons Learned
High Volume, Rapid Turn Around Localization: Lessons LearnedSDL
 
Top Trans Survey Translation Issues
Top Trans Survey Translation IssuesTop Trans Survey Translation Issues
Top Trans Survey Translation IssuesRaya Wasser
 
The Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine TranslationThe Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine TranslationIconic Translation Machines
 
How Technology Has Changed the World of Technical Translation
How Technology Has Changed the World of Technical TranslationHow Technology Has Changed the World of Technical Translation
How Technology Has Changed the World of Technical TranslationTennycut
 
Technical_translation_is_it_really_about_terminology_en
Technical_translation_is_it_really_about_terminology_enTechnical_translation_is_it_really_about_terminology_en
Technical_translation_is_it_really_about_terminology_enVyacheslav Guzovsky
 
Keys to successful technical translation
Keys to successful technical translationKeys to successful technical translation
Keys to successful technical translationTrue Language
 
Good Applications of Bad Machine Translation
Good Applications of Bad Machine TranslationGood Applications of Bad Machine Translation
Good Applications of Bad Machine Translationbdonaldson
 
Language translator internship report
Language translator internship reportLanguage translator internship report
Language translator internship reportSumitSumit26
 
Technical translation (1)
Technical translation (1)Technical translation (1)
Technical translation (1)Brian Cannon
 
Panel: Translation Quality Challenges
Panel: Translation Quality ChallengesPanel: Translation Quality Challenges
Panel: Translation Quality ChallengesSDL
 
2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)
2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)
2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)RIILP
 

Tendances (20)

Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)
 
Defining Translation Quality in ASTM
Defining Translation Quality in ASTMDefining Translation Quality in ASTM
Defining Translation Quality in ASTM
 
TAUS Quality Dashboard and the integration of DQF in translation technologies...
TAUS Quality Dashboard and the integration of DQF in translation technologies...TAUS Quality Dashboard and the integration of DQF in translation technologies...
TAUS Quality Dashboard and the integration of DQF in translation technologies...
 
High Volume, Rapid Turn Around Localization: Lessons Learned
High Volume, Rapid Turn Around Localization: Lessons LearnedHigh Volume, Rapid Turn Around Localization: Lessons Learned
High Volume, Rapid Turn Around Localization: Lessons Learned
 
Top Trans Survey Translation Issues
Top Trans Survey Translation IssuesTop Trans Survey Translation Issues
Top Trans Survey Translation Issues
 
The Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine TranslationThe Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine Translation
 
How Technology Has Changed the World of Technical Translation
How Technology Has Changed the World of Technical TranslationHow Technology Has Changed the World of Technical Translation
How Technology Has Changed the World of Technical Translation
 
Technical_translation_is_it_really_about_terminology_en
Technical_translation_is_it_really_about_terminology_enTechnical_translation_is_it_really_about_terminology_en
Technical_translation_is_it_really_about_terminology_en
 
Keys to successful technical translation
Keys to successful technical translationKeys to successful technical translation
Keys to successful technical translation
 
Technical Translation
Technical TranslationTechnical Translation
Technical Translation
 
Back translation explained: what we do and what you get
Back translation explained: what we do and what you getBack translation explained: what we do and what you get
Back translation explained: what we do and what you get
 
Good Applications of Bad Machine Translation
Good Applications of Bad Machine TranslationGood Applications of Bad Machine Translation
Good Applications of Bad Machine Translation
 
Steps in translation process
Steps in translation processSteps in translation process
Steps in translation process
 
Language translator internship report
Language translator internship reportLanguage translator internship report
Language translator internship report
 
The 3 types of translation review – and when to use them
The 3 types of translation review – and when to use themThe 3 types of translation review – and when to use them
The 3 types of translation review – and when to use them
 
MT Use in Lingosail, by Yongpeng Wei, Lingosail
MT Use in Lingosail, by Yongpeng Wei, LingosailMT Use in Lingosail, by Yongpeng Wei, Lingosail
MT Use in Lingosail, by Yongpeng Wei, Lingosail
 
Technical translation (1)
Technical translation (1)Technical translation (1)
Technical translation (1)
 
Panel: Translation Quality Challenges
Panel: Translation Quality ChallengesPanel: Translation Quality Challenges
Panel: Translation Quality Challenges
 
2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)
2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)
2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)
 
Insights in the MT Market, by Jaap van der Meer, TAUS
Insights in the MT Market, by Jaap van der Meer, TAUSInsights in the MT Market, by Jaap van der Meer, TAUS
Insights in the MT Market, by Jaap van der Meer, TAUS
 

En vedette

3. Natalia Konstantinova (UoW) EXPERT Introduction
3. Natalia Konstantinova (UoW) EXPERT Introduction3. Natalia Konstantinova (UoW) EXPERT Introduction
3. Natalia Konstantinova (UoW) EXPERT IntroductionRIILP
 
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for TranslationRIILP
 
1. EXPERT Winter School Partner Introductions
1. EXPERT Winter School Partner Introductions1. EXPERT Winter School Partner Introductions
1. EXPERT Winter School Partner IntroductionsRIILP
 
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
5. manuel arcedillo & juanjo arevalillo (hermes) translation memoriesRIILP
 
8. Qun Liu (DCU) Hybrid Solutions for Translation
8. Qun Liu (DCU) Hybrid Solutions for Translation8. Qun Liu (DCU) Hybrid Solutions for Translation
8. Qun Liu (DCU) Hybrid Solutions for TranslationRIILP
 
17. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 217. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 2RIILP
 
16. Anne Schumann (USAAR) Terminology and Ontologies 1
16. Anne Schumann (USAAR) Terminology and Ontologies 116. Anne Schumann (USAAR) Terminology and Ontologies 1
16. Anne Schumann (USAAR) Terminology and Ontologies 1RIILP
 
14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for TranslationRIILP
 
7. Trevor Cohn (usfd) Statistical Machine Translation
7. Trevor Cohn (usfd) Statistical Machine Translation7. Trevor Cohn (usfd) Statistical Machine Translation
7. Trevor Cohn (usfd) Statistical Machine TranslationRIILP
 
2. Constantin Orasan (UoW) EXPERT Introduction
2. Constantin Orasan (UoW) EXPERT Introduction2. Constantin Orasan (UoW) EXPERT Introduction
2. Constantin Orasan (UoW) EXPERT IntroductionRIILP
 
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...RIILP
 
6. Khalil Sima'an (UVA) Statistical Machine Translation
6. Khalil Sima'an (UVA) Statistical Machine Translation6. Khalil Sima'an (UVA) Statistical Machine Translation
6. Khalil Sima'an (UVA) Statistical Machine TranslationRIILP
 
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...RIILP
 
13. Constantin Orasan (UoW) Natural Language Processing for Translation
13. Constantin Orasan (UoW) Natural Language Processing for Translation13. Constantin Orasan (UoW) Natural Language Processing for Translation
13. Constantin Orasan (UoW) Natural Language Processing for TranslationRIILP
 

En vedette (14)

3. Natalia Konstantinova (UoW) EXPERT Introduction
3. Natalia Konstantinova (UoW) EXPERT Introduction3. Natalia Konstantinova (UoW) EXPERT Introduction
3. Natalia Konstantinova (UoW) EXPERT Introduction
 
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
 
1. EXPERT Winter School Partner Introductions
1. EXPERT Winter School Partner Introductions1. EXPERT Winter School Partner Introductions
1. EXPERT Winter School Partner Introductions
 
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
 
8. Qun Liu (DCU) Hybrid Solutions for Translation
8. Qun Liu (DCU) Hybrid Solutions for Translation8. Qun Liu (DCU) Hybrid Solutions for Translation
8. Qun Liu (DCU) Hybrid Solutions for Translation
 
17. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 217. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 2
 
16. Anne Schumann (USAAR) Terminology and Ontologies 1
16. Anne Schumann (USAAR) Terminology and Ontologies 116. Anne Schumann (USAAR) Terminology and Ontologies 1
16. Anne Schumann (USAAR) Terminology and Ontologies 1
 
14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation
 
7. Trevor Cohn (usfd) Statistical Machine Translation
7. Trevor Cohn (usfd) Statistical Machine Translation7. Trevor Cohn (usfd) Statistical Machine Translation
7. Trevor Cohn (usfd) Statistical Machine Translation
 
2. Constantin Orasan (UoW) EXPERT Introduction
2. Constantin Orasan (UoW) EXPERT Introduction2. Constantin Orasan (UoW) EXPERT Introduction
2. Constantin Orasan (UoW) EXPERT Introduction
 
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
 
6. Khalil Sima'an (UVA) Statistical Machine Translation
6. Khalil Sima'an (UVA) Statistical Machine Translation6. Khalil Sima'an (UVA) Statistical Machine Translation
6. Khalil Sima'an (UVA) Statistical Machine Translation
 
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
 
13. Constantin Orasan (UoW) Natural Language Processing for Translation
13. Constantin Orasan (UoW) Natural Language Processing for Translation13. Constantin Orasan (UoW) Natural Language Processing for Translation
13. Constantin Orasan (UoW) Natural Language Processing for Translation
 

Similaire à Machine translation evaluation metrics provide no correlation with post-editor productivity gains

Tech capabilities with_sa
Tech capabilities with_saTech capabilities with_sa
Tech capabilities with_saRobert Martin
 
Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...SDL
 
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWS
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWSSeeing the Wood for the Trees in MT Evaluation: an LSP success story from RWS
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWSIconic Translation Machines
 
Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura CasanellasWelocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura CasanellasWelocalize
 
Evaluation of MT Quality/Productivity at eBay - AMTA 2018
Evaluation of MT Quality/Productivity at eBay - AMTA 2018Evaluation of MT Quality/Productivity at eBay - AMTA 2018
Evaluation of MT Quality/Productivity at eBay - AMTA 2018Jose Luis Bonilla Sánchez
 
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L MargMT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L MargWelocalize
 
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...TAUS - The Language Data Network
 
State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)Konstantin Savenkov
 
Carla Parra Escartin - ER2 Hermes Traducciones
Carla Parra Escartin - ER2 Hermes Traducciones Carla Parra Escartin - ER2 Hermes Traducciones
Carla Parra Escartin - ER2 Hermes Traducciones RIILP
 
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...Welocalize
 
Amta 2012-federico (1)
Amta 2012-federico (1)Amta 2012-federico (1)
Amta 2012-federico (1)FabiolaPanetti
 
Quality is in the Eye of the Beholder - Part 2
Quality is in the Eye of the Beholder - Part 2Quality is in the Eye of the Beholder - Part 2
Quality is in the Eye of the Beholder - Part 2Think Latin America
 
Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...
Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...
Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...TAUS - The Language Data Network
 
What machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyWhat machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyIconic Translation Machines
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio... HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...Lifeng (Aaron) Han
 

Similaire à Machine translation evaluation metrics provide no correlation with post-editor productivity gains (20)

Tech capabilities with_sa
Tech capabilities with_saTech capabilities with_sa
Tech capabilities with_sa
 
Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...
 
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWS
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWSSeeing the Wood for the Trees in MT Evaluation: an LSP success story from RWS
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWS
 
Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura CasanellasWelocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
 
Evaluation of MT Quality/Productivity at eBay - AMTA 2018
Evaluation of MT Quality/Productivity at eBay - AMTA 2018Evaluation of MT Quality/Productivity at eBay - AMTA 2018
Evaluation of MT Quality/Productivity at eBay - AMTA 2018
 
TAUS QE Summit 2017 eBay EN-DE MT Pilot
TAUS QE Summit 2017   eBay EN-DE MT PilotTAUS QE Summit 2017   eBay EN-DE MT Pilot
TAUS QE Summit 2017 eBay EN-DE MT Pilot
 
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L MargMT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
 
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
 
State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)
 
Carla Parra Escartin - ER2 Hermes Traducciones
Carla Parra Escartin - ER2 Hermes Traducciones Carla Parra Escartin - ER2 Hermes Traducciones
Carla Parra Escartin - ER2 Hermes Traducciones
 
Ch26
Ch26Ch26
Ch26
 
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
 
Amta 2012-federico (1)
Amta 2012-federico (1)Amta 2012-federico (1)
Amta 2012-federico (1)
 
SE_Unit 2.pptx
SE_Unit 2.pptxSE_Unit 2.pptx
SE_Unit 2.pptx
 
Quality is in the Eye of the Beholder, by Eva Klaudinyova
Quality is in the Eye of the Beholder, by Eva KlaudinyovaQuality is in the Eye of the Beholder, by Eva Klaudinyova
Quality is in the Eye of the Beholder, by Eva Klaudinyova
 
Quality is in the Eye of the Beholder - Part 2
Quality is in the Eye of the Beholder - Part 2Quality is in the Eye of the Beholder - Part 2
Quality is in the Eye of the Beholder - Part 2
 
Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...
Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...
Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...
 
What machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyWhat machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happy
 
Software tools
Software toolsSoftware tools
Software tools
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio... HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 

Plus de RIILP

Gabriella Gonzalez - eTRAD
Gabriella Gonzalez - eTRAD Gabriella Gonzalez - eTRAD
Gabriella Gonzalez - eTRAD RIILP
 
Manuel Herranz - Pangeanic
Manuel Herranz - Pangeanic Manuel Herranz - Pangeanic
Manuel Herranz - Pangeanic RIILP
 
Juanjo Arevelillo - Hermes Traducciones
Juanjo Arevelillo - Hermes Traducciones Juanjo Arevelillo - Hermes Traducciones
Juanjo Arevelillo - Hermes Traducciones RIILP
 
Gianluca Giulinin - FAO
Gianluca Giulinin - FAO Gianluca Giulinin - FAO
Gianluca Giulinin - FAO RIILP
 
Lianet Sepulveda & Alexander Raginsky - ER 3a & ER 3b Pangeanic
Lianet Sepulveda & Alexander Raginsky - ER 3a & ER 3b Pangeanic Lianet Sepulveda & Alexander Raginsky - ER 3a & ER 3b Pangeanic
Lianet Sepulveda & Alexander Raginsky - ER 3a & ER 3b Pangeanic RIILP
 
Tony O'Dowd - KantanMT
Tony O'Dowd -  KantanMT Tony O'Dowd -  KantanMT
Tony O'Dowd - KantanMT RIILP
 
Santanu Pal - ESR 2 USAAR
Santanu Pal - ESR 2 USAARSantanu Pal - ESR 2 USAAR
Santanu Pal - ESR 2 USAARRIILP
 
Chris Hokamp - ESR 9 DCU
Chris Hokamp - ESR 9 DCU Chris Hokamp - ESR 9 DCU
Chris Hokamp - ESR 9 DCU RIILP
 
Anna Zaretskaya - ESR 1 UMA
Anna Zaretskaya - ESR 1 UMAAnna Zaretskaya - ESR 1 UMA
Anna Zaretskaya - ESR 1 UMARIILP
 
Carolina Scarton - ESR 7 - USFD
Carolina Scarton - ESR 7 - USFD  Carolina Scarton - ESR 7 - USFD
Carolina Scarton - ESR 7 - USFD RIILP
 
Rohit Gupta - ESR 4 - UoW
Rohit Gupta - ESR 4 - UoW Rohit Gupta - ESR 4 - UoW
Rohit Gupta - ESR 4 - UoW RIILP
 
Hernani Costa - ESR 3 - UMA
Hernani Costa - ESR 3 - UMA Hernani Costa - ESR 3 - UMA
Hernani Costa - ESR 3 - UMA RIILP
 
Liangyou Li - ESR 8 - DCU
Liangyou Li - ESR 8 - DCU Liangyou Li - ESR 8 - DCU
Liangyou Li - ESR 8 - DCU RIILP
 
Liling Tan - ESR 5 USAAR
Liling Tan - ESR 5 USAARLiling Tan - ESR 5 USAAR
Liling Tan - ESR 5 USAARRIILP
 
Sandra de luca - Acclaro
Sandra de luca - AcclaroSandra de luca - Acclaro
Sandra de luca - AcclaroRIILP
 
ER1 Eduard Barbu - EXPERT Summer School - Malaga 2015
ER1 Eduard Barbu - EXPERT Summer School - Malaga 2015ER1 Eduard Barbu - EXPERT Summer School - Malaga 2015
ER1 Eduard Barbu - EXPERT Summer School - Malaga 2015RIILP
 
ESR1 Anna Zaretskaya - EXPERT Summer School - Malaga 2015
ESR1 Anna Zaretskaya - EXPERT Summer School - Malaga 2015ESR1 Anna Zaretskaya - EXPERT Summer School - Malaga 2015
ESR1 Anna Zaretskaya - EXPERT Summer School - Malaga 2015RIILP
 
ESR2 Santanu Pal - EXPERT Summer School - Malaga 2015
ESR2 Santanu Pal - EXPERT Summer School - Malaga 2015ESR2 Santanu Pal - EXPERT Summer School - Malaga 2015
ESR2 Santanu Pal - EXPERT Summer School - Malaga 2015RIILP
 
ESR3 Hernani Costa - EXPERT Summer School - Malaga 2015
ESR3 Hernani Costa - EXPERT Summer School - Malaga 2015ESR3 Hernani Costa - EXPERT Summer School - Malaga 2015
ESR3 Hernani Costa - EXPERT Summer School - Malaga 2015RIILP
 
ESR4 Rohit Gupta - EXPERT Summer School - Malaga 2015
ESR4 Rohit Gupta - EXPERT Summer School - Malaga 2015ESR4 Rohit Gupta - EXPERT Summer School - Malaga 2015
ESR4 Rohit Gupta - EXPERT Summer School - Malaga 2015RIILP
 

Plus de RIILP (20)

Gabriella Gonzalez - eTRAD
Gabriella Gonzalez - eTRAD Gabriella Gonzalez - eTRAD
Gabriella Gonzalez - eTRAD
 
Manuel Herranz - Pangeanic
Manuel Herranz - Pangeanic Manuel Herranz - Pangeanic
Manuel Herranz - Pangeanic
 
Juanjo Arevelillo - Hermes Traducciones
Juanjo Arevelillo - Hermes Traducciones Juanjo Arevelillo - Hermes Traducciones
Juanjo Arevelillo - Hermes Traducciones
 
Gianluca Giulinin - FAO
Gianluca Giulinin - FAO Gianluca Giulinin - FAO
Gianluca Giulinin - FAO
 


Machine translation evaluation metrics provide no correlation with post-editor productivity gains

  • 1. Machine translation evaluation Hermes Traducciones y Servicios Lingüísticos
  • 2. MT at Hermes
      ◦ Pure RBMT engines with pre- and post-processing macros.
      ◦ Texts from technical domains.
      ◦ The applied-technology department has been working on MT engines for over a year.
      ◦ Over 250,000 words post-edited with internal engines in the last year.
      ◦ Average new word count for projects post-edited with internal engines: 9,000 words.
  • 3. Our purpose with MT evals
      Automated metrics might help us:
      ◦ predict PE time and productivity gains;
      ◦ negotiate reasonable discounts;
      ◦ evaluate quality of engines;
      ◦ measure performance of applied-technology department;
      ◦ not depend on human-reported data.
  • 4. What we hoped to find
      ◦ We hoped some metric would correlate with the productivity gain data provided by post-editors.
      ◦ We gathered BLEU, F-Measure, METEOR and TER values.
      ◦ Ideally, we would end up relying on automated metrics rather than on time and productivity measurements reported by post-editors.
  • 5-6. What we hoped to find [Chart; x-axis: Productivity gain %]
  • 7-8. What we actually found: No correlation [Chart of BLEU, F-Measure, TER and METEOR scores; x-axis: Productivity gain %]
  • 9. Reasons for the variability
      ◦ Different CAT environments (Trados Studio, memoQ, Idiom, TagEditor, etc.).
      ◦ Different engines (per domain, per client, etc.).
      ◦ Different clients, different needs.
      ◦ Different post-editors.
      ◦ Or, if same post-editor, different post-editing skills over time.
      ◦ Different word volumes.
      ◦ Specific productivity or consistency-enhancement processing can affect metrics negatively.
  • 10. Productivity-enhancement example
      ◦ Source: Add events as described in Adding Events to a Model.
      ◦ PE: Agregue los eventos como se describe en Adición de eventos a un modelo.
      ◦ Raw 1: Agregue los eventos como se describe en la adición de los eventos a un modelo.
      ◦ Raw 2: Agregue los eventos como se describe en Adding Events to a Model.
      ◦ Scores, BLEU: 68.59 (Raw 1), 53.33 (Raw 2).
      ◦ Scores, TER: 17.65 (Raw 1), 29.41 (Raw 2).
      Metrics for Raw 1 are significantly better, but Raw 2 is faster to post-edit thanks to automatic terminology insertion tools (such as Xbench).
  • 11. Human evaluation
      ◦ Adequacy: How much of the meaning expressed in the gold-standard translation or the source is also expressed in the target translation? Scale: 4. Everything; 3. Most; 2. Little; 1. None.
      ◦ Fluency: To what extent is the target-side translation grammatically well formed, free of spelling errors, and perceived by a native speaker as natural, intuitive language? Scale: 4. Flawless; 3. Good; 2. Disfluent; 1. Incomprehensible.
      Source: TAUS MT evaluation guidelines https://evaluation.taus.net/resources/adequacy-fluency-guidelines
  • 12. Conclusions
      ◦ We combine automated metrics with time/productivity data reported by post-editors for the final evaluation of internal MT performance.
      ◦ Poor post-editing skills or any project-specific contingency can be counterbalanced by good automated metrics.
      ◦ We look for qualitative information in automated metrics, not quantitative.
      ◦ BLEU values of 65 and 70 for two different engines tell us both are good engines, not that one will render 5% better results than the other.
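
The productivity-enhancement example on slide 10 can be illustrated with a rough sketch. The code below counts word-level edits between each raw output and the post-edited sentence. It is a simplification and not the TER implementation used for the slides: it uses plain Levenshtein distance with no block shifts, whitespace tokenization, and case-sensitive matching, and the function name `word_edit_distance` is a hypothetical helper. The absolute values are therefore not expected to match the reported TER scores.

```python
def word_edit_distance(hyp: str, ref: str) -> int:
    """Word-level Levenshtein distance (insert, delete, substitute; no shifts)."""
    h, r = hyp.split(), ref.split()
    # prev[j] holds the distance between the hypothesis words seen so far
    # and the first j reference words.
    prev = list(range(len(r) + 1))
    for i, hyp_word in enumerate(h, start=1):
        curr = [i]
        for j, ref_word in enumerate(r, start=1):
            cost = 0 if hyp_word == ref_word else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


# Sentences taken from slide 10.
PE = "Agregue los eventos como se describe en Adición de eventos a un modelo."
RAW_1 = ("Agregue los eventos como se describe en la adición de los eventos "
         "a un modelo.")
RAW_2 = "Agregue los eventos como se describe en Adding Events to a Model."

edits_raw_1 = word_edit_distance(RAW_1, PE)
edits_raw_2 = word_edit_distance(RAW_2, PE)
print("Raw 1 edits:", edits_raw_1)
print("Raw 2 edits:", edits_raw_2)
```

Under these assumptions Raw 1 needs 3 word edits and Raw 2 needs 5, so an edit-based score ranks Raw 1 higher. The count cannot tell that the 5 edits in Raw 2 all sit in one untranslated title that a terminology tool can replace in a single step, which is the point the slide makes.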