January 15th, 2015
Throughputs: What is Behind Productive Post-Editing?
• What circumstances or variables most reliably facilitate good-quality, highly productive post-editing?
• Do conditions and parameters outside the post-editor’s control facilitate or hamper his or her success?
Welocalize Language Tools Team
• Implementation and management of Machine Translation programs
• Analysis and research
The Database
Data gathered from 2013 to date
Objective:
Establish correlations between 3 evaluation approaches to:
- draw conclusions on predicting productivity gains in advance
- see how & when to use the different metrics best
Contents:
- Content Type
- Language Pair (English into XX)
- MT engine provider & owner (i.e. who owns training & maintenance)
- Metrics (BLEU & PE Distance, Adequacy & Fluency, Productivity deltas)
- MT error analysis
- Final QA scores
- Level of experience of resource doing productivity test
The throughputs and productivity study is carried out as part of a wider study that aims to gain
understanding of and insight into Machine Translation data, with the goal of making educated
business decisions for the future.
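PE Distance appears among the metrics above but is not defined in this deck. A minimal sketch, assuming it is a character-level Levenshtein edit distance between the raw MT output and its post-edited version, normalised by the longer of the two strings and expressed as a percentage. The function names and sample sentences are hypothetical, and the calculation actually used in the study may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def pe_distance(mt_output: str, post_edited: str) -> float:
    """Edit distance as a percentage of the longer string (assumed definition)."""
    longest = max(len(mt_output), len(post_edited))
    if longest == 0:
        return 0.0
    return levenshtein(mt_output, post_edited) / longest * 100.0


# Hypothetical segment pair, not taken from the study
mt = "Click the button for save the file"
pe = "Click the button to save the file"
print(f"PE distance: {pe_distance(mt, pe):.2f}%")
```

Under this reading, a lower percentage means the post-editor changed less of the MT output.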
37 locales in total, with varying amounts of available data
11 different MT systems (SMT / Hybrid)
[Chart legend (content types): Marketing, Patents, Support, Tech. Doc., UA, UI, other]
The Database
Data used
Throughputs
The setup
• The throughput data used in this presentation is a by-product of Welocalize’s productivity tests
• Throughputs per hour
• Translation from scratch: no translation memory was leveraged for the translation part of the test
• 185 samples
• 13 different accounts
• 6 generic categories
• 11 different machine translation engines (statistical and hybrid). All of the engines have been customized
• Linguists: at least three years of experience on the specific content type + previous exposure to post-editing
Translation versus Post-editing
The data
Note: All resources that have taken part in productivity tests are represented
in these two graphics.
These graphs include all languages, content types and MT engines used
during the tests.
Translation versus Post-editing
When we join the data from the previous graphics together we
note that not all the resources improve equally (or at all) when
changing activities from translation to post-editing.
Comparison between Translation
and Post-editing Throughputs
The difference
Productivity Tests
The iOmegaT environment
• Post-Editing versus Human Translation
• Tests performed to validate predictive findings
• Tool: iOmegaT, an instrumented version of the open-source CAT tool OmegaT, developed in collaboration with John Moran (CNGL)
• iOmegaT tracks time spent editing segments, editing behaviour & activity
• Closely mimics translators’ usual work environment: integrated glossary, concordance, etc., and compatible with 3rd party tools for language quality checks.
• Translators can visit a segment several times if they change their mind later during translation, need to implement global changes, etc.
• Test sets consist of a mix of machine-translated segments to post-edit and no matches that need to be translated from scratch
• Usual scope is 8h of translation / post-editing
• Provides productivity delta between post-edited and translated words
Note: high throughputs need to be interpreted within the context of this test environment
Evaluation Data
A sample
Group                 Metric                  pt-BR     de-DE
                      MT Engine               MS Hub    MS Hub
Productivity Results  Productivity Delta (%)  73.8%     22.9%
Human Evaluation      Adequacy Score          3.65      3.88
Human Evaluation      Fluency Score           3.42      3.48
LQA                   LQA                     99.04%    99.75%
Automatic Scores      BLEU                    65.74     40.76
Automatic Scores      NIST                    9.30      6.69
Automatic Scores      TER                     21.14     46.30
Automatic Scores      Meteor                  73.95     55.45
Automatic Scores      Precision               81.04     70.03
Automatic Scores      Recall                  80.19     68.13
Automatic Scores      GTM                     69.07     48.96
Automatic Scores      PE Distance             26.00%    34.23%
Data from a sample evaluation: example of evaluation criteria
• The productivity delta is the percentage increase in throughput when post-editing, relative to the average human translation (HT) throughput
• Good correlation between productivity results and automatic scores
• Despite a difference of roughly 20 points in BLEU/METEOR/GTM between the two engines, there are productivity gains in both
• The results reflect the differences between language groups well
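The productivity delta described above can be sketched as a one-line calculation. This is an illustration only: the function name and the throughput figures are hypothetical, and the study's own tooling (iOmegaT) may aggregate the throughputs differently.

```python
def productivity_delta(ht_words_per_hour: float, pe_words_per_hour: float) -> float:
    """Percentage increase of post-editing throughput over the HT throughput."""
    if ht_words_per_hour <= 0:
        raise ValueError("HT throughput must be positive")
    return (pe_words_per_hour - ht_words_per_hour) / ht_words_per_hour * 100.0


# Hypothetical throughputs (words per hour), not taken from the study
print(f"{productivity_delta(400, 500):.1f}%")  # 25.0%
```

A negative result means the resource was slower when post-editing than when translating from scratch.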
Throughputs
The trend
Trend 1: higher translation throughputs generally correlate with a lower productivity
delta, as the corresponding post-editing throughputs might not be significantly higher
• Previous post-editing studies have also highlighted this phenomenon (Guerberof, Plitt & Masselot)
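One way to check a trend of this kind is to correlate each resource's translation throughput with their productivity delta. The sketch below uses a plain Pearson coefficient. The throughput figures are invented for illustration and are not the study's data; the slides do not say which correlation measure was used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Hypothetical per-resource throughputs in words per hour
ht = [300, 400, 500, 600, 700]   # translation from scratch
pe = [450, 540, 600, 660, 720]   # post-editing
deltas = [(p - h) / h * 100.0 for h, p in zip(ht, pe)]

r = pearson(ht, deltas)
print(f"Correlation between HT throughput and productivity delta: {r:.2f}")
```

In this made-up sample the coefficient is negative: the faster translators gain proportionally less from post-editing, which is the pattern the trend describes.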
Average productivity delta
23.14%
Who benefits from Post-editing?
Analysis by Language and Content type
Languages selected for this analysis:
Brazilian Portuguese
French
German
Italian
Japanese
Latin-American Spanish
Polish
Russian
Simplified Chinese
Spanish
Content Types: Marketing, Patents, Support, Technical Documentation, UI
Language complexity grouping
for MT PE
MT PE Reference
table
Who benefits from Post-editing?
Romance Languages
[Chart: productivity delta for Romance languages. Locales: ES_LA, IT, ES, FR, PT_BR. Data labels: 38%, 32%, 29%, 26%, 23%]
• Romance languages are the group that usually renders the highest productivity gains.
• Within Romance languages, Latin American Spanish and Brazilian Portuguese are often the ones with the highest productivity gains from the point of view of PE.
Who benefits from Post-editing?
German and Slavic Languages
• German and Slavic are considered medium complexity languages
• Availability of training resources and post-editors makes these languages a good fit for MT PE
[Chart: productivity delta. Locales: RU, PL. Data labels: 17%, 15%]
Who benefits from Post-editing?
Asian Languages
• Asian languages are considered complex from the point of view of MT.
• Productivity gains depend on the translator’s method of working and their expertise in PE. Simplified Chinese can render high productivity gains, as shown in the graph.
[Chart: productivity delta. Data label: 14% (JP)]
Average Productivity delta - ZH CN
6%
Content types
Marketing
Average Productivity delta - ZH CN
6%
• Marketing remains a challenging content type for post-editing due to high quality expectations and free style. However, productivity gains can still be realised with well-trained MT systems and content that is not transcreation.
Content types
Technical Documentation
• Technical Documentation is a good content type for MT PE.
• Characteristics: constrained, often structured language; human-quality translation expectations but without added style and voice requirements.
Content types
Support
• Support: Knowledge-base content, technical blogs, procedural articles, Q&A, etc.
• More relaxed quality expectations make this type of content very suitable for Machine Translation.
• In some instances this content is suitable for raw MT publishing when a customized engine is used.
Content types
Other content types
[Chart: average productivity delta. Patents: 15%; UI: 18%]
User Generated Content
• Highly productive due to low number of touch points during post-editing
• Examples: travel and consumer reviews, blogs
• Quality expectations are very relaxed
• Only accuracy with original meaning is requested
• No terminology checks or cosmetic changes are necessary
• Very high expected throughputs: from 500 to 1,000 words per hour
• Also suitable for raw MT publishing when a customized engine is used
Quality
Misconceptions
The idea that high throughputs affect MT quality is inaccurate.
Sometimes linguistic issues appear more frequently in translated segments
and in fuzzy-matches than in post-edited segments.
Examples of good quality and high throughputs

Language   MT (words/hr)   LQA Percentage
ja_JP      441             99.89%
es_LA      492             99.60%
pl_PL      644             99.91%
sk_SK      769             99.50%
hu_HU      847             99.73%
Post-editing
Other factors
Years of experience
In a recent survey…
• Most respondents have more experience with translation than with post-editing
• The overall correlation between translation experience and post-editing experience is “strong”
However, looking at correlations by locale:
German: very strong
French: weak
Japanese: weak
PTBR: strong
Hungarian: weak
• This suggests that, for German and Brazilian Portuguese only, overall experience as a professional translator (whether junior or senior) gives us insight into how much post-editing experience to expect. For the other 3 locales, profiles are more varied.
Post-editing
Other factors
- Experience working on certain content type: most linguists used
for productivity tests are very experienced in translating /
post-editing the tested content type
- No clear trend with regard to background, assuming translation
background like freelance/staff translator, content type
experience, etc.
- No clear trend in relation to working environment (office / at
home, etc.)
Text input methods:
• French and German translators seem to make more use of CAT tool shortcuts
• Japanese requires the use of Input Method Editors and less use of shortcuts
Final conclusions
• Based on our findings, Romance languages are the best performers
on MT PE
• All content types are suitable for MT PE, with the exception of
Transcreation; Technical Documentation and Technical Support are
two of the most suitable (apart from UGC).
• Not all translators improve at the same pace when moving to
post-editing
• Productivity increases most in individuals with average translation
throughputs
• Knowledge of the subject matter helps achieve high throughputs
• It is more difficult to foresee post-editing effort than to assess the
quality of raw MT. The human effort is still the most variable aspect.
• There is no quality degradation in MT PE
Questions and answers
Any questions?
Laura Casanellas, WL Language Tools
laura.casanellas@welocalize.com

Contenu connexe

Tendances

Amta 2012-federico (1)
Amta 2012-federico (1)Amta 2012-federico (1)
Amta 2012-federico (1)FabiolaPanetti
 
Evaluation of MT Quality/Productivity at eBay - AMTA 2018
Evaluation of MT Quality/Productivity at eBay - AMTA 2018Evaluation of MT Quality/Productivity at eBay - AMTA 2018
Evaluation of MT Quality/Productivity at eBay - AMTA 2018Jose Luis Bonilla Sánchez
 
Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)Arle Lommel
 
2.2. language evaluation criteria
2.2. language evaluation criteria2.2. language evaluation criteria
2.2. language evaluation criteriaannahallare_
 
Translation assessment
Translation assessmentTranslation assessment
Translation assessmentapril aulia
 
Principles of programming
Principles of programmingPrinciples of programming
Principles of programmingRob Paok
 
ML Framework for auto-responding to customer support queries
ML Framework for auto-responding to customer support queriesML Framework for auto-responding to customer support queries
ML Framework for auto-responding to customer support queriesVarun Nathan
 
introduction to programming
introduction to programmingintroduction to programming
introduction to programmingGaea Bonita
 
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for TranslationRIILP
 
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L MargMT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L MargWelocalize
 
Coding principles
Coding principles Coding principles
Coding principles DevAdnani
 
Introduction to programming principles languages
Introduction to programming principles languagesIntroduction to programming principles languages
Introduction to programming principles languagesFrankie Jones
 
New Breakthroughs in Machine Transation Technology
New Breakthroughs in Machine Transation TechnologyNew Breakthroughs in Machine Transation Technology
New Breakthroughs in Machine Transation Technologykantanmt
 
Cmp2412 programming principles
Cmp2412 programming principlesCmp2412 programming principles
Cmp2412 programming principlesNIKANOR THOMAS
 
Principles of programming languages. Detail notes
Principles of programming languages. Detail notesPrinciples of programming languages. Detail notes
Principles of programming languages. Detail notesVIKAS SINGH BHADOURIA
 
Ch1 language design issue
Ch1 language design issueCh1 language design issue
Ch1 language design issueJigisha Pandya
 

Tendances (20)

Amta 2012-federico (1)
Amta 2012-federico (1)Amta 2012-federico (1)
Amta 2012-federico (1)
 
Evaluation of MT Quality/Productivity at eBay - AMTA 2018
Evaluation of MT Quality/Productivity at eBay - AMTA 2018Evaluation of MT Quality/Productivity at eBay - AMTA 2018
Evaluation of MT Quality/Productivity at eBay - AMTA 2018
 
Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)
 
2.2. language evaluation criteria
2.2. language evaluation criteria2.2. language evaluation criteria
2.2. language evaluation criteria
 
Translation assessment
Translation assessmentTranslation assessment
Translation assessment
 
Principles of programming
Principles of programmingPrinciples of programming
Principles of programming
 
ML Framework for auto-responding to customer support queries
ML Framework for auto-responding to customer support queriesML Framework for auto-responding to customer support queries
ML Framework for auto-responding to customer support queries
 
introduction to programming
introduction to programmingintroduction to programming
introduction to programming
 
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
 
5. bleu
5. bleu5. bleu
5. bleu
 
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L MargMT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
 
C aptitude book
C aptitude bookC aptitude book
C aptitude book
 
Programming and problem solving with c++, 3rd edition
Programming and problem solving with c++, 3rd editionProgramming and problem solving with c++, 3rd edition
Programming and problem solving with c++, 3rd edition
 
Coding principles
Coding principles Coding principles
Coding principles
 
Introduction to programming principles languages
Introduction to programming principles languagesIntroduction to programming principles languages
Introduction to programming principles languages
 
Cs111 ch01 v4
Cs111 ch01 v4Cs111 ch01 v4
Cs111 ch01 v4
 
New Breakthroughs in Machine Transation Technology
New Breakthroughs in Machine Transation TechnologyNew Breakthroughs in Machine Transation Technology
New Breakthroughs in Machine Transation Technology
 
Cmp2412 programming principles
Cmp2412 programming principlesCmp2412 programming principles
Cmp2412 programming principles
 
Principles of programming languages. Detail notes
Principles of programming languages. Detail notesPrinciples of programming languages. Detail notes
Principles of programming languages. Detail notes
 
Ch1 language design issue
Ch1 language design issueCh1 language design issue
Ch1 language design issue
 

En vedette

EAMT Presentation by Welocalize Olga Beregovaya May 2015
EAMT Presentation by Welocalize Olga Beregovaya May 2015EAMT Presentation by Welocalize Olga Beregovaya May 2015
EAMT Presentation by Welocalize Olga Beregovaya May 2015Welocalize
 
MT and Post-Editing User-Generated Content AMTA 2014
MT and Post-Editing User-Generated Content AMTA 2014MT and Post-Editing User-Generated Content AMTA 2014
MT and Post-Editing User-Generated Content AMTA 2014Welocalize
 
Tools-Driven Content Curation & Engine Training ATMA 2014
Tools-Driven Content Curation & Engine Training ATMA 2014Tools-Driven Content Curation & Engine Training ATMA 2014
Tools-Driven Content Curation & Engine Training ATMA 2014Welocalize
 
Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA...
 Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA... Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA...
Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA...Welocalize
 
Better translations through automated source and post edit analysis
Better translations through automated source and post edit analysisBetter translations through automated source and post edit analysis
Better translations through automated source and post edit analysisWelocalize
 
MT Quality Evaluations: From Test Environment to Production
MT Quality Evaluations: From Test Environment to ProductionMT Quality Evaluations: From Test Environment to Production
MT Quality Evaluations: From Test Environment to ProductionWelocalize
 

En vedette (6)

EAMT Presentation by Welocalize Olga Beregovaya May 2015
EAMT Presentation by Welocalize Olga Beregovaya May 2015EAMT Presentation by Welocalize Olga Beregovaya May 2015
EAMT Presentation by Welocalize Olga Beregovaya May 2015
 
MT and Post-Editing User-Generated Content AMTA 2014
MT and Post-Editing User-Generated Content AMTA 2014MT and Post-Editing User-Generated Content AMTA 2014
MT and Post-Editing User-Generated Content AMTA 2014
 
Tools-Driven Content Curation & Engine Training ATMA 2014
Tools-Driven Content Curation & Engine Training ATMA 2014Tools-Driven Content Curation & Engine Training ATMA 2014
Tools-Driven Content Curation & Engine Training ATMA 2014
 
Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA...
 Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA... Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA...
Enterprise MT Content Drift: Challenges, Impacts and Advanced Solutions AMTA...
 
Better translations through automated source and post edit analysis
Better translations through automated source and post edit analysisBetter translations through automated source and post edit analysis
Better translations through automated source and post edit analysis
 
MT Quality Evaluations: From Test Environment to Production
MT Quality Evaluations: From Test Environment to ProductionMT Quality Evaluations: From Test Environment to Production
MT Quality Evaluations: From Test Environment to Production
 

Similaire à Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas

Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...SDL
 
What machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyWhat machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyIconic Translation Machines
 
Improving Translator Productivity with MT: A Patent Translation Case Study
Improving Translator Productivity with MT: A Patent Translation Case StudyImproving Translator Productivity with MT: A Patent Translation Case Study
Improving Translator Productivity with MT: A Patent Translation Case StudyIconic Translation Machines
 
Machine Translation Master Class at the EUATC Conference by Diego Bartolome
Machine Translation Master Class at the EUATC Conference by Diego BartolomeMachine Translation Master Class at the EUATC Conference by Diego Bartolome
Machine Translation Master Class at the EUATC Conference by Diego Bartolometauyou
 
Lexcelera MT Breaking Compromises
Lexcelera MT Breaking CompromisesLexcelera MT Breaking Compromises
Lexcelera MT Breaking CompromisesLoriThicke
 
CAT or TMS Implementation: Calculation of the Number of Licenses and the Tota...
CAT or TMS Implementation: Calculation of the Number of Licenses and the Tota...CAT or TMS Implementation: Calculation of the Number of Licenses and the Tota...
CAT or TMS Implementation: Calculation of the Number of Licenses and the Tota...ABBYY Language Serivces
 
Webinar automotive and engineering content 16.06.16
Webinar   automotive and engineering content 16.06.16Webinar   automotive and engineering content 16.06.16
Webinar automotive and engineering content 16.06.16kantanmt
 
Work in progress: ChatGPT as an Assistant in Paper Writing
Work in progress: ChatGPT as an Assistant in Paper WritingWork in progress: ChatGPT as an Assistant in Paper Writing
Work in progress: ChatGPT as an Assistant in Paper WritingManuel Castro
 
Managing Translation Memories for Engineering and Automotive Translation
Managing Translation Memories for Engineering and Automotive TranslationManaging Translation Memories for Engineering and Automotive Translation
Managing Translation Memories for Engineering and Automotive TranslationPoulomi Choudhury
 
Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)
Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)
Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)TAUS - The Language Data Network
 
Workshop on the tauyou machine translation platform
Workshop on the tauyou machine translation platformWorkshop on the tauyou machine translation platform
Workshop on the tauyou machine translation platformtauyou
 
TAUS Roundtable Moscow, CAT or TMS Implementation-Calculation of the Number o...
TAUS Roundtable Moscow, CAT or TMS Implementation-Calculation of the Number o...TAUS Roundtable Moscow, CAT or TMS Implementation-Calculation of the Number o...
TAUS Roundtable Moscow, CAT or TMS Implementation-Calculation of the Number o...TAUS - The Language Data Network
 
Introducing language technology in the editing process: How to do things righ...
Introducing language technology in the editing process: How to do things righ...Introducing language technology in the editing process: How to do things righ...
Introducing language technology in the editing process: How to do things righ...Loctimize GmbH
 
2013 ALC Boston: Your Trained Moses SMT System doesn't work. What can you do?
2013 ALC Boston: Your Trained Moses SMT System doesn't work. What can you do?2013 ALC Boston: Your Trained Moses SMT System doesn't work. What can you do?
2013 ALC Boston: Your Trained Moses SMT System doesn't work. What can you do?tauyou
 
Good Applications of Bad Machine Translation
Good Applications of Bad Machine TranslationGood Applications of Bad Machine Translation
Good Applications of Bad Machine Translationbdonaldson
 
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...Welocalize
 
Presentation at CEF-EU-Luxembourg
Presentation at CEF-EU-LuxembourgPresentation at CEF-EU-Luxembourg
Presentation at CEF-EU-LuxembourgManuel Herranz
 
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWS
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWSSeeing the Wood for the Trees in MT Evaluation: an LSP success story from RWS
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWSIconic Translation Machines
 
Language Quality Management: Models, Measures, Methodologies
Language Quality Management: Models, Measures, Methodologies Language Quality Management: Models, Measures, Methodologies
Language Quality Management: Models, Measures, Methodologies Sajan
 

Similaire à Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas (20)

Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...
 
What machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyWhat machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happy
 
Improving Translator Productivity with MT: A Patent Translation Case Study
Improving Translator Productivity with MT: A Patent Translation Case StudyImproving Translator Productivity with MT: A Patent Translation Case Study
Improving Translator Productivity with MT: A Patent Translation Case Study
 
Machine Translation Master Class at the EUATC Conference by Diego Bartolome
Machine Translation Master Class at the EUATC Conference by Diego BartolomeMachine Translation Master Class at the EUATC Conference by Diego Bartolome
Machine Translation Master Class at the EUATC Conference by Diego Bartolome
 
Lexcelera MT Breaking Compromises
Lexcelera MT Breaking CompromisesLexcelera MT Breaking Compromises
Lexcelera MT Breaking Compromises
 
CAT or TMS Implementation: Calculation of the Number of Licenses and the Tota...
CAT or TMS Implementation: Calculation of the Number of Licenses and the Tota...CAT or TMS Implementation: Calculation of the Number of Licenses and the Tota...
CAT or TMS Implementation: Calculation of the Number of Licenses and the Tota...
 
Webinar automotive and engineering content 16.06.16
Webinar   automotive and engineering content 16.06.16Webinar   automotive and engineering content 16.06.16
Webinar automotive and engineering content 16.06.16
 
Work in progress: ChatGPT as an Assistant in Paper Writing
Work in progress: ChatGPT as an Assistant in Paper WritingWork in progress: ChatGPT as an Assistant in Paper Writing
Work in progress: ChatGPT as an Assistant in Paper Writing
 
Managing Translation Memories for Engineering and Automotive Translation
Managing Translation Memories for Engineering and Automotive TranslationManaging Translation Memories for Engineering and Automotive Translation
Managing Translation Memories for Engineering and Automotive Translation
 
Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)
Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)
Topic 4: The Magician's Hat: Turning Data into Business Intelligence (3)
 
Workshop on the tauyou machine translation platform
Workshop on the tauyou machine translation platformWorkshop on the tauyou machine translation platform
Workshop on the tauyou machine translation platform
 
TAUS Evaluating Post-Editor Performance Guidelines
TAUS Evaluating Post-Editor Performance GuidelinesTAUS Evaluating Post-Editor Performance Guidelines
TAUS Evaluating Post-Editor Performance Guidelines
 
TAUS Roundtable Moscow, CAT or TMS Implementation-Calculation of the Number o...
TAUS Roundtable Moscow, CAT or TMS Implementation-Calculation of the Number o...TAUS Roundtable Moscow, CAT or TMS Implementation-Calculation of the Number o...
TAUS Roundtable Moscow, CAT or TMS Implementation-Calculation of the Number o...
 
Introducing language technology in the editing process: How to do things righ...
Introducing language technology in the editing process: How to do things righ...Introducing language technology in the editing process: How to do things righ...
Introducing language technology in the editing process: How to do things righ...
 
2013 ALC Boston: Your Trained Moses SMT System doesn't work. What can you do?
2013 ALC Boston: Your Trained Moses SMT System doesn't work. What can you do?2013 ALC Boston: Your Trained Moses SMT System doesn't work. What can you do?
2013 ALC Boston: Your Trained Moses SMT System doesn't work. What can you do?
 
Good Applications of Bad Machine Translation
Good Applications of Bad Machine TranslationGood Applications of Bad Machine Translation
Good Applications of Bad Machine Translation
 
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
 
Presentation at CEF-EU-Luxembourg
Presentation at CEF-EU-LuxembourgPresentation at CEF-EU-Luxembourg
Presentation at CEF-EU-Luxembourg
 
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWS
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWSSeeing the Wood for the Trees in MT Evaluation: an LSP success story from RWS
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWS
 
Language Quality Management: Models, Measures, Methodologies
Language Quality Management: Models, Measures, Methodologies Language Quality Management: Models, Measures, Methodologies
Language Quality Management: Models, Measures, Methodologies
 

Plus de Welocalize

Automating the Localization Workflow. What Works?
Automating the Localization Workflow. What Works?Automating the Localization Workflow. What Works?
Automating the Localization Workflow. What Works?Welocalize
 
How Much Cake to Eat: The Case for Targeted MT Engines
How Much Cake to Eat: The Case for Targeted MT EnginesHow Much Cake to Eat: The Case for Targeted MT Engines
Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas

  • 1. January 15th, 2015 Throughputs: What is Behind Productive Post-Editing?
  • 2. What circumstances or variables most reliably facilitate good-quality, highly productive post-editing? Do conditions and parameters outside the post-editor’s control facilitate or hamper his or her success?
  • 3. Welocalize Language Tools Team
      - Implementation and management of Machine Translation programs
      - Analysis and research
  • 4. The Database: Data gathered from 2013 to date
      Objective: Establish correlations between 3 evaluation approaches to:
      - draw conclusions on predicting productivity gains in advance
      - see how and when best to use the different metrics
      Contents:
      - Content Type
      - Language Pair (English into XX)
      - MT engine provider & owner (i.e. who owns training & maintenance)
      - Metrics (BLEU & PE Distance, Adequacy & Fluency, Productivity deltas)
      - MT error analysis
      - Final QA scores
      - Level of experience of the resource doing the productivity test
      The throughputs and productivity study is carried out as part of a wider study that aims to gain understanding of and insight into Machine Translation data, with the goal of making educated business decisions for the future.
  • 5. The Database: Data used
      - 37 locales in total, with varying amounts of available data
      - 11 different MT systems (SMT / Hybrid)
      - Content types: Marketing, Patents, Support, Tech. Doc., UA, UI, other
  • 6. Throughputs: The setup
      - The throughput data used in this presentation is a by-product of Welocalize’s productivity tests
      - Throughputs per hour
      - Translation from scratch: no translation memory was leveraged for the translation part of the test
      - 185 samples
      - 13 different accounts
      - 6 generic categories
      - 11 different machine translation engines (statistical and hybrid); all of the engines have been customized
      - Linguists: at least three years of experience on the specific content type, plus previous exposure to post-editing
  • 7. Translation versus Post-editing The data Note: All resources that have taken part in productivity tests are represented in these two graphics. These graphs include all languages, content types and MT engines used during the tests.
  • 8. Translation versus Post-editing When we join the data from the previous graphics together we note that not all the resources improve equally (or at all) when changing activities from translation to post-editing. Comparison between Translation and Post-editing Throughputs The difference
  • 9. Productivity Tests: The iOmegaT environment
      - Post-Editing versus Human Translation
      - Tests performed to validate predictive findings
      - Tool: iOmegaT, an instrumented version of the open source CAT tool OmegaT, developed in collaboration with John Moran (CNGL)
      - iOmegaT tracks time spent editing segments, editing behaviour and activity
      - Closely mimics translators’ usual work environment: integrated glossary, concordance, etc., and compatible with 3rd party tools for language quality checks
      - Translators can visit a segment several times if they change their mind later during translation, need to implement global changes, etc.
      - Test sets consist of a mix of MTed segments to post-edit and no-match segments that need to be translated from scratch
      - Usual scope is 8h of translation / post-editing
      - Provides the productivity delta between post-edited and translated words
      Note: high throughputs need to be interpreted within the context of this test environment
  • 10. Evaluation Data: A sample
      Column groups, left to right: Productivity Results, Human Evaluation, LQA, Automatic Scores
      | MT Engine | Locale | Productivity Delta (%) | Adequacy Score | Fluency Score | LQA | BLEU | NIST | TER | Meteor | Precision | Recall | GTM | PE Distance |
      |---|---|---|---|---|---|---|---|---|---|---|---|---|---|
      | MS Hub | pt-BR | 73.8% | 3.65 | 3.42 | 99.04% | 65.74 | 9.30 | 21.14 | 73.95 | 81.04 | 80.19 | 69.07 | 26.00% |
      | MS Hub | de-DE | 22.9% | 3.88 | 3.48 | 99.75% | 40.76 | 6.69 | 46.30 | 55.45 | 70.03 | 68.13 | 48.96 | 34.23% |
      Data from a sample evaluation (example of evaluation criteria)
      - The productivity delta represents the percentage increase from the average HT throughput when post-editing
      - Good correlation between productivity results and automatic scores
      - In spite of the roughly 20-point BLEU/METEOR/GTM difference between the engines, there are productivity gains in both
      - The results reflect the differences between language groups well
  • 11. Throughputs: The trend
      - Trend 1: higher translation throughputs generally correlate with a lower productivity delta, as the corresponding post-editing throughputs might not be significantly higher
      - Previous post-editing studies have also highlighted this phenomenon (Guerberof; Plitt & Masselot)
      - Average productivity delta: 23.14%
  • 12. Who benefits from Post-editing? Analysis by Language and Content type
      - Languages selected for this analysis: Brazilian Portuguese, French, German, Italian, Japanese, Latin-American Spanish, Polish, Russian, Simplified Chinese, Spanish
      - Content Types: Marketing, Patents, Support, Technical Documentation, UI
  • 13. Language complexity grouping for MT PE MT PE Reference table
  • 14. Who benefits from Post-editing? Romance Languages
      [Chart: average productivity delta for ES_LA, IT, ES, FR, PT_BR; values shown: 38%, 32%, 29%, 26%, 23%]
      - Romance languages are the group that usually renders the highest productivity gains.
      - Within Romance languages, Latin American Spanish and Brazilian Portuguese are often the ones with the highest productivity gains from the point of view of PE.
  • 15. Who benefits from Post-editing? German and Slavic Languages
      - German and Slavic languages are considered medium-complexity languages
      - The availability of training resources and post-editors makes these languages a good fit for MT PE
      [Chart: average productivity delta for RU and PL; values shown: 17%, 15%]
  • 16. Who benefits from Post-editing? Asian Languages
      - Asian languages are considered complex from the point of view of MT.
      - Productivity gains depend on the translator’s method of working and their expertise in PE. Simplified Chinese can render high productivity gains, as shown in the graph.
      [Chart: average productivity delta for JP and ZH CN; values shown: 14%, 6%]
  • 17. Content types: Marketing
      - Marketing remains a challenging content type for post-editing due to high quality expectations and free style. However, productivity gains can still be realised with well-trained MT systems and content that is not transcreation.
  • 18. Content types: Technical Documentation
      - Technical Documentation is a good content type for MT PE.
      - Characteristics: constrained, often structured language; human-quality translation expectations but without added style and voice requirements.
  • 19. Content types: Support
      - Support: knowledge-base content, technical blogs, procedural articles, Q&A, etc.
      - More relaxed quality expectations make this type of content very suitable for Machine Translation.
      - In some instances this content is suitable for raw MT publishing when a customized engine is used.
  • 20. Content types: Other content types
      [Chart: average productivity delta for Patents and UI; values shown: 15%, 18%]
      User Generated Content
      - Highly productive due to the low number of touch points during post-editing
      - Examples: travel and consumer reviews, blogs
      - Quality expectations are very relaxed
      - Only accuracy with respect to the original meaning is required
      - No terminology checks or cosmetic changes are necessary
      - Very high expected throughputs: from 500 to 1,000 words per hour
      - Also suitable for raw MT publishing when a customized engine is used
  • 21. Quality: Misconceptions
      The idea that high throughputs affect MT quality is inaccurate. Sometimes linguistic issues appear more frequently in translated segments and in fuzzy matches than in post-edited segments.
      Examples of good quality and high throughputs:
      | Language | MT (words/hr) | LQA Percentage |
      |---|---|---|
      | ja_JP | 441 | 99.89% |
      | es_LA | 492 | 99.60% |
      | pl_PL | 644 | 99.91% |
      | sk_SK | 769 | 99.50% |
      | hu_HU | 847 | 99.73% |
  • 22. Post-editing: Other factors: Years of experience
      In a recent survey:
      - Most respondents have more experience with translation than with post-editing
      - The overall correlation between translation experience and post-editing experience is “strong”
      However, looking at correlations by locale:
      - German: very strong
      - French: weak
      - Japanese: weak
      - PT-BR: strong
      - Hungarian: weak
      This suggests that for German and Brazilian Portuguese only, overall experience as a professional translator (whether junior or senior) gives us insight into how much post-editing experience to expect. For the other 3 locales, profiles are more varied.
  • 23. Post-editing: Other factors
      - Experience working on a certain content type: most linguists used for productivity tests are very experienced in translating / post-editing the tested content type
      - No clear trend with regard to background, assuming a translation background such as freelance/staff translator, content type experience, etc.
      - No clear trend in relation to working environment (office / at home, etc.)
      Text input methods:
      - French and German translators seem to make more use of CAT tool shortcuts
      - Japanese requires the use of Input Method Editors and less use of shortcuts
  • 24. Final conclusions
      - Based on our findings, Romance languages are the best performers on MT PE
      - All content types are suitable for MT PE, with the exception of Transcreation; Technical Documentation and Technical Support are two of the most suitable (apart from UGC)
      - Not all translators improve at the same pace when moving to post-editing
      - Productivity increases most in individuals with average translation throughputs
      - Knowledge of the subject matter helps achieve high throughputs
      - It is more difficult to foresee post-editing effort than to assess the quality of raw MT. The human effort is still the most variable aspect.
      - There is no quality degradation in MT PE
  • 25. Questions and answers Any questions? Laura Casanellas, WL Language Tools laura.casanellas@welocalize.com
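
Slide 10 defines the productivity delta as the percentage increase over the average human translation (HT) throughput when post-editing. The deck does not give the exact formula or data layout used in the tests, so the following is only a minimal sketch under stated assumptions: the function names, the word counts and the timings are all hypothetical, and the delta is assumed to be a simple relative increase in words per hour.

```python
def throughput_per_hour(words: int, seconds: float) -> float:
    """Words processed per hour, given a word count and the time spent on it."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return words * 3600.0 / seconds


def productivity_delta(ht_throughput: float, pe_throughput: float) -> float:
    """Percentage increase of post-editing (PE) throughput over HT throughput.

    Assumed formula: (PE - HT) / HT * 100.
    """
    if ht_throughput <= 0:
        raise ValueError("ht_throughput must be positive")
    return (pe_throughput - ht_throughput) / ht_throughput * 100.0


# Hypothetical resources, each with 4 hours (14,400 s) of HT and 4 hours of PE.
samples = [
    {"resource": "A", "ht_words": 2000, "pe_words": 3000, "seconds": 14400},
    {"resource": "B", "ht_words": 3200, "pe_words": 3400, "seconds": 14400},
]

results = {}
for s in samples:
    ht = throughput_per_hour(s["ht_words"], s["seconds"])
    pe = throughput_per_hour(s["pe_words"], s["seconds"])
    results[s["resource"]] = (ht, pe, productivity_delta(ht, pe))

for name, (ht, pe, delta) in results.items():
    print(f"{name}: HT {ht:.0f} w/h, PE {pe:.0f} w/h, delta {delta:.2f}%")
```

With these made-up numbers, the resource with the higher translation throughput (B) shows the smaller delta, which is the pattern slide 11 describes as Trend 1; real test data would of course be needed to confirm it.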