The slide deck of the presentation given on June 16 at Localization World 34 in Barcelona.
To run an MT platform and MT projects successfully, a very specific skill set is needed. The right combination of MT and post-editing (PE) can help reduce turnaround times, even in low-tech contexts, while maximizing cost-effectiveness.
This presentation introduces strategies for an effective MT + PE solution for translation buyers and vendors.
Read about the dos and don’ts of combining MT and PE to improve productivity and increase the speed and ease of translation; the best setup for an operating environment, based on the right project requirements and specifically devised practices; and the primary challenges posed by MT and PE, such as preparing data, assessing output quality, estimating the post-editing effort, and vetting, selecting, instructing, and compensating human resources.
9. Cobbler, stick to thy last
Zapatero, a tus zapatos (Spanish)
Schuster, bleib bei deinem Leisten! (German)
Schoenmaker, blijf bij je leest (Dutch)
Ne supra crepidam sutor iudicaret (Latin)
20. Trust the consultant
Reconcile goals and expectations
Outline an exploratory program
Benchmark performance
Prepare the specs
Draft a SOW
Write the RFP
Vet vendors
Prepare your data
Revise business and pricing models
Retrofit processes
27. Givens
Not all engines are created equal
Raw output can vary across systems and language pairs
Errors may not follow a consistent pattern
Engine performance also varies
31. Key questions
Buyer or vendor?
Dos & Don’ts?
How do I deal with data?
How do I assess quality?
How do I hire staff?
What about post-editing?
32. Dos
Know your data
Consider training necessary
Leverage quality evaluation metrics
Define AQLs
Plan for continuous improvement
Arrange for post-editing
Devise a compensation scheme
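The "Define AQLs" item above can be made concrete with a small acceptance check: sample a delivery, count errors, and compare the error rate per 1,000 words against an acceptable quality limit. A minimal sketch; the threshold value is an assumed figure for illustration, not one given in the presentation:

```python
# AQL-style acceptance check (sketch). The default limit of 3.0
# errors per 1,000 words is an invented example value.

def errors_per_1000_words(error_count, sampled_words):
    """Error rate normalized to a 1,000-word sample."""
    return error_count / sampled_words * 1000

def passes_aql(error_count, sampled_words, aql=3.0):
    """True when the error rate is within the acceptable quality limit."""
    return errors_per_1000_words(error_count, sampled_words) <= aql

print(passes_aql(error_count=2, sampled_words=1000))  # True  (2.0 <= 3.0)
print(passes_aql(error_count=9, sampled_words=2000))  # False (4.5 > 3.0)
```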
33. Don’ts
Treat all content equally
DIY
Rely on vendors only
Mess with data
Trust one single metric
Rush
Mess with staff
Expect miracles
36. Quantity and quality
1,000,000 words/50,000 segments
• No contiguous/inclusive domains
More data does not necessarily mean higher quality
• Good data
37. Good data
Few reliable sources
Single domain
Current data
Same encoding
No empty segments
No errors
Terminologically consistent segments
Same style
Same-length segments
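The checklist above lends itself to an automated cleaning pass over training data. A minimal sketch, assuming a list of (source, target) segment pairs; the length-ratio threshold is an illustrative choice, not a figure from the presentation:

```python
# Parallel-corpus cleaning sketch following the "good data" checklist:
# no empty segments, no exact duplicates, no wildly mismatched lengths.

def clean_pairs(pairs):
    """Filter (source, target) segment pairs for engine training."""
    seen = set()
    clean = []
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        if not src or not tgt:                 # no empty segments
            continue
        ratio = len(src) / len(tgt)
        if ratio < 0.5 or ratio > 2.0:         # drop length outliers (assumed bounds)
            continue
        if (src, tgt) in seen:                 # drop exact duplicates
            continue
        seen.add((src, tgt))
        clean.append((src, tgt))
    return clean

pairs = [
    ("Hello world", "Hola mundo"),
    ("", "Vacío"),                             # empty source: dropped
    ("Hello world", "Hola mundo"),             # duplicate: dropped
    ("Hi", "Una frase muchísimo más larga"),   # length mismatch: dropped
]
print(clean_pairs(pairs))  # -> [('Hello world', 'Hola mundo')]
```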
41. Post-editing: measures
Edit Time
• The time required to get a raw MT output to the desired standard
Post-editing effort
• Percentage of edits to be applied to raw MT output to attain the desired standard
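The post-editing effort measure can be approximated as a word-level edit distance between the raw MT output and its post-edited version, normalized by the post-edited length (an HTER-style approximation; the presentation does not specify an exact formula):

```python
# Post-editing effort sketch: word-level Levenshtein distance between
# raw MT output and post-edited text, normalized by post-edited length.

def edit_distance(a, b):
    """Classic Levenshtein distance over token lists."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,       # deletion
                          d[i][j - 1] + 1,       # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n]

def pe_effort(raw_mt, post_edited):
    """Fraction of edits needed to turn raw MT into the post-edited text."""
    mt, pe = raw_mt.split(), post_edited.split()
    return edit_distance(mt, pe) / max(len(pe), 1)

# One substitution ("in" -> "on") plus one insertion ("the") over 6 words:
print(round(pe_effort("the cat sat in mat", "the cat sat on the mat"), 2))  # 0.33
```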
44. Post-editing levels
Gisting
• Volatile content
– Automatic scripts to fix mechanical/recurring errors
Light
• Continuous delivery
– Fixing capitalization and punctuation, replacing unknown words, removing redundant words, ignoring stylistic issues
Full
• Publishing and engine training
– Fixing meaning distortion, fixing grammar and syntax, translating untranslated terms (possibly new terms), adjusting fluency
45. Vetting and training editors
Tests not applicable
• Dedicated or properly filtered vendor base
– Previous experience
– Specific certifications
– Domain expertise
– Ability to follow instructions and style guides
– Ability to process linguistic data
Specific training
• Specific engines
• Clients served
• Instructions
46. Dos
Test before operating
Provide MT samples for negotiation
Negotiate throughput rates
Provide a glossary (with do-not-translate/DNT words)
Provide instructions
Provide feedback forms
47. Don’ts
Use MT to curb the pressure on prices
Process poor MT outputs
Treat post-editing as fuzzy matches
49. Pricing and compensation
Upstream
• Clear-cut predictive scheme
– No fuzzy match scheme
o Fuzzy matches over 85% are considered inherently correct, while MT segments may contain errors and inaccuracies
Downstream
• Measurement of actual work
50. Negotiation grid
Generals
• Engine
– Generic or trained
• Quality
– Raw output
– Expectations
• Formats and formatting
Compensation
• Per-word rate
– Productivity rate
• Hourly rate
– Time tracking
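The two compensation schemes on the grid can be compared with simple arithmetic. The rates and productivity figures below are invented for illustration, not taken from the presentation:

```python
# Comparing per-word compensation (rate x words) against hourly
# compensation (time-tracked hours x hourly rate). All numbers are
# example values.

def per_word_earnings(words, rate_per_word):
    """Earnings under a per-word scheme tied to a productivity rate."""
    return words * rate_per_word

def hourly_earnings(words, words_per_hour, hourly_rate):
    """Earnings under an hourly scheme based on tracked time."""
    return words / words_per_hour * hourly_rate

words = 4000  # post-edited words in a working day (example)
per_word = per_word_earnings(words, rate_per_word=0.03)
hourly = hourly_earnings(words, words_per_hour=800, hourly_rate=25.0)

print(round(per_word, 2))  # 120.0
print(round(hourly, 2))    # 125.0
```

With these example figures, the hourly scheme pays slightly more; which scheme wins in practice depends entirely on the negotiated productivity rate and on accurate time tracking.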