Getting the Most from MT + PE

Getting the Most
from MT + PE
Reducing turn-around times and maximizing cost-
effectiveness

Overview
 Laying foundations
 Defining requirements
 Arranging the platform
 Running projects

Laying foundations
Devising strategies

Modes of use
Fully
automatic
High
quality
Unrestricted
texts
Restricted
input
Impractical
Low
quality Interactive

Cobbler, stick to thy
last
Zapatero, a tus zapatos
Schuster, bleib bei deinem Leisten!
Schoenmaker, blijf bij je leest
Ne supra crepidam sutor iudicaret

Not exactly
child’s play
 Technology
 Data
 Skills

A typical
translation
industry
mindset

Would you ask
a barber if you
need a haircut?

3 tips for
getting started
1. Recap your goals and requirements
2. Hire an independent consultant
3. Qualify your business processes for MT

Defining
requirements
KISS, or Keep It Short and Simple

Building blocks
 Scope
 Goals
 Expectations

 E-services
• Knowledge bases, assistance-and-support
pages, intranets
• Real-time communication
 Productivity
• User support, technical and user
documentation
 Intelligence
• Text mining and analysis, research, CRM,
patents
 Communication
• Emails, messaging, chats
Scope

 Reduce cost
• By cutting labor
 Boost productivity
• Greater volumes, faster delivery
 Improve consistency
• Of terminology and style
 Reach a global audience
Goals

Expectations
 Financial
• Develop business
• Increase revenue
• Increase profits
 Business
• Expand offering
• Increase service levels
• Streamline processes
 Performance
• Boost productivity
• Deliver faster
• Offer higher quality

Requirements
 Spending cap
 Timeline
 Technology
 Security
 Expertise
 Reporting & analytics
 Support

Trust the
consultant
 Reconcile goals and expectations
 Outline an exploratory program
 Benchmark performance
 Prepare the specs
 Draft a SOW
 Write the RFP
 Vet vendors
 Prepare your data
 Revise business and pricing models
 Retrofit processes

Pick the right
vendor
Not all beers are created equal.
(Not all vendors are created equal.)

Prepare
 Data
 Configuration plan
 Training/recruiting program

Be realistic MT grants no immunity to price pressure

Note the trend
More and more, intermediaries are cut out

Building the
platform
Selecting vendors, completing set-up, training staff,
testing

Givens
 Not all engines are created equal
 Raw output can vary across systems—
and language pairs
 Errors may not follow a consistent
pattern
 Engine performance also varies

Set-up
 Data
• Maintenance
 Customized engine
• +50,000 segments
 Tool settings
• Sub-segment recall
• Fuzzy match repair

In-house or
outsourced?
 Total cost of ownership
 Integration
 In-house expertise
 Confidentiality
 Intellectual property issues

Best practices
Running projects

Key questions
 Buyer or vendor?
 Dos & Don’ts?
 How do I deal with data?
 How do I assess quality?
 How do I hire staff?
 What about post-editing?

Dos
 Know your data
 Consider training necessary
 Leverage quality evaluation metrics
 Define AQLs
 Plan for continuous improvement
 Arrange for post-editing
 Devise a compensation scheme

Don’ts
 Treat all content equally
 DIY
 Rely on vendors only
 Mess with data
 Trust one single metric
 Rush
 Mess with staff
 Expect miracles

And remember:
Tell the customer you are using MT
So you won’t get sued

The fuel Output is only as good as the data used

Quantity and
quality
 1,000,000 words/50,000 segments
• No contiguous/inclusive domains
 More data  higher quality
• Good data

Good data
 Few reliable sources
 Single domain
 Current data
 Same encoding
 No empty segments
 No errors
 Terminologically consistent segments
 Same style
 Same-length segments

The output Accept that output is unpredictable

Automatic
metrics
Use all available automatic metrics

Post-editing:
expectations
1. Fast
2. Unchallenging
3. Flowing

Post-editing:
measures
 Edit Time
• The time required to get a raw MT output to
the desired standard
 Post-editing effort
• Percentage of edits to be applied to raw MT
output to attain the desired standard

Can only be computed downstreamEdit time

Post‐editing
effort
 Probabilistic forecasts
• Based on automatic metrics
 Depending on
• Post‐editing level
• Volume
• Turn‐around time

Post-editing
levels
 Gisting
• Volatile content
– Automatic scripts to fix mechanical/recurring errors
 Light
• Continuous delivery
– Fixing capitalization and punctuation, replacing unknown
words, removing redundant words, ignoring stylistic
issues
 Full
• Publishing and engine training
– Fixing meaning distortion, fixing grammar and syntax,
translating untranslated terms (possibly new terms),
adjusting fluency

Vetting and
training editors
 Tests not applicable
• Dedicated or properly-filtered vendor base
– Previous experience
– Specific certifications
– Domain expertise
– Ability to follow instructions and style guides
– Ability to process linguistic data
 Specific training
• Specific engines
• Clients served
• Instructions

Dos
 Test before operating
 Provide MT samples for negotiation
 Negotiate throughput rates
 Provide glossary (with DNT words)
 Provide instructions
 Provide feedback forms

Don’ts
 Use MT to curb the pressure on prices
 Process poor MT outputs
 Treat post-editing as fuzzy matches

Post-editing
instructions
 Tool selection
 Environment setup
 General references
 Conventions
 Project details
 Pricing model
 Operating instructions

Pricing and
compensation
 Upstream
• Clear-cut predictive scheme
– No fuzzy match scheme
o Fuzzy match over 85% are inherently correct while MT segments may
contain errors and inaccuracies
 Downstream
• Measurement of actual work

Negotiation
grid
 Generals
• Engine
– Generic or trained
• Quality
– Raw output
– Expectations
• Formats and formatting
 Compensation
• Per-word rate
– Productivity rate
• Hourly rate
– Time tracking

Automatic
processing
 Empty and/or untranslated segments
 Duplications
 Punctuation, diacritics, extra spaces,
noise
 Numbers, dates, weights, measures
 Terminology
 Spellcheck

Automatic tasks
Pre-processing
 Segmentation
 Normalization
 Formatting
 Terminology
Post-processing
 Encoding
 Normalization
 Formatting (tag injection)
 Terminology

Getting the Most from MT + PE

Recommandé

Recommandé

Contenu connexe

Similaire à Getting the Most from MT + PE

Similaire à Getting the Most from MT + PE (20)

Plus de Luigi Muzii

Plus de Luigi Muzii (20)

Dernier

Dernier (20)

Getting the Most from MT + PE