TAUS USER CONFERENCE 2010
LANGUAGE BUSINESS INNOVATION
4 – 6 OCTOBER / PORTLAND (OR), USA




TUESDAY 5 OCTOBER / 14.45

TURBO-CHARGE RULE-BASED MACHINE
TRANSLATION PRODUCTIVITY BY IMPROVING
SOURCE TEXT
Lori Thicke, Lexcelera
Machine Translation is capable of
increasing speed, lowering costs and
yes, even improving quality.
              … but not out of the box.
Optimization can mean:
 Training the engine
 Improving the engine
 Improving the source
 Correcting the target
What optimizations give you the
“biggest bang”, and how do we
measure this?
Post-Editing productivity
Quality of the raw output → speed of post-editing → cost savings
We asked the question: How can improving the source influence
post-editing productivity using Systran’s Hybrid engine?
 Relevance of each guideline to MT, human translation and non-native speakers
 Impact (1-3)
 Leader in business analytics software and services
 Largest independent business intelligence vendor
 #1 on FORTUNE’s “Best Places to Work in America” list (2010)
Project Setup
Systran 7.0 Hybrid

 Help document, not “acrochecked”

 880 words

 120 SAS glossary terms
 2 versions of the source document:
    o Unedited
    o Edited

 4 scenarios:
    o untrained MT engine, unedited source document
    o untrained MT engine, edited source document
    o trained MT engine, unedited source document
    o trained MT engine, edited source document

 Each file was post-edited separately, and the time was carefully tracked
Results

 Post-editing with both a trained engine and an edited
   source document was 5X faster than traditional translation
1. Use active verbs, avoid the gerund

Example 1: Use verbs to convey the most significant actions in your sentences

Unedited source: Understanding the differences between owned and checked out alerts is critical to understanding SAS® Anti-Money Laundering.
Target: La compréhension des différences entre les alertes possédées et Extraites est critique au SAS® Anti-Money Laundering de compréhension.

Edited source: In order to understand SAS® Anti-Money Laundering, you need to understand the differences between owned alerts and checked out alerts.
Target: Afin de comprendre le SAS® Anti-Money Laundering, vous devez comprendre les différences entre les alertes détenues par un autre utilisateur et les alertes bloquées.

Post-edited target: Afin de comprendre le fonctionnement de SAS® Anti-Money Laundering, vous devez comprendre les différences entre les alertes détenues par un autre utilisateur et les alertes bloquées.
2. Avoid the passive voice

Unedited source: Risk-factor-only alerts can be identified by the Scenario and Triggering Values columns on an alert list window.
Target: Des alertes de type facteur de risque uniquement peuvent être identifiées par le scénario et des colonnes Valeurs de déclenchement sur une fenêtre de listes des alertes.

Edited source: To identify a risk-factor-only alert, the Scenario column of the alert list window displays either ML_Risk or TF_Risk.
Target: Pour identifier une alerte de type facteur de risque uniquement, la colonne Scénario de la fenêtre de listes des alertes montre ML_Risk ou TF_Risk.

Post-edited target: Pour identifier une alerte de type facteur de risque uniquement, la colonne Scénario de la fenêtre de listes des alertes indique ML_Risk ou TF_Risk.
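
One rough way to quantify how much post-editing a raw translation needs is a word-level edit distance between the raw MT output and the post-edited text. The sketch below is illustrative only: the study measured post-editing time, not edit distance, and the function name `word_edit_distance` and the constant names are hypothetical. The comparison against the raw output of the unedited source is only indicative, because the post-edited target corresponds to the edited source.

```python
def word_edit_distance(a: str, b: str) -> int:
    """Levenshtein distance counted in whitespace-separated words."""
    x, y = a.split(), b.split()
    prev = list(range(len(y) + 1))  # distances from the empty prefix of x
    for i, wx in enumerate(x, start=1):
        curr = [i]
        for j, wy in enumerate(y, start=1):
            cost = 0 if wx == wy else 1
            curr.append(min(prev[j] + 1,          # delete a word from x
                            curr[j - 1] + 1,      # insert a word from y
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


# Sentences copied from the "Avoid the passive voice" slide.
RAW_FROM_UNEDITED = (
    "Des alertes de type facteur de risque uniquement peuvent être "
    "identifiées par le scénario et des colonnes Valeurs de déclenchement "
    "sur une fenêtre de listes des alertes."
)
RAW_FROM_EDITED = (
    "Pour identifier une alerte de type facteur de risque uniquement, la "
    "colonne Scénario de la fenêtre de listes des alertes montre ML_Risk "
    "ou TF_Risk."
)
POST_EDITED = (
    "Pour identifier une alerte de type facteur de risque uniquement, la "
    "colonne Scénario de la fenêtre de listes des alertes indique ML_Risk "
    "ou TF_Risk."
)

if __name__ == "__main__":
    # Raw output of the edited source: one word changed (montre -> indique).
    print(word_edit_distance(RAW_FROM_EDITED, POST_EDITED))
    # Raw output of the unedited source: a much larger rewrite.
    print(word_edit_distance(RAW_FROM_UNEDITED, POST_EDITED))
```

A word-level distance is used here rather than a character-level one because post-editors mostly replace or move whole words.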
3. Begin with the prepositional phrase

Unedited source: Click Check Out in that alert's Availability column on the Available Alerts window.
Target: Le clic Extraient dans la colonne Disponibilité de cette alerte sur la fenêtre Alertes disponibles.

Edited source: In the Available Alerts window, click Check Out in the alert's Availability column.
Target: Dans la fenêtre Alertes disponibles, le clic Extraient dans la colonne Disponibilité de l'alerte.

Post-edited target: Dans la fenêtre Alertes disponibles, cliquez sur Extraire dans la colonne Disponibilité de l'alerte.
4. Use short sentences with 1 idea

Unedited source: Alerts are displayed on alert list windows, which provide tools and information to aid users as they determine whether alerts represent suspicious activity that should be reported to authorities.
Target: Des alertes sont montrées sur les fenêtres de listes des alertes, qui fournissent des outils et des informations aux utilisateurs d'aide pendant qu'elles déterminent si les alertes représentent l'activité suspecte qui devrait être rapportée aux autorités.

Edited source: Alerts are displayed in alert list windows. The alert list windows provide tools and information that help users determine whether alerts indicate suspicious activity that should be reported to authorities.
Target: Des alertes sont montrées dans des fenêtres de listes des alertes. Les fenêtres de listes des alertes fournissent les outils et les informations qui aident des utilisateurs à déterminer si les alertes indiquent l'activité suspecte qui devrait être rapportée aux autorités.

Post-edited target: Les alertes s’affichent dans des fenêtres de listes des alertes. Les fenêtres de listes des alertes fournissent les outils et les informations qui aident des utilisateurs à déterminer si les alertes indiquent une activité suspecte qui devrait être signalée aux autorités.
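
Some of the guidelines above (avoid the gerund, avoid the passive voice, use short sentences) can be roughly approximated with an automated pre-check of the source text. The sketch below is not the tooling used in the study: the regular expressions are crude heuristics, the 25-word limit `MAX_WORDS` is an assumed threshold, and all function names are hypothetical.

```python
import re

MAX_WORDS = 25  # assumed limit for a "short" sentence; not from the slides

# "be" verb followed by a word ending in -ed/-en; misses irregular participles.
PASSIVE = re.compile(
    r"\b(?:is|are|was|were|be|been|being)\s+\w+(?:ed|en)\b", re.IGNORECASE
)
# First word ends in -ing; also flags words such as "During" or "String".
GERUND_START = re.compile(r"^\w+ing\b", re.IGNORECASE)


def split_sentences(text: str) -> list[str]:
    """Naive split on whitespace that follows ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def check_sentence(sentence: str) -> list[str]:
    """Return the guideline warnings that apply to one sentence."""
    issues = []
    if GERUND_START.match(sentence):
        issues.append("starts with a gerund")
    if PASSIVE.search(sentence):
        issues.append("possible passive voice")
    if len(sentence.split()) > MAX_WORDS:
        issues.append("long sentence")
    return issues


if __name__ == "__main__":
    sample = (
        "Risk-factor-only alerts can be identified by the Scenario and "
        "Triggering Values columns on an alert list window. "
        "In the Available Alerts window, click Check Out in the alert's "
        "Availability column."
    )
    for sentence in split_sentences(sample):
        print(check_sentence(sentence), "|", sentence)
```

Warnings from a check like this are prompts for a human editor, not automatic rewrites: the edited sentence "Alerts are displayed in alert list windows." still triggers the passive heuristic, for example.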
How long to post-edit 880 words?

Post-editing time (in min.):
 no MT training, no source editing: 55
 no MT training, with source editing: 50
 with MT training, no source editing: 33
 with MT training, with source editing: 24
… compared to a traditional translation
[Bar chart; vertical axis from 0 to 120. Bar values not recoverable.]
Conclusions
 + A human translator = 120 minutes
 + Untrained MT is 2X faster
 ++ Trained MT is 4X faster
 +++ Trained MT with source control is 5X faster
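
The speed-up factors in the conclusions can be recomputed from the 120-minute human baseline and the four post-editing times on the chart slide. This is a small sketch using only those figures; the function and variable names are hypothetical. The unrounded ratios come out at about 2.2X and 2.4X for the untrained engine and 3.6X for the trained engine without source editing, which the slide appears to round to 2X and 4X.

```python
WORDS = 880          # length of the help document
HUMAN_MINUTES = 120  # traditional translation, from the conclusions slide

# Post-editing times in minutes, from the chart slide.
POST_EDIT_MINUTES = {
    ("untrained", "unedited"): 55,
    ("untrained", "edited"): 50,
    ("trained", "unedited"): 33,
    ("trained", "edited"): 24,
}


def speedup(minutes: float) -> float:
    """How many times faster than the human-translation baseline."""
    return HUMAN_MINUTES / minutes


def words_per_hour(minutes: float) -> float:
    """Throughput for the 880-word document."""
    return WORDS * 60 / minutes


if __name__ == "__main__":
    for (engine, source), minutes in POST_EDIT_MINUTES.items():
        print(f"{engine:9} engine, {source:8} source: {minutes} min, "
              f"{speedup(minutes):.1f}X, {words_per_hour(minutes):.0f} words/h")
```

On these figures, source editing saves 5 of 55 minutes with the untrained engine and 9 of 33 minutes with the trained engine, so the two optimizations appear to reinforce each other.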
A final word (or two) about MT Quality
“Contrary to all expectations, using MT in Bentley has improved the translation quality in the pilot projects.”

French OLH reviewer: “I give a 9…I find this translation very good…I found it better than the translations I used to see before.”

German courseware reviewer: “It was the best translation of courseware I ever read.”

Questions?
lori@lexcelera.com