2012 MosesCore GALA Monaco: Friendly Machine Translation
- 2. © 2012 #2
outline
before starting with machine translation
what happens when you go live
how to minimize the risks
practical hints + some numbers
- 3.
is machine translation for us?
<LSP>                    | <tauyou>
translation memories     | open-source corpora
previous documents       | documentation alignment
websites of clients      | public information
language-specific rules  | programming of rules
TAUS data                | terminology extraction
<some issues>
minimum amount of data
need for data classification
language pairs
- 4.
for sure it is!
<data cleaning + selection>
translation tables and language models
data and parameters for tuning
test measures
<engines creation>
several + pruning afterwards
<engine validation>
by professional translators
<continuous improvement>
new files, new corpora, new rules, etc.
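The data cleaning + selection step can be illustrated with a small filter over sentence pairs. This is only a sketch under stated assumptions: the function name `clean_parallel_corpus` and the thresholds `max_tokens` and `max_ratio` are hypothetical and do not come from the slides; real pipelines usually apply more checks.

```python
def clean_parallel_corpus(pairs, max_tokens=80, max_ratio=3.0):
    """Keep sentence pairs that look usable for training.

    pairs: iterable of (source, target) strings.
    max_tokens, max_ratio: illustrative thresholds (assumptions, not from the slides).
    """
    seen = set()
    kept = []
    for src, tgt in pairs:
        # Normalize whitespace so duplicates are detected reliably.
        src, tgt = " ".join(src.split()), " ".join(tgt.split())
        n_src, n_tgt = len(src.split()), len(tgt.split())
        if n_src == 0 or n_tgt == 0:
            continue  # drop pairs with an empty side
        if n_src > max_tokens or n_tgt > max_tokens:
            continue  # drop overly long segments
        if max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio:
            continue  # drop pairs with a suspicious length ratio
        if (src, tgt) in seen:
            continue  # drop exact duplicates
        seen.add((src, tgt))
        kept.append((src, tgt))
    return kept
```

For example, calling it on a list that contains a duplicated pair, a pair with an empty source, and a pair with a 1:4 length ratio would keep only one copy of the duplicate and discard the other two.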
- 5.
the production process (I)
statistical MT decoding
convert file format
segment text
NLP tasks
tokenize
rewrite source
lowercase
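The segment, tokenize, and lowercase steps that run before decoding can be sketched as follows. This is a minimal illustration, assuming simple regular-expression rules; the function names are hypothetical, and a production system would use language-specific segmenters and tokenizers.

```python
import re

def segment(text):
    """Split text into sentences after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Separate words from punctuation marks (naive, language-independent rule)."""
    return re.findall(r"\w+|[^\w\s]", sentence)

def preprocess(text):
    """Return one lowercased, space-separated token string per sentence."""
    return [" ".join(tokenize(s)).lower() for s in segment(text)]
```

Each returned string is in the shape a statistical decoder typically expects: one sentence per entry, with tokens separated by single spaces.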
- 6.
the production process (II)
statistical MT decoding
translated file
reformat
detokenize
rewrite target
uppercase
evaluate
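The detokenize and uppercase steps applied to decoder output can be sketched in the same spirit. The rules below are deliberately naive assumptions: `recase` only restores the sentence-initial capital, whereas a real system would typically use a trained recaser, and the function names are hypothetical.

```python
import re

def detokenize(tokens):
    """Join tokens and reattach punctuation to the neighbouring word."""
    text = " ".join(tokens)
    text = re.sub(r"\s+([.,;:!?)])", r"\1", text)  # no space before closing marks
    text = re.sub(r"\(\s+", "(", text)             # no space after an opening parenthesis
    return text

def recase(sentence):
    """Uppercase the first character only; a stand-in for a trained recaser."""
    return sentence[:1].upper() + sentence[1:]

def postprocess(decoded):
    """Turn a lowercased, tokenized decoder output into readable text."""
    return recase(detokenize(decoded.split()))
```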
- 7.
risk minimization
<tauyou>
quality metrics computation
<LSP>
time and cost analysis
<LSP> + <tauyou>
track the evolution over time
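One possible quality metric is the share of words a translator had to change when post-editing the MT output. The sketch below computes a word-level edit distance normalized by the length of the post-edited text. The slides do not say which metrics are used, so this choice and the names `word_edit_distance` and `post_edit_rate` are assumptions for illustration.

```python
def word_edit_distance(hyp, ref):
    """Levenshtein distance between two sentences, counted in words."""
    h, r = hyp.split(), ref.split()
    prev = list(range(len(r) + 1))  # distances from an empty hypothesis
    for i, hw in enumerate(h, start=1):
        curr = [i]
        for j, rw in enumerate(r, start=1):
            cost = 0 if hw == rw else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]

def post_edit_rate(mt_output, post_edited):
    """Edits per post-edited word; 0.0 means the MT output was left untouched."""
    ref_len = len(post_edited.split())
    return word_edit_distance(mt_output, post_edited) / max(ref_len, 1)
```

Computing this rate per job and storing it with a date is one simple way to track the evolution of an engine over time.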
- 8.
practical hints
bigger clients
languages
with highest translation volumes
with similar structure
with specific terminology/needs
MT-friendly translators
start moving
- 9.
some numbers
more than 1,500 million words per month
in latin languages ES, FR, PT, CA, GA, IT, RO
EN as source or target is the star
ES, FR, DE, PT, IT, DA, SV, ZH, AR, JP...
LSPs are translating more than 3 million words per month
investment pays off if you translate more than 50,000 words per month
- 10.
Thanks!
// Diego Bartolomé, PhD
<address> C/ Les Planes 39 – 08201 Sabadell – Spain
<phone> +34 93 711 29 96
<cell> +34 670 331 225
<email> dbc@tauyou.com
<www> tauyou.com