TM Town - TAUS Tokyo Forum 2015

TAUS - The Language Data Network

Kevin Dias from TM-Town presents at the TAUS Tokyo Executive Forum on 9-10 April, 2015

TAUS Executive Forum 2015
April 9-10, Tokyo

TM-Town's mission is to create a better translation world through technology and specialization.

Job matching based on natural language processing of prior work
Incoming translation job
Matching algorithm
1. Work is loaded
2. Loaded work can be leveraged
3. The system "learns" each translator's areas of experience
4. Better matching
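
The four steps above can be sketched in a few lines of Python. The slides do not say which NLP technique TM-Town uses, so this is only a minimal illustration under an assumption: each translator's prior work is reduced to a TF-IDF term profile, and an incoming job is ranked against those profiles by cosine similarity. All names here (`build_profiles`, `rank_translators`, the sample data) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a stand-in for real NLP preprocessing."""
    return re.findall(r"[a-z]+", text.lower())


def build_profiles(prior_work):
    """Steps 1-3: load prior work and 'learn' a term profile per translator.

    prior_work maps a translator name to a list of previously translated
    segments. Returns (profiles, idf), where profiles maps each translator
    to a TF-IDF weighted term vector.
    """
    counts = {name: Counter(t for seg in segs for t in tokenize(seg))
              for name, segs in prior_work.items()}
    n = len(counts)
    df = Counter(term for c in counts.values() for term in c)
    # Smoothed IDF: terms shared by every translator keep a small weight.
    idf = {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}
    profiles = {name: {t: tf * idf[t] for t, tf in c.items()}
                for name, c in counts.items()}
    return profiles, idf


def cosine(a, b):
    """Cosine similarity between two sparse term vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_translators(job_text, profiles, idf):
    """Step 4: rank translators by similarity between the job and their profile."""
    job = {t: tf * idf.get(t, 0.0)
           for t, tf in Counter(tokenize(job_text)).items()}
    scores = {name: cosine(job, vec) for name, vec in profiles.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up sample data for illustration only.
prior_work = {
    "translator_a": ["The patient received a dose of the drug.",
                     "Clinical trial results for the new drug."],
    "translator_b": ["The contract shall terminate upon breach.",
                     "The parties agree to the terms of the contract."],
}
profiles, idf = build_profiles(prior_work)
ranking = rank_translators("Dose adjustments in the clinical trial", profiles, idf)
print(ranking[0][0])  # the translator whose prior work is closest to the job
```

In this sketch the ranking step only needs the term profiles, not the original segments.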

A bank for your linguistic assets
1. Safekeeping
2. Access anywhere
3. Earn interest

For translators
• A place to store and manage your TMs and glossaries, and do cool and useful things with them
• A place to potentially connect with “spot on” clients simply by allowing your prior work to speak for itself, without a word of it being disclosed
For translation buyers
• A place to connect with “spot on” translators based on the material you need translated, without needing to disclose the work before selection
• As the work is done by specialists, and the matching process is more automated and more accurate, benefits may be obtained in the areas of quality, price and turnaround
For translation companies
• A tool to enable your project managers or vendor managers to more quickly, more easily and more accurately select the best people for new translation jobs

A translation enablement platform - what’s that?
• Unlimited private storage of TMs & glossaries
• TM & glossary analytics
• Term extraction
• Automatic alignment
• Easy file conversion (TMX, XLIFF, XLS, CSV)
• Ability to share term glossaries
• A powerful API
• Integration with CAT / TEnT tools
• Job matching on the basis of your prior work

Dropbox for translators with added benefits

Automatic term extraction and glossary creation

Productivity analysis

A powerful API
• Easy integration with other CAT tools
• Well-documented
• Public and private endpoints

API integration in CAT tools

Public profile

Privacy
• Three pieces of metadata are public:
  1. Language pair(s)
  2. Field(s) of expertise you select
  3. Number of translation units or term concepts
• The content of any document you upload is automatically private and secure

Segmentation - why reinvent the wheel?
• Most segmentation libraries are built to support only English (or English plus a few other languages)
• Current solutions do not handle ill-formatted content well
• Some libraries perform really well when trained with data in a specific language and a specific domain, but what happens when your data could come from any language and/or domain?
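
To make the first bullet concrete, the sketch below contrasts a naive splitter (break after ".", "!" or "?" followed by whitespace) with a small rule-based one. It is an illustration only and does not reproduce how any particular library works; the abbreviation list and the sample sentences are assumptions chosen for the example.

```python
import re

# Illustrative abbreviation list; a real rule set would be far larger.
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "a.m.", "p.m."}


def naive_split(text):
    """Split after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def rule_based_split(text):
    """Split on sentence-final punctuation, skipping known abbreviations
    and treating the CJK full stop as a boundary even without whitespace."""
    tokens = text.replace("。", "。 ").split()
    sentences, current = [], []
    for token in tokens:
        current.append(token)
        ends_sentence = token.endswith((".", "!", "?", "。"))
        if ends_sentence and token.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing text without final punctuation
        sentences.append(" ".join(current))
    return sentences


english = "Dr. Smith arrived at 5 p.m. yesterday. He left early."
japanese = "これはペンです。それは本です。"

print(naive_split(english))        # breaks after "Dr." and "p.m." as well
print(naive_split(japanese))       # finds no boundary at all
print(rule_based_split(english))   # two sentences
print(rule_based_split(japanese))  # two sentences
```

Even the rule-based version is fragile: an abbreviation that really does end a sentence ("He left at 5 p.m. She stayed.") is not split, which is the kind of case a dedicated segmenter has to cover.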

A comparison of segmentation libraries
Name                 Language  License    Golden Rule Score (English)  Golden Rule Score (Other Languages)  Speed
Pragmatic Segmenter  Ruby      MIT        98.08%                       100.00%                              3.84 s
TactfulTokenizer     Ruby      GNU GPLv3  65.38%                       48.57%                               46.32 s
OpenNLP              Java      APLv2      59.62%                       45.71%                               1.27 s
Stanford CoreNLP     Java      GNU GPLv3  59.62%                       31.43%                               0.92 s
Splitta              Python    APLv2      55.77%                       37.14%                               N/A
Punkt                Python    APLv2      46.15%                       48.57%                               1.79 s
SRX English          Ruby      GNU GPLv3  30.77%                       28.57%                               6.19 s
Scapel               Ruby      GNU GPLv3  28.85%                       20.00%                               0.13 s
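
The slides do not define how the Golden Rule Score is calculated. A plausible reading, assumed here, is the percentage of test cases ("golden rules") for which a segmenter returns exactly the expected segments. The percentages in the table are consistent with 52 English rules and 35 rules for other languages (for example 51/52 = 98.08% and 17/35 = 48.57%), although the slides do not state these counts. The rules and the `period_splitter` function below are made up for illustration and are not the benchmark's actual rule set.

```python
import re


def golden_rule_score(segmenter, rules):
    """Percentage of rules for which segmenter(text) equals the expected segments."""
    passed = sum(1 for text, expected in rules if segmenter(text) == expected)
    return round(100.0 * passed / len(rules), 2)


def period_splitter(text):
    """Naive baseline: split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


# Hypothetical rules: (input text, expected segments).
rules = [
    ("Hello World. My name is Jonas.",
     ["Hello World.", "My name is Jonas."]),
    ("What is your name? My name is Jonas.",
     ["What is your name?", "My name is Jonas."]),
    ("My name is Jonas E. Smith.",
     ["My name is Jonas E. Smith."]),
    ("Please turn to p. 55.",
     ["Please turn to p. 55."]),
]

score = golden_rule_score(period_splitter, rules)
print(score)  # → 50.0 (the initial "E." and the abbreviation "p." are split wrongly)
```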

Bitext alignment - areas for improvement
• Early misalignment compounds into errors throughout
• Accuracy may suffer for non-Roman languages unless the algorithm is properly tuned
• Does not handle cross alignments or uneven alignments

A method for higher accuracy
• Machine translate A to B and B to A
• Relative sentence length
• Order or position in the document
[Figure: grid with indices 0-5 on both axes and X marks in individual cells]
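
One way the three signals listed above could be combined is sketched below. The slides do not describe the actual scoring, so the weights, the Jaccard word overlap and the greedy best-match step are all assumptions, and the two `mt_*` callables are hypothetical stand-ins for a real machine translation service.

```python
def token_overlap(a, b):
    """Jaccard overlap between the lowercase word sets of two sentences."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def pair_score(i, j, src, tgt, mt_src_to_tgt, mt_tgt_to_src,
               weights=(0.6, 0.2, 0.2)):
    """Score source sentence i against target sentence j using three signals."""
    # Signal 1: machine translate A to B and B to A, compare with the other side.
    forward = token_overlap(mt_src_to_tgt(src[i]), tgt[j])
    backward = token_overlap(mt_tgt_to_src(tgt[j]), src[i])
    mt_sim = (forward + backward) / 2

    # Signal 2: relative sentence length (share of the document's characters).
    rel_src = len(src[i]) / max(sum(len(s) for s in src), 1)
    rel_tgt = len(tgt[j]) / max(sum(len(t) for t in tgt), 1)
    longest = max(rel_src, rel_tgt)
    length_sim = 1 - abs(rel_src - rel_tgt) / longest if longest else 1.0

    # Signal 3: relative position in the document (0.0 = first, 1.0 = last).
    pos_src = i / max(len(src) - 1, 1)
    pos_tgt = j / max(len(tgt) - 1, 1)
    position_sim = 1 - abs(pos_src - pos_tgt)

    w_mt, w_len, w_pos = weights
    return w_mt * mt_sim + w_len * length_sim + w_pos * position_sim


def align(src, tgt, mt_src_to_tgt, mt_tgt_to_src):
    """For each source sentence, pick the index of the best-scoring target sentence."""
    return [max(range(len(tgt)),
                key=lambda j: pair_score(i, j, src, tgt, mt_src_to_tgt, mt_tgt_to_src))
            for i in range(len(src))]


# Toy data: the last two target sentences are swapped (a cross alignment).
src = ["Good morning.", "The cat sleeps on the sofa.", "Thank you."]
tgt = ["Buenos días.", "Gracias.", "El gato duerme en el sofá."]

# Lookup tables standing in for machine translation (hypothetical).
en_to_es = {src[0]: tgt[0], src[1]: tgt[2], src[2]: tgt[1]}
es_to_en = {es: en for en, es in en_to_es.items()}

alignment = align(src, tgt, lambda s: en_to_es[s], lambda s: es_to_en[s])
print(alignment)  # → [0, 2, 1]
```

Because every pair is scored on its own, a wrong match early in the document does not shift the matches that follow, and crossed pairs can still be found. This sketch does not attempt uneven (one-to-many) alignments.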

kevin@tm-town.com
a better translation world through
technology and specialization
Kevin Dias
