Apertium: Free/open-source rule-based machine
translation and language processors
Mikel L. Forcada
Universitat d'Alacant, E-03690 Sant Vicent del Raspeig, Spain
Riga TAUS Roundtable, June 1, 2016
What is Apertium?
Apertium (since 2005) is
a free/open-source platform for shallow-transfer rule-based machine
translation
which is collaboratively developed
and provides:
A configurable, language-independent machine translation engine,
Data (dictionaries, rules) for more than 40 language pairs (in XML
and text-based formats), and
lots of tools for developers and users.
Pipeline architecture
A pipelined architecture allows for easy customization and diagnostics.
SL text → deformatter → morph. analyser → morph. disambig. → lexical transfer → lexical selection → structural transfer → morph. generator → post-generator → reformatter → TL text
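The pipelined design can be pictured as a chain of text-to-text stages, where any stage can be swapped out and every intermediate result can be inspected. The following is only a minimal sketch of that idea: the stage names come from the slide, but the stage bodies are toy placeholders (assumptions for illustration), not Apertium's actual modules or data formats.

```python
from typing import Callable, Dict, List, Tuple

# Stage names follow the slide; the order is the usual Apertium stage order.
STAGE_ORDER: List[str] = [
    "deformatter", "morph. analyser", "morph. disambig.",
    "lexical transfer", "lexical selection", "structural transfer",
    "morph. generator", "post-generator", "reformatter",
]


def identity(text: str) -> str:
    """Placeholder stage: passes the text through unchanged."""
    return text


def default_stages() -> Dict[str, Callable[[str], str]]:
    """Build a pipeline in which every stage is a placeholder."""
    return {name: identity for name in STAGE_ORDER}


def run_pipeline(
    stages: Dict[str, Callable[[str], str]], text: str
) -> Tuple[str, List[Tuple[str, str]]]:
    """Run the stages in order, recording each intermediate output."""
    trace: List[Tuple[str, str]] = []
    for name in STAGE_ORDER:
        text = stages[name](text)
        trace.append((name, text))  # diagnostics: output after this stage
    return text, trace


# Customization: replace a single stage without touching the others.
# The one-word "dictionary" below is a toy stand-in, not real language data.
stages = default_stages()
stages["lexical transfer"] = lambda t: t.replace("casa", "house")

output, trace = run_pipeline(stages, "la casa")
for stage_name, intermediate in trace:
    print(f"{stage_name:20} {intermediate}")
```

Because every stage has the same text-in, text-out shape, a faulty stage can be located by reading the trace top to bottom and finding the first line where the intermediate text goes wrong.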
Languages and language pairs
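(Figure: graph of available language pairs. Language codes shown: afr, nld, arg, cat, ita, bre, fra, spa, cym, eng, glg, dan, nno, nob, ast, por, ron, epo, eus, hbs, mkd, slv, bul, ind, zsm, isl, swe, kaz, tat, mlt, ara, oci, sme, urd, hin)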
Apertium loves small languages
Some unique MT systems for small languages:
Breton→French Aragonese↔Spanish
Occitan↔Catalan Aragonese↔Catalan
Occitan↔Spanish North Sámi→Norwegian
To love is to give: e.g. provide small languages with
language resources, and
computational-linguistic descriptions of their language.
What is Apertium good for?
Apertium is basically good at translating between related languages. Some
examples in Apertium:
Spanish ↔ Portuguese
Norwegian Nynorsk ↔ Norwegian Bokmål
Slovenian ↔ Croatian
Tatar ↔ Kazakh
Postediting Apertium output in these cases may save time compared to
translation from scratch.
It is also being used for less-related language pairs in gisting applications.
Apertium is collaboratively developed
Apertium licensing: free/open-source
Apertium language data and code are both licensed under the GNU
General Public License:
a free/open-source license allowing free distribution of unmodified and
modified versions
a copylefted license: it avoids private appropriation and encourages
giving improvements back to the project (a commons) → community
Apertium is collaboratively developed
Very active group of hundreds of developers (freelance developers,
researchers, industrial partners).
Wiki documentation (wiki.apertium.org) in addition to formal
documents.
Help available at IRC channel #apertium in freenode.net
Mailing lists: apertium-stuff@lists.sf.net and other
language-specific lists
Research and business with Apertium
Apertium is already an active research and business platform:
Research: 40+ publications, 2 PhD theses, 4 master's theses
Business: companies (Prompsit, Eleka, Imaxin Software, etc.)
offering services to customers such as Autodesk, the Government of
Catalonia, one of the main Basque banks, the daily newspaper La Voz
de Galicia, etc.
The free/open-source model creates a community which effectively
connects researchers, developers, vendors and users.
Becoming an Apertium user
Professional translators can:
use Apertium offline plugins in the OmegaT free/open-source CAT
environment.
(as with any other system) easily align source and MT to generate
machine translation memories to feed into other CAT systems
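As an illustration of the second point, aligned source segments and their machine translations can be written out as a TMX file, the interchange format most CAT tools import. This is a minimal sketch under stated assumptions: the segments are already aligned one-to-one, the MT output is supplied as a plain list (no translation engine is called here), and the function name `build_tmx`, the tool name in the header, and the sample sentences are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

# ElementTree serializes this qualified name as the attribute "xml:lang".
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(src_segs: List[str], mt_segs: List[str],
              src_lang: str, tgt_lang: str) -> ET.Element:
    """Pair each source segment with its MT output as one TMX translation unit."""
    if len(src_segs) != len(mt_segs):
        raise ValueError("source and MT segments must be aligned one-to-one")
    tmx = ET.Element("tmx", {"version": "1.4"})
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",   # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src, mt in zip(src_segs, mt_segs):
        tu = ET.SubElement(body, "tu")
        for lang, seg_text in ((src_lang, src), (tgt_lang, mt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = seg_text
    return tmx


# Hand-written stand-in sentences, not actual MT output.
tmx_root = build_tmx(["Bon dia."], ["Buenos días."], "ca", "es")
tmx_text = ET.tostring(tmx_root, encoding="unicode")
print(tmx_text)
```

The resulting string can be saved with a `.tmx` extension and loaded into a CAT tool as a translation memory, so that the MT suggestions appear next to matches from human translation.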
Muggles can use:
a stand-alone Java application for the desktop: apertium-caffeine
an Android version for handhelds
a stand-alone version (Apertium Simpleton) for Windows and MacOS.
a plug-in for the OmegaT CAT platform: apertium-omegat
Becoming an Apertium developer
It's easy to become an Apertium developer. It just takes
reasonable computing skills (XML, shell commands, etc.), which are
not too hard to acquire,
good translation skills.
In no time, developers find themselves contributing to a language pair with
the support of the community.
A nice side effect: monolingual resources
When developing a language pair, monolingual language resources are
developed, such as
morphological dictionaries
morphological disambiguation rules and probabilities
The corresponding monolingual processors are available to help statistical
machine translation deal, for instance, with languages having a challenging
morphology.
Success cases
Apertium is a mature technology which is used:
in Wikimedia Content Translation to generate Wikipedia content in
other languages,
to produce a Catalan edition of Valencia daily newspaper
Levante-EMV,
by universities in the Catalan-speaking area to help in the generation
of courseware and academic information,
in PLATA, the Spanish government platform for on-the-fly webpage
machine translation of public-service webpages.
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
 
Dreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio IIIDreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio III
 
Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)
 
Air breathing and respiratory adaptations in diver animals
Air breathing and respiratory adaptations in diver animalsAir breathing and respiratory adaptations in diver animals
Air breathing and respiratory adaptations in diver animals
 
BDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort Service
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
 
Aesthetic Colaba Mumbai Cst Call girls 📞 7738631006 Grant road Call Girls ❤️-...
Aesthetic Colaba Mumbai Cst Call girls 📞 7738631006 Grant road Call Girls ❤️-...Aesthetic Colaba Mumbai Cst Call girls 📞 7738631006 Grant road Call Girls ❤️-...
Aesthetic Colaba Mumbai Cst Call girls 📞 7738631006 Grant road Call Girls ❤️-...
 
Sector 62, Noida Call girls :8448380779 Noida Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Noida Escorts | 100% verifiedSector 62, Noida Call girls :8448380779 Noida Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Noida Escorts | 100% verified
 
Dreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video TreatmentDreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video Treatment
 
lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.
 
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, YardstickSaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
 

Apertium: Free/open-source rule-based machine translation and language processors, Mikel L. Forcada, Universitat d'Alacant, Spain

  • 1. Apertium: Free/open-source rule-based machine translation and language processors. Mikel L. Forcada, Universitat d'Alacant, E-03690 Sant Vicent del Raspeig, Spain. Riga TAUS Roundtable, June 1, 2016.
  • 2. What is Apertium? Apertium (since 2005) is a free/open-source platform for shallow-transfer rule-based machine translation which is collaboratively developed and provides: a configurable, language-independent machine translation engine; data (dictionaries, rules) for more than 40 language pairs (in XML and text-based formats); and lots of tools for developers and users.
  • 3. What is Apertium? Pipeline architecture. A pipelined architecture allows for easy customization and diagnostics. Pipeline (figure): SL text → deformatter → morph. analyser → morph. disambig. → lexical transfer → lexical selection → structural transfer → morph. generator → post-generator → reformatter → TL text.
  • 4. What is Apertium? Languages and language pairs. Language codes shown in the figure (a graph of language pairs): afr, nld, arg, cat, ita, bre, fra, spa, cym, eng, glg, dan, nno, nob, ast, por, ron, epo, eus, hbs, mkd, slv, bul, ind, zsm, isl, swe, kaz, tat, mlt, ara, oci, sme, urd, hin.
  • 5. What is Apertium? Apertium loves small languages. Some unique MT systems for small languages: Breton→French, Aragonese↔Spanish, Occitan↔Catalan, Aragonese↔Catalan, Occitan↔Spanish, North Sámi→Norwegian. To love is to give: e.g. provide small languages with language resources and computational-linguistic descriptions of their language.
  • 6. What is Apertium good for? Apertium is basically good at translating between related languages. Some examples in Apertium: Spanish ↔ Portuguese, Norwegian Nynorsk ↔ Norwegian Bokmål, Slovenian ↔ Croatian, Tatar ↔ Kazakh. Postediting Apertium output in these cases may save time compared to translation from scratch. It is also being used for less-related language pairs in gisting applications.
  • 7. Apertium is collaboratively developed. Apertium licensing: free/open-source. Apertium language data and code are both licensed under the GNU General Public License: a free/open-source license allowing free distribution of unmodified and modified versions; a copylefted license: it avoids private appropriation and encourages giving improvements back to the project (a commons) → community.
  • 8. Apertium is collaboratively developed. Very active group of hundreds of developers (freelance developers, researchers, industrial partners). Wiki documentation (wiki.apertium.org) in addition to formal documents. Help available on the IRC channel #apertium at freenode.net. Mailing lists: apertium-stuff@lists.sf.net and other language-specific lists.
  • 9. Apertium is collaboratively developed. Research and business with Apertium. Apertium is already an active research and business platform. Research: 40+ publications, 2 PhD theses, 4 master's theses. Business: companies (Prompsit, Eleka, Imaxin Software, etc.) offering services to customers such as Autodesk, the Government of Catalonia, one of the main Basque banks, the daily newspaper La Voz de Galicia, etc. The free/open-source model creates a community which effectively connects researchers, developers, vendors and users.
  • 10. Becoming an Apertium user. Professional translators can: use Apertium offline plugins in the OmegaT free/open-source CAT environment; (as with any other system) easily align source and MT to generate machine translation memories to feed into other CAT systems. Muggles can use: a stand-alone Java application for the desktop (apertium-caffeine); an Android version for handhelds; a stand-alone version (Apertium Simpleton) for Windows and MacOS; a plug-in for the OmegaT CAT platform (apertium-omegat).
  • 11. Becoming an Apertium developer. It's easy to become an Apertium developer. It just takes reasonable computing skills (XML, shell commands, etc.), which are not too hard to acquire, and good translation skills. In no time, developers find themselves contributing to a language pair with the support of the community.
  • 12. A nice side effect: monolingual resources. When developing a language pair, monolingual language resources are developed, such as morphological dictionaries and morphological disambiguation rules and probabilities. The corresponding monolingual processors are available to help statistical machine translation deal, for instance, with languages having a challenging morphology.
  • 13. Success cases. Apertium is a mature technology which is used: in Wikimedia Content Translation to generate Wikipedia content in other languages; to produce a Catalan edition of the Valencia daily newspaper Levante-EMV; by universities in the Catalan-speaking area to help in the generation of courseware and academic information; in PLATA, the Spanish government platform for on-the-fly machine translation of public-service webpages.
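
The pipeline idea from slide 3 can be illustrated with a minimal Python sketch. This is a toy, not Apertium code: the three stages, the tiny dictionaries, the tag notation and the names `analyse`, `lexical_transfer`, `generate` and `run` are all invented for illustration, and most real stages (deformatter, disambiguation, lexical selection, structural transfer, post-generator, reformatter) are left out. It only shows why chaining small text-to-text stages makes diagnostics easy: every intermediate result can be inspected.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[str], str]]

# Toy data, invented for illustration only (not real Apertium dictionaries).
ANALYSES = {"cats": "cat<n><pl>", "sleep": "sleep<vblex><pres>"}
BILINGUAL = {"cat": "gato", "sleep": "dormir"}
GENERATION = {"gato<n><pl>": "gatos", "dormir<vblex><pres>": "duermen"}


def analyse(text: str) -> str:
    """Surface forms -> lexical forms (lemma plus tags); unknown words get '*'."""
    return " ".join(ANALYSES.get(w, "*" + w) for w in text.split())


def lexical_transfer(text: str) -> str:
    """Replace each source lemma with a target lemma, keeping the tags."""
    out = []
    for unit in text.split():
        lemma, sep, tags = unit.partition("<")
        out.append(BILINGUAL.get(lemma, lemma) + sep + tags)
    return " ".join(out)


def generate(text: str) -> str:
    """Lexical forms -> target surface forms; unknown forms pass through."""
    return " ".join(GENERATION.get(u, u) for u in text.split())


# The pipeline is just an ordered list of named stages.
PIPELINE: List[Stage] = [
    ("analyser", analyse),
    ("lexical transfer", lexical_transfer),
    ("generator", generate),
]


def run(text: str, stages: List[Stage] = PIPELINE) -> List[Tuple[str, str]]:
    """Run the stages in order, recording each intermediate result."""
    trace = []
    for name, stage in stages:
        text = stage(text)
        trace.append((name, text))
    return trace


if __name__ == "__main__":
    # Print the output of every stage, which is the diagnostic benefit.
    for name, output in run("cats sleep"):
        print(f"{name:>16}: {output}")
```

Because each stage only reads and writes text, a stage can be swapped, removed or inspected in isolation, which is the customization and diagnostics benefit the slide points to.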