SlideShare une entreprise Scribd logo
1  sur  6
Multilingual Online
Translation
Annual report 2010 - 2011
MOLTO’s goal is to develop a set of tools for translating texts between
multiple languages in real time with high quality. MOLTO uses domain-
specific semantic grammars and ontology-based interlinguas implemented
in GF (Grammatical Framework), a grammar formalism where multiple
languages are related by a common abstract syntax. GF has been applied
in several small-to-medium size domains, typically targeting up to ten
languages but MOLTO will scale this up in terms of productivity and
applicability by increasing the size of domains and the number of
languages. MOLTO aims to make the technology accessible for domain
experts without GF expertise and to reduce the effort needed for building a
translator to just extending a lexicon and writing a set of example
sentences.
The most research-intensive parts of MOLTO are the two-way
interoperability between ontology standards (OWL) and GF grammars, and
the extension of rule-based translation by statistical methods. The OWL-GF
interoperability will enable multilingual natural-language-based interaction
with machine-readable knowledge while the statistical methods will add
robustness to the system when desired.
MOLTO technology will be released as open-source libraries for standard
translation tools and web pages and thereby fit into standard workflows.
Non multa, sed multum
Contents
The project MOLTO - Multilingual
Online Translation, started on March 1,
2010 and will run for 36 months.
2
A few results have been already
achieved during the first year of the
project’s lifetime.
3
The MOLTO website, at http://molto-
project.eu, was setup to popularize the
MOLTO technologies and to help create
a MOLTO community of researchers
and commercial partners
5
During Summer 2011, the project will
release the Mathematical Grammar
Library for simple drill problems in
mathematics
6
2 MOLTO Annual Report - 2011
“Vivamus porta
est sed est.”
MOLTO's mission is to develop a set
of tools for translating texts between
multiple languages in real time with
high quality. MOLTO will use
multilingual grammars based on
semantic interlinguas.
Project Objective
The project MOLTO - Multilingual Online Translation,
started on March 1, 2010 and will run for 36 months. It
promises to develop a set of tools for translating texts
between multiple languages in real time with high
quality. MOLTO will use multilingual grammars based
on semantic interlinguas and statistical machine
translation to simplify the production of multilingual
documents without sacrificing the quality. The
interlinguas are based on domain semantics and are
equipped with reversible generation functions: namely
translation is obtained as a composition of parsing the
source language and generating the target language.
An implementation of this technology is provided by
GF [2], Grammatical Framework. GF technologies in
MOLTO are complemented by the use of ontologies,
such as used in the semantic web, and by methods of
statistical machine translation (SMT) for improving
robustness and extracting grammars from data.
Grammatical Framework. GF technologies in MOLTO
are complemented by the use of ontologies, such as
used in the semantic web, and by methods of
statistical machine translation (SMT) for improving
robustness and extracting grammars from data.
MOLTO is committed to dealing with 15 languages,
which includes 12 official languages of the European
Union - Bulgarian, Danish, Dutch, English, Finnish,
French, German, Italian, Polish, Romanian, Spanish,
and Swedish - and 3 other languages - Catalan,
Norwegian, and Russian. In addition, there is on-going
work on at least Arabic, Farsi, Hebrew, Hindi/Urdu,
Icelandic, Japanese, Latvian, Maltese, Portuguese,
Swahili, Tswana, and Turkish.
Tools like Systran (Babelfish) and Google Translate
are designed for consumers of information, but
MOLTO will mainly target the producers of information.
Hence, the quality of the MOLTO translations must be
good enough for, say, an e-commerce site to use in
translating their web pages automatically without the
fear that the message will change. Third-party
translation tools, possibly integrated in the browsers,
let potential customers discover, in their preferred
language, whether, for instance, an e-commerce page
written in French offers something of interest.
Customers understand that these translations are
approximate and will filter out imprecision. If, for
instance, the system has translated a price of 100
Euros to 100 Swedish Crowns (which equals 10
Euros), they will not insist to buy the product for that
price. But if a company had placed such a translation
on its website, then it might be committed to it.
There is a well-known trade-off in machine translation:
one cannot at the same time reach full coverage and
full precision. In this trade-off, Systran and Google
have opted for coverage whereas MOLTO opts for
precision in domains with a well-understood language.
Three such domains will be considered during the
MOLTO project: mathematical exercises, biomedical
patents, and museum object descriptions. The MOLTO
tools however will be applicable to other domains as
well. Examples of such domains could be e-commerce
sites, Wikipedia articles, contracts, business letters,
user manuals, and software localization.
3 MOLTO Annual Report - 2011
The expected final product of
MOLTO is an open-source
software toolkit consisting in:
• grammar development tool,
as an IDE and an API, to
allow use as a plug-in to
web browsers, translation
tools, etc, for easy
construction and
improvement of translation
systems and the integration
of ontologies with grammars
• translator’s tool, as an API
and some interfaces in web
browsers and translation
tools
• grammar libraries for
linguistic resources, and for
the domains of patents,
mathematics, and cultural
heritage
A few results have been already achieved during the first year of the
project’s lifetime. Two applications of the MOLTO translation web
services are online on the project web pages:
1. The travel phrasebook1 translates sentences to 14 different
languages and shows some of the major end-user features
available to MOLTO users: predictive typing and JavaScript-
based GUI. Predictive typing prompts the user with the next
available choices mandated by the underlying grammar and
offers quasi-incremental translations of intermediate results from
words or complete sentences. JavaScript-based GUI using off-
the-shelf functions can be readily deployed on any device where
a browser is available.
2. The MOLTO KRI2, Knowledge Reasoning Infrastructure,
demonstrates the possibility of adding a natural language query
language to retrieve answers from an OWL database. In this
way, a query like Give me information about all organizations
located in Europe is interpreted as the machine understandable
SPARQL statement:
SELECT DISTINCT ?organization ?organization_label
WHERE { ?organization . ?organization
?organizationloc. ?organizationloc
“Europe”
. ?organization ?organization_label
. }
Moreover it is now equipped with an interface in Swedish which
demonstrates how the same knowledge base of facts stored in
English can be queried in a different language.
A pre-release of a web-based grammar editor3 for creating and compiling
GF application grammars in the cloud can be tested online. It is designed
to assist novel authors of GF grammars for instance by prompting the
writer with prefilled templates for each new concrete language. The
resulting application grammars can then be compiled directly online on
Results
MOLTO Phrasedroid
1 http://www.grammaticalframework.org/demos/phrasebook/
2 http://molto.ontotext.com
3 http://grammaticalframework.org/demos/gfse
4 MOLTO Annual Report - 2011
MOLTO generalizes Controlled Languages to
Multilingual Controlled Language Systems
supporting also ambiguities
the server to web applications in javascript. Since all the workflow happens
in the cloud, the authors do not have to install any software on their
machines, with the added advantage of accessing the latest version of the
libraries maintained by the developers.
On the more technical level, MOLTO released GF version 3.2 with simpler
installation, runtime type checker and parser for dependant types, improved
type errors reporting, probabilities in the abstract syntax, and example based
grammar generation. The grammar API is now multilingual. New languages
in the resource grammar library include Urdu, Amharic and complete
morphology for Turkish and Punjabi.
The project also released the first version of the Python plugin for GF that
makes GF primitives available from the Natural Language Toolkit, an open
source collection of Python modules for research and development in natural
language processing and text analytics, with distributions for Windows, Mac
OSX and Linux. Additionally, a Java runtine interprer for PGF grammars is
operational for parsing and linearization and used on the andoid application
of the phrasebook, the Phrasedroid. Finally, the PGF native C library, named
"libpgf", is currently still under development. Basic linearization is already
working, but both the interface and implementation require some further
refinement. The ultimate goal of the native C library is to provide a full-
fledged industrial-strength library for operating with PGF grammars, so that
there are no longer any technical limitations to prevent the adoption of GF
technology. In practice, "industrial-strength" means that the library should be
embeddable, portable, lightweight, efficient, and robust.
The first experiments with statistical engines for translation have been
presented at the MOLTO events as a first step towards the hybridization with
GF. From the GF part, some preliminary results discuss the usage of GF to
produce synthetic phrase alignments; also the methodology to extract high
quality alignments from the domain corpora is being developed. The GF
parser has been also adapted to deal robustly with this general domain
corpora. Finally, the MOLTO workshop "GF meets SMT" (Gothenburg,
November 2010) served the UPC and UGOT teams to brainstorm and
discuss on the main hybridization strategies that will be carried out in the
near future.
The MOLTO project plans to test its approach in three case studies:
mathematical exercises, museum artifacts descriptions and biomedical
patents. The mathematical case study is the most advanced test bed and
covers mathematical expressions, following OpenMath, in 10 different
languages.
5 MOLTO Annual Report - 2011
Dissemination
The MOLTO website was setup to popularize the MOLTO
technologies and to help create a community of researchers and
commercial partners. It makes available all the project’s results and
advertises the meetings and events organized by the partners.
Every presentation delivered at international meeting as well as at
internal workshops is archived on the website. The MOLTO news
updates are posted as RSS feed suitable for aggregation by
interested portals. Informal newsflash items are published using the
MOLTO Twitter feed.
The project has been presented with the travel phrasebook demo
and accompanying poster at a number of international conferences
such as EAMT2010, ACL2010, SLTC2010, and FreeRMBT2010.
MOLTO was also presented at regional meetings such at the
Jornada sobre la Indústria de la Traducció entre Llengües
Romàniques, held in September 2010 in València. Ranta delivered
a tutorial on GF at LREC 2010 and one on a specific GF application
to Multilingual Controlled Languages at CNL 2010.
Press coverage for MOLTO included, among articles in papers, also
an interview at the Bulgarian National Radio a few days after the
start of the project’s lifetime.
The MOLTO project has also become part of the META-Share
alliance of META-NET with the intent to share the linguistic
resources produced during its lifetime.
Three project meetings have been held so far, in Barcelona in
March 2010, in Varna in September 2010 and the yearly meeting in
Gothenburg in March 2011. The meetings always include an Open
Day with presentations aimed at the general audience.
Tools like Systran (Babelfish)
and Google Translate are
designed for consumers of
information, but MOLTO will
mainly target the producers of
information
MOLTO’s goal is to develop tools for web content providers to translate texts between multiple languages in real
time with high quality. Languages are separate modules in the tool and can be varied; prototypes covering a
majority of the EU’s 23 official languages will be built.
GF - Grammatical Framework
The core of a MOLTO translation system is a multilingual GF grammar
where meaning-preserving translations are obtained as composition of
parsing and generation via the abstract syntax, an interlingua. GF is a
framework for interlinguas in which the basic linguistic details of
languages, inflectional morphology and syntactic combination
functions, are provided via the Resource Grammar Library.
MOLTO will further improve grammar engineering in GF by:
! Integrated Development Environment to use the RGL and to
manage large projects;
! Example-based grammar writing support to bootstrap a grammar
from a set of example translations.
Statistical Machine Translation
MOLTO will develop and evaluate combination approaches to
integrate grammar-based and SMT models in a hybrid, robust MT
system. At least four variants will be studied:
! baseline: cascade of independent MT systems;
! hard integration: GF partial output is fixed in a regular SMT
decoding;
! soft integration I: GF partial output, as phrase pairs, is integrated
as a discriminative probability feature model in a phrase-based
SMT system;
! soft integration II: GF partial output, as tree fragment pairs, is
integrated as a discriminative probability model in a syntax-based
SMT system.
OWL Ontologies
MOLTO sees ontologies as a way to formalize interlinguas in specific
domains. Based on this observation, it will carry out research to
develop two-way grammar-ontology interoperability that will bridge
natural language and formal knowledge. The resulting MOLTO
infrastructure will allow knowledge modeling, semantic indexing and
retrieval using natural language. The engine will perform semi-
automatic creation of abstract grammars from ontologies; derive
ontologies from grammars, and retrieve instance level knowledge from/
in natural language by first transforming queries to semantic queries,
and secondly by expressing the resulting knowledge in natural
language.
molto-project.eu
twitter.com/moltoproject
M LTOOnon multa, sed multum
Multilingual On-line Translation
Tools and Resources
delivered by MOLTO
will be distributed also
through
FP7- 247914 at a glance
ICT-2009.2.2 - Language based interaction
Duration: 36 months
Start date: 1 March 2010
End date: 28 February 2013
EU funding: 2.3 Millions !
HowFar : Place -> Question;
Zoo : PlaceKind;
HowFar(Zoo)
Zoo = mkPlace (mkN "zoo" masculine) dative;
HowFar place = mkQS(mkQCl what_distance_IAdv place.name);
À quelle distance est le zoo?
Zoo = mkPlace (mkN "djurpark" "djurparker") "i";
HowFar place = mkQS(mkQCl far_IAdv(mkCl(mkVP place.to)));
Hur långt är det till djurparken?
parse linearize
http://molto-project.eu
Forthcoming
During Summer 2011, the project will release the Mathematical Grammar
Library for simple drill problems in mathematics. The GF Grammar IDE is
planned to be ready by Autumn 2011 as well as the Translation tools API and a
prototype of Grammar-Ontology Interoperability. Another prototype planned for
September 2011 is the first result on Patent Machine Translation with
Information Retrieval case study.
In terms of events organized by MOLTO, the second “GF Summer School:
Frontiers of Multilingual Technology” will be held in Barcelona in August 15-26,
2011. A tutorial on GF is planned for CADE 2011. The MOLTO partners have
also agreed to organize in June 2012 the next conference on FreeRBMT in
Gothenburg.
The third MOLTO project meeting will take place at the beginning of September
2011 in Helsinki.
Stay tuned by subscribing to the MOLTO RSS feed or follow us on Twitter.
Ontotext AD
Borislav Popov,
<borislav.popov@ontotext.com>
Goeteborgs universitet
Aarne Ranta,
<aarne@chelmers.se>
Helsingin Yliopisto
Lauri Carlson,
<lauri.carlson@helsinki.fi>
Universitat Politècnica de Catalunya
Jordi Saludes,
<jordi.saludes@upc.edu>
Contact Points

Contenu connexe

En vedette

Situación de Aprendizaje basada en la Didáctica Crítica
Situación de Aprendizaje basada en la Didáctica CríticaSituación de Aprendizaje basada en la Didáctica Crítica
Situación de Aprendizaje basada en la Didáctica CríticaADORACION RODRIGUEZ
 
Leveraging Social Media for Your Law Practice
Leveraging Social Media for Your Law PracticeLeveraging Social Media for Your Law Practice
Leveraging Social Media for Your Law PracticeLouellen Coker
 
不一樣的台灣
不一樣的台灣不一樣的台灣
不一樣的台灣honan4108
 
삼색신호등 공청회 발표자료
삼색신호등 공청회 발표자료삼색신호등 공청회 발표자료
삼색신호등 공청회 발표자료kimjint
 
Show, Tell, Grow, Sell w/ Social Marketing (PeSA Australia Keynote)
Show, Tell, Grow, Sell w/ Social Marketing (PeSA Australia Keynote)Show, Tell, Grow, Sell w/ Social Marketing (PeSA Australia Keynote)
Show, Tell, Grow, Sell w/ Social Marketing (PeSA Australia Keynote)John Lawson
 
קייס סטאדי מהירות אתר
קייס סטאדי   מהירות אתרקייס סטאדי   מהירות אתר
קייס סטאדי מהירות אתרAssaf Cohen
 
单反相机
单反相机单反相机
单反相机Jacky Wu
 
Happily Ever After: Pain-Free Prioritization
Happily Ever After: Pain-Free PrioritizationHappily Ever After: Pain-Free Prioritization
Happily Ever After: Pain-Free PrioritizationWebVisions
 
HAPPYWEEK 184 - 2016.09.05.
HAPPYWEEK 184 - 2016.09.05.HAPPYWEEK 184 - 2016.09.05.
HAPPYWEEK 184 - 2016.09.05.Jiří Černák
 
A decentralized future – the technology of next century
A decentralized future – the technology of next centuryA decentralized future – the technology of next century
A decentralized future – the technology of next centuryBlockchain Conferences
 
New Trends in Digital Health
New Trends in Digital HealthNew Trends in Digital Health
New Trends in Digital HealthStephanie Bova
 
Startup Sorocaba: O que é o GBG Sorocaba?
Startup Sorocaba: O que é o GBG Sorocaba?Startup Sorocaba: O que é o GBG Sorocaba?
Startup Sorocaba: O que é o GBG Sorocaba?Startup Sorocaba
 
Search Engine Optimization - Even the little guys can win...
Search Engine Optimization - Even the little guys can win...Search Engine Optimization - Even the little guys can win...
Search Engine Optimization - Even the little guys can win...Greg Gifford
 

En vedette (18)

Situación de Aprendizaje basada en la Didáctica Crítica
Situación de Aprendizaje basada en la Didáctica CríticaSituación de Aprendizaje basada en la Didáctica Crítica
Situación de Aprendizaje basada en la Didáctica Crítica
 
Leveraging Social Media for Your Law Practice
Leveraging Social Media for Your Law PracticeLeveraging Social Media for Your Law Practice
Leveraging Social Media for Your Law Practice
 
Daily Newsletter: 19th May, 2011
Daily Newsletter: 19th May, 2011Daily Newsletter: 19th May, 2011
Daily Newsletter: 19th May, 2011
 
不一樣的台灣
不一樣的台灣不一樣的台灣
不一樣的台灣
 
삼색신호등 공청회 발표자료
삼색신호등 공청회 발표자료삼색신호등 공청회 발표자료
삼색신호등 공청회 발표자료
 
Blog Basics
Blog BasicsBlog Basics
Blog Basics
 
Show, Tell, Grow, Sell w/ Social Marketing (PeSA Australia Keynote)
Show, Tell, Grow, Sell w/ Social Marketing (PeSA Australia Keynote)Show, Tell, Grow, Sell w/ Social Marketing (PeSA Australia Keynote)
Show, Tell, Grow, Sell w/ Social Marketing (PeSA Australia Keynote)
 
קייס סטאדי מהירות אתר
קייס סטאדי   מהירות אתרקייס סטאדי   מהירות אתר
קייס סטאדי מהירות אתר
 
单反相机
单反相机单反相机
单反相机
 
Expeditie mont blanc
Expeditie mont blancExpeditie mont blanc
Expeditie mont blanc
 
Happily Ever After: Pain-Free Prioritization
Happily Ever After: Pain-Free PrioritizationHappily Ever After: Pain-Free Prioritization
Happily Ever After: Pain-Free Prioritization
 
HAPPYWEEK 184 - 2016.09.05.
HAPPYWEEK 184 - 2016.09.05.HAPPYWEEK 184 - 2016.09.05.
HAPPYWEEK 184 - 2016.09.05.
 
A decentralized future – the technology of next century
A decentralized future – the technology of next centuryA decentralized future – the technology of next century
A decentralized future – the technology of next century
 
New Trends in Digital Health
New Trends in Digital HealthNew Trends in Digital Health
New Trends in Digital Health
 
DECENTRALIZED REVOLUTION. AMIN RAFIEE
DECENTRALIZED REVOLUTION. AMIN RAFIEEDECENTRALIZED REVOLUTION. AMIN RAFIEE
DECENTRALIZED REVOLUTION. AMIN RAFIEE
 
Startup Sorocaba: O que é o GBG Sorocaba?
Startup Sorocaba: O que é o GBG Sorocaba?Startup Sorocaba: O que é o GBG Sorocaba?
Startup Sorocaba: O que é o GBG Sorocaba?
 
How To Be Happy During Pregnancy?
How To Be Happy During Pregnancy?How To Be Happy During Pregnancy?
How To Be Happy During Pregnancy?
 
Search Engine Optimization - Even the little guys can win...
Search Engine Optimization - Even the little guys can win...Search Engine Optimization - Even the little guys can win...
Search Engine Optimization - Even the little guys can win...
 

Similaire à MOLTO Annual Report 2011

MOLTO poster for META Forum, Brussels 2010, Belgium.
MOLTO poster for META Forum, Brussels 2010, Belgium.MOLTO poster for META Forum, Brussels 2010, Belgium.
MOLTO poster for META Forum, Brussels 2010, Belgium.Olga Caprotti
 
Language translator
Language translatorLanguage translator
Language translatorSumitSumit26
 
Java-centered Translator-based Multi-paradigm Software Development Environment
Java-centered Translator-based Multi-paradigm Software Development EnvironmentJava-centered Translator-based Multi-paradigm Software Development Environment
Java-centered Translator-based Multi-paradigm Software Development EnvironmentWaqas Tariq
 
Is Python a Programming language or Scripting Language_.pdf
Is Python a Programming language or Scripting Language_.pdfIs Python a Programming language or Scripting Language_.pdf
Is Python a Programming language or Scripting Language_.pdfKajal Digital
 
Is Python a Programming language or Scripting Language.pdf
Is Python a Programming language or Scripting Language.pdfIs Python a Programming language or Scripting Language.pdf
Is Python a Programming language or Scripting Language.pdfKajal Digital
 
languagetranslator-211028085026.pptx
languagetranslator-211028085026.pptxlanguagetranslator-211028085026.pptx
languagetranslator-211028085026.pptxMDASIFALI32
 
A FRAMEWORK FOR BUILDING A MULTILINGUAL INDUSTRIAL ONTOLOGY: METHODOLOGY AND ...
A FRAMEWORK FOR BUILDING A MULTILINGUAL INDUSTRIAL ONTOLOGY: METHODOLOGY AND ...A FRAMEWORK FOR BUILDING A MULTILINGUAL INDUSTRIAL ONTOLOGY: METHODOLOGY AND ...
A FRAMEWORK FOR BUILDING A MULTILINGUAL INDUSTRIAL ONTOLOGY: METHODOLOGY AND ...IJwest
 
Combining the functional and oject oriented paradigms in the fobs-x scripting...
Combining the functional and oject oriented paradigms in the fobs-x scripting...Combining the functional and oject oriented paradigms in the fobs-x scripting...
Combining the functional and oject oriented paradigms in the fobs-x scripting...ijpla
 
A FRAMEWORK FOR BUILDING A MULTILINGUAL INDUSTRIAL ONTOLOGY: METHODOLOGY AND...
A FRAMEWORK FOR BUILDING A MULTILINGUAL  INDUSTRIAL ONTOLOGY: METHODOLOGY AND...A FRAMEWORK FOR BUILDING A MULTILINGUAL  INDUSTRIAL ONTOLOGY: METHODOLOGY AND...
A FRAMEWORK FOR BUILDING A MULTILINGUAL INDUSTRIAL ONTOLOGY: METHODOLOGY AND...dannyijwest
 
Lexigraf - a multilingual lexicography DTP engine
Lexigraf - a multilingual lexicography DTP engineLexigraf - a multilingual lexicography DTP engine
Lexigraf - a multilingual lexicography DTP engineYiannis Hatzopoulos
 
Computational linguistics
Computational linguisticsComputational linguistics
Computational linguisticsSpringer
 
Is Flutter Future-ready for E-Learning Applications.pdf
Is Flutter Future-ready for E-Learning Applications.pdfIs Flutter Future-ready for E-Learning Applications.pdf
Is Flutter Future-ready for E-Learning Applications.pdfQSS Technosoft Inc.
 
CSCorganization of programming languages
CSCorganization of programming languagesCSCorganization of programming languages
CSCorganization of programming languagesOluwafolakeOjo
 

Similaire à MOLTO Annual Report 2011 (20)

MOLTO poster for META Forum, Brussels 2010, Belgium.
MOLTO poster for META Forum, Brussels 2010, Belgium.MOLTO poster for META Forum, Brussels 2010, Belgium.
MOLTO poster for META Forum, Brussels 2010, Belgium.
 
Intermediate Languages
Intermediate LanguagesIntermediate Languages
Intermediate Languages
 
Language translator
Language translatorLanguage translator
Language translator
 
Java-centered Translator-based Multi-paradigm Software Development Environment
Java-centered Translator-based Multi-paradigm Software Development EnvironmentJava-centered Translator-based Multi-paradigm Software Development Environment
Java-centered Translator-based Multi-paradigm Software Development Environment
 
Is Python a Programming language or Scripting Language_.pdf
Is Python a Programming language or Scripting Language_.pdfIs Python a Programming language or Scripting Language_.pdf
Is Python a Programming language or Scripting Language_.pdf
 
Is Python a Programming language or Scripting Language.pdf
Is Python a Programming language or Scripting Language.pdfIs Python a Programming language or Scripting Language.pdf
Is Python a Programming language or Scripting Language.pdf
 
languagetranslator-211028085026.pptx
languagetranslator-211028085026.pptxlanguagetranslator-211028085026.pptx
languagetranslator-211028085026.pptx
 
Go Vs Python.pdf
Go Vs Python.pdfGo Vs Python.pdf
Go Vs Python.pdf
 
A FRAMEWORK FOR BUILDING A MULTILINGUAL INDUSTRIAL ONTOLOGY: METHODOLOGY AND ...
A FRAMEWORK FOR BUILDING A MULTILINGUAL INDUSTRIAL ONTOLOGY: METHODOLOGY AND ...A FRAMEWORK FOR BUILDING A MULTILINGUAL INDUSTRIAL ONTOLOGY: METHODOLOGY AND ...
A FRAMEWORK FOR BUILDING A MULTILINGUAL INDUSTRIAL ONTOLOGY: METHODOLOGY AND ...
 
Combining the functional and oject oriented paradigms in the fobs-x scripting...
Combining the functional and oject oriented paradigms in the fobs-x scripting...Combining the functional and oject oriented paradigms in the fobs-x scripting...
Combining the functional and oject oriented paradigms in the fobs-x scripting...
 
A FRAMEWORK FOR BUILDING A MULTILINGUAL INDUSTRIAL ONTOLOGY: METHODOLOGY AND...
A FRAMEWORK FOR BUILDING A MULTILINGUAL  INDUSTRIAL ONTOLOGY: METHODOLOGY AND...A FRAMEWORK FOR BUILDING A MULTILINGUAL  INDUSTRIAL ONTOLOGY: METHODOLOGY AND...
A FRAMEWORK FOR BUILDING A MULTILINGUAL INDUSTRIAL ONTOLOGY: METHODOLOGY AND...
 
Lexigraf - a multilingual lexicography DTP engine
Lexigraf - a multilingual lexicography DTP engineLexigraf - a multilingual lexicography DTP engine
Lexigraf - a multilingual lexicography DTP engine
 
Lit mtap
Lit mtapLit mtap
Lit mtap
 
An Application for Performing Real Time Speech Translation in Mobile Environment
An Application for Performing Real Time Speech Translation in Mobile EnvironmentAn Application for Performing Real Time Speech Translation in Mobile Environment
An Application for Performing Real Time Speech Translation in Mobile Environment
 
[EN] "Multilingual Information and Retrieval Systems Technology and Applicati...
[EN] "Multilingual Information and Retrieval Systems Technology and Applicati...[EN] "Multilingual Information and Retrieval Systems Technology and Applicati...
[EN] "Multilingual Information and Retrieval Systems Technology and Applicati...
 
[EN] Multilingual Information and Retrieval Systems, Technology and Applicati...
[EN] Multilingual Information and Retrieval Systems, Technology and Applicati...[EN] Multilingual Information and Retrieval Systems, Technology and Applicati...
[EN] Multilingual Information and Retrieval Systems, Technology and Applicati...
 
D1803041822
D1803041822D1803041822
D1803041822
 
Computational linguistics
Computational linguisticsComputational linguistics
Computational linguistics
 
Is Flutter Future-ready for E-Learning Applications.pdf
Is Flutter Future-ready for E-Learning Applications.pdfIs Flutter Future-ready for E-Learning Applications.pdf
Is Flutter Future-ready for E-Learning Applications.pdf
 
CSCorganization of programming languages
CSCorganization of programming languagesCSCorganization of programming languages
CSCorganization of programming languages
 

Plus de Olga Caprotti

Charting the learning tracks
Charting the learning tracksCharting the learning tracks
Charting the learning tracksOlga Caprotti
 
Treasure hunts and educational games
Treasure hunts and educational gamesTreasure hunts and educational games
Treasure hunts and educational gamesOlga Caprotti
 
Trusted data and services from the GDML
Trusted data and services from the GDMLTrusted data and services from the GDML
Trusted data and services from the GDMLOlga Caprotti
 
Trusted data and services from the GDML
Trusted data and services from the GDMLTrusted data and services from the GDML
Trusted data and services from the GDMLOlga Caprotti
 
Sunburst diagrams for Calculus II Logpaths
Sunburst diagrams for Calculus II LogpathsSunburst diagrams for Calculus II Logpaths
Sunburst diagrams for Calculus II LogpathsOlga Caprotti
 
Big Data in Education
Big Data in EducationBig Data in Education
Big Data in EducationOlga Caprotti
 
CEIC presentation of the IMU at CoData 2012
CEIC presentation of the IMU at CoData 2012CEIC presentation of the IMU at CoData 2012
CEIC presentation of the IMU at CoData 2012Olga Caprotti
 
MOLTO poster for ACL 2010, Uppsala Sweden
MOLTO poster for ACL 2010, Uppsala SwedenMOLTO poster for ACL 2010, Uppsala Sweden
MOLTO poster for ACL 2010, Uppsala SwedenOlga Caprotti
 
Configuring VLEs For Mathematics
Configuring VLEs For MathematicsConfiguring VLEs For Mathematics
Configuring VLEs For MathematicsOlga Caprotti
 
How to make mathematical eContent travel well
How to make mathematical eContent travel wellHow to make mathematical eContent travel well
How to make mathematical eContent travel wellOlga Caprotti
 
JEM: a network for mathematics educators
JEM: a network for mathematics educatorsJEM: a network for mathematics educators
JEM: a network for mathematics educatorsOlga Caprotti
 
JEM Repository for Learning Objects
JEM Repository for Learning ObjectsJEM Repository for Learning Objects
JEM Repository for Learning ObjectsOlga Caprotti
 
Mathematics Education in Second Life
Mathematics Education in Second LifeMathematics Education in Second Life
Mathematics Education in Second LifeOlga Caprotti
 
Joining Educational Mathematics
Joining Educational MathematicsJoining Educational Mathematics
Joining Educational MathematicsOlga Caprotti
 
Advanced Language Technologies for Mathematical Markup
Advanced Language Technologies for Mathematical MarkupAdvanced Language Technologies for Mathematical Markup
Advanced Language Technologies for Mathematical MarkupOlga Caprotti
 
Multilingual Mathematics in WebALT
Multilingual Mathematics in WebALTMultilingual Mathematics in WebALT
Multilingual Mathematics in WebALTOlga Caprotti
 
Free Knowledge for free minds scenario
Free Knowledge for free minds scenarioFree Knowledge for free minds scenario
Free Knowledge for free minds scenarioOlga Caprotti
 

Plus de Olga Caprotti (18)

Charting the learning tracks
Charting the learning tracksCharting the learning tracks
Charting the learning tracks
 
Treasure hunts and educational games
Treasure hunts and educational gamesTreasure hunts and educational games
Treasure hunts and educational games
 
Trusted data and services from the GDML
Trusted data and services from the GDMLTrusted data and services from the GDML
Trusted data and services from the GDML
 
Trusted data and services from the GDML
Trusted data and services from the GDMLTrusted data and services from the GDML
Trusted data and services from the GDML
 
Sunburst diagrams for Calculus II Logpaths
Sunburst diagrams for Calculus II LogpathsSunburst diagrams for Calculus II Logpaths
Sunburst diagrams for Calculus II Logpaths
 
Big Data in Education
Big Data in EducationBig Data in Education
Big Data in Education
 
CEIC presentation of the IMU at CoData 2012
CEIC presentation of the IMU at CoData 2012CEIC presentation of the IMU at CoData 2012
CEIC presentation of the IMU at CoData 2012
 
MOLTO poster for ACL 2010, Uppsala Sweden
MOLTO poster for ACL 2010, Uppsala SwedenMOLTO poster for ACL 2010, Uppsala Sweden
MOLTO poster for ACL 2010, Uppsala Sweden
 
Configuring VLEs For Mathematics
Configuring VLEs For MathematicsConfiguring VLEs For Mathematics
Configuring VLEs For Mathematics
 
Three years of JEM
Three years of JEMThree years of JEM
Three years of JEM
 
How to make mathematical eContent travel well
How to make mathematical eContent travel wellHow to make mathematical eContent travel well
How to make mathematical eContent travel well
 
JEM: a network for mathematics educators
JEM: a network for mathematics educatorsJEM: a network for mathematics educators
JEM: a network for mathematics educators
 
JEM Repository for Learning Objects
JEM Repository for Learning ObjectsJEM Repository for Learning Objects
JEM Repository for Learning Objects
 
Mathematics Education in Second Life
Mathematics Education in Second LifeMathematics Education in Second Life
Mathematics Education in Second Life
 
Joining Educational Mathematics
Joining Educational MathematicsJoining Educational Mathematics
Joining Educational Mathematics
 
Advanced Language Technologies for Mathematical Markup
Advanced Language Technologies for Mathematical MarkupAdvanced Language Technologies for Mathematical Markup
Advanced Language Technologies for Mathematical Markup
 
Multilingual Mathematics in WebALT
Multilingual Mathematics in WebALTMultilingual Mathematics in WebALT
Multilingual Mathematics in WebALT
 
Free Knowledge for free minds scenario
Free Knowledge for free minds scenarioFree Knowledge for free minds scenario
Free Knowledge for free minds scenario
 

Dernier

Fact vs. Fiction: Autodetecting Hallucinations in LLMs
Fact vs. Fiction: Autodetecting Hallucinations in LLMsFact vs. Fiction: Autodetecting Hallucinations in LLMs
Fact vs. Fiction: Autodetecting Hallucinations in LLMsZilliz
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dashnarutouzumaki53779
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 

Dernier (20)

Fact vs. Fiction: Autodetecting Hallucinations in LLMs
Fact vs. Fiction: Autodetecting Hallucinations in LLMsFact vs. Fiction: Autodetecting Hallucinations in LLMs
Fact vs. Fiction: Autodetecting Hallucinations in LLMs
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dash
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 

MOLTO Annual Report 2011

  • 1. Multilingual Online Translation Annual report 2010 - 2011 MOLTO’s goal is to develop a set of tools for translating texts between multiple languages in real time with high quality. MOLTO uses domain- specific semantic grammars and ontology-based interlinguas implemented in GF (Grammatical Framework), a grammar formalism where multiple languages are related by a common abstract syntax. GF has been applied in several small-to-medium size domains, typically targeting up to ten languages but MOLTO will scale this up in terms of productivity and applicability by increasing the size of domains and the number of languages. MOLTO aims to make the technology accessible for domain experts without GF expertise and to reduce the effort needed for building a translator to just extending a lexicon and writing a set of example sentences. The most research-intensive parts of MOLTO are the two-way interoperability between ontology standards (OWL) and GF grammars, and the extension of rule-based translation by statistical methods. The OWL-GF interoperability will enable multilingual natural-language-based interaction with machine-readable knowledge while the statistical methods will add robustness to the system when desired. MOLTO technology will be released as open-source libraries for standard translation tools and web pages and thereby fit into standard workflows. Non multa, sed multum Contents The project MOLTO - Multilingual Online Translation, started on March 1, 2010 and will run for 36 months. 2 A few results have been already achieved during the first year of the project’s lifetime. 3 The MOLTO website, at http://molto- project.eu, was setup to popularize the MOLTO technologies and to help create a MOLTO community of researchers and commercial partners 5 During Summer 2011, the project will release the Mathematical Grammar Library for simple drill problems in mathematics 6
  • 2. 2 MOLTO Annual Report - 2011 “Vivamus porta est sed est.” MOLTO's mission is to develop a set of tools for translating texts between multiple languages in real time with high quality. MOLTO will use multilingual grammars based on semantic interlinguas. Project Objective The project MOLTO - Multilingual Online Translation, started on March 1, 2010 and will run for 36 months. It promises to develop a set of tools for translating texts between multiple languages in real time with high quality. MOLTO will use multilingual grammars based on semantic interlinguas and statistical machine translation to simplify the production of multilingual documents without sacrificing the quality. The interlinguas are based on domain semantics and are equipped with reversible generation functions: namely translation is obtained as a composition of parsing the source language and generating the target language. An implementation of this technology is provided by GF [2], Grammatical Framework. GF technologies in MOLTO are complemented by the use of ontologies, such as used in the semantic web, and by methods of statistical machine translation (SMT) for improving robustness and extracting grammars from data. Grammatical Framework. GF technologies in MOLTO are complemented by the use of ontologies, such as used in the semantic web, and by methods of statistical machine translation (SMT) for improving robustness and extracting grammars from data. MOLTO is committed to dealing with 15 languages, which includes 12 official languages of the European Union - Bulgarian, Danish, Dutch, English, Finnish, French, German, Italian, Polish, Romanian, Spanish, and Swedish - and 3 other languages - Catalan, Norwegian, and Russian. In addition, there is on-going work on at least Arabic, Farsi, Hebrew, Hindi/Urdu, Icelandic, Japanese, Latvian, Maltese, Portuguese, Swahili, Tswana, and Turkish. Tools like Systran (Babelfish) and Google Translate are designed for consumers of information, but MOLTO will mainly target the producers of information. Hence, the quality of the MOLTO translations must be good enough for, say, an e-commerce site to use in translating their web pages automatically without the fear that the message will change. Third-party translation tools, possibly integrated in the browsers, let potential customers discover, in their preferred language, whether, for instance, an e-commerce page written in French offers something of interest. Customers understand that these translations are approximate and will filter out imprecision. If, for instance, the system has translated a price of 100 Euros to 100 Swedish Crowns (which equals 10 Euros), they will not insist to buy the product for that price. But if a company had placed such a translation on its website, then it might be committed to it. There is a well-known trade-off in machine translation: one cannot at the same time reach full coverage and full precision. In this trade-off, Systran and Google have opted for coverage whereas MOLTO opts for precision in domains with a well-understood language. Three such domains will be considered during the MOLTO project: mathematical exercises, biomedical patents, and museum object descriptions. The MOLTO tools however will be applicable to other domains as well. Examples of such domains could be e-commerce sites, Wikipedia articles, contracts, business letters, user manuals, and software localization.
  • 3. 3 MOLTO Annual Report - 2011 The expected final product of MOLTO is an open-source software toolkit consisting in: • grammar development tool, as an IDE and an API, to allow use as a plug-in to web browsers, translation tools, etc, for easy construction and improvement of translation systems and the integration of ontologies with grammars • translator’s tool, as an API and some interfaces in web browsers and translation tools • grammar libraries for linguistic resources, and for the domains of patents, mathematics, and cultural heritage A few results have been already achieved during the first year of the project’s lifetime. Two applications of the MOLTO translation web services are online on the project web pages: 1. The travel phrasebook1 translates sentences to 14 different languages and shows some of the major end-user features available to MOLTO users: predictive typing and JavaScript- based GUI. Predictive typing prompts the user with the next available choices mandated by the underlying grammar and offers quasi-incremental translations of intermediate results from words or complete sentences. JavaScript-based GUI using off- the-shelf functions can be readily deployed on any device where a browser is available. 2. The MOLTO KRI2, Knowledge Reasoning Infrastructure, demonstrates the possibility of adding a natural language query language to retrieve answers from an OWL database. In this way, a query like Give me information about all organizations located in Europe is interpreted as the machine understandable SPARQL statement: SELECT DISTINCT ?organization ?organization_label WHERE { ?organization . ?organization ?organizationloc. ?organizationloc “Europe” . ?organization ?organization_label . } Moreover it is now equipped with an interface in Swedish which demonstrates how the same knowledge base of facts stored in English can be queried in a different language. A pre-release of a web-based grammar editor3 for creating and compiling GF application grammars in the cloud can be tested online. It is designed to assist novel authors of GF grammars for instance by prompting the writer with prefilled templates for each new concrete language. The resulting application grammars can then be compiled directly online on Results MOLTO Phrasedroid 1 http://www.grammaticalframework.org/demos/phrasebook/ 2 http://molto.ontotext.com 3 http://grammaticalframework.org/demos/gfse
  • 4. 4 MOLTO Annual Report - 2011 MOLTO generalizes Controlled Languages to Multilingual Controlled Language Systems supporting also ambiguities the server to web applications in javascript. Since all the workflow happens in the cloud, the authors do not have to install any software on their machines, with the added advantage of accessing the latest version of the libraries maintained by the developers. On the more technical level, MOLTO released GF version 3.2 with simpler installation, runtime type checker and parser for dependant types, improved type errors reporting, probabilities in the abstract syntax, and example based grammar generation. The grammar API is now multilingual. New languages in the resource grammar library include Urdu, Amharic and complete morphology for Turkish and Punjabi. The project also released the first version of the Python plugin for GF that makes GF primitives available from the Natural Language Toolkit, an open source collection of Python modules for research and development in natural language processing and text analytics, with distributions for Windows, Mac OSX and Linux. Additionally, a Java runtine interprer for PGF grammars is operational for parsing and linearization and used on the andoid application of the phrasebook, the Phrasedroid. Finally, the PGF native C library, named "libpgf", is currently still under development. Basic linearization is already working, but both the interface and implementation require some further refinement. The ultimate goal of the native C library is to provide a full- fledged industrial-strength library for operating with PGF grammars, so that there are no longer any technical limitations to prevent the adoption of GF technology. In practice, "industrial-strength" means that the library should be embeddable, portable, lightweight, efficient, and robust. The first experiments with statistical engines for translation have been presented at the MOLTO events as a first step towards the hybridization with GF. From the GF part, some preliminary results discuss the usage of GF to produce synthetic phrase alignments; also the methodology to extract high quality alignments from the domain corpora is being developed. The GF parser has been also adapted to deal robustly with this general domain corpora. Finally, the MOLTO workshop "GF meets SMT" (Gothenburg, November 2010) served the UPC and UGOT teams to brainstorm and discuss on the main hybridization strategies that will be carried out in the near future. The MOLTO project plans to test its approach in three case studies: mathematical exercises, museum artifacts descriptions and biomedical patents. The mathematical case study is the most advanced test bed and covers mathematical expressions, following OpenMath, in 10 different languages.
  • 5. 5 MOLTO Annual Report - 2011 Dissemination The MOLTO website was setup to popularize the MOLTO technologies and to help create a community of researchers and commercial partners. It makes available all the project’s results and advertises the meetings and events organized by the partners. Every presentation delivered at international meeting as well as at internal workshops is archived on the website. The MOLTO news updates are posted as RSS feed suitable for aggregation by interested portals. Informal newsflash items are published using the MOLTO Twitter feed. The project has been presented with the travel phrasebook demo and accompanying poster at a number of international conferences such as EAMT2010, ACL2010, SLTC2010, and FreeRMBT2010. MOLTO was also presented at regional meetings such at the Jornada sobre la Indústria de la Traducció entre Llengües Romàniques, held in September 2010 in València. Ranta delivered a tutorial on GF at LREC 2010 and one on a specific GF application to Multilingual Controlled Languages at CNL 2010. Press coverage for MOLTO included, among articles in papers, also an interview at the Bulgarian National Radio a few days after the start of the project’s lifetime. The MOLTO project has also become part of the META-Share alliance of META-NET with the intent to share the linguistic resources produced during its lifetime. Three project meetings have been held so far, in Barcelona in March 2010, in Varna in September 2010 and the yearly meeting in Gothenburg in March 2011. The meetings always include an Open Day with presentations aimed at the general audience. Tools like Systran (Babelfish) and Google Translate are designed for consumers of information, but MOLTO will mainly target the producers of information MOLTO’s goal is to develop tools for web content providers to translate texts between multiple languages in real time with high quality. Languages are separate modules in the tool and can be varied; prototypes covering a majority of the EU’s 23 official languages will be built. GF - Grammatical Framework The core of a MOLTO translation system is a multilingual GF grammar where meaning-preserving translations are obtained as composition of parsing and generation via the abstract syntax, an interlingua. GF is a framework for interlinguas in which the basic linguistic details of languages, inflectional morphology and syntactic combination functions, are provided via the Resource Grammar Library. MOLTO will further improve grammar engineering in GF by: ! Integrated Development Environment to use the RGL and to manage large projects; ! Example-based grammar writing support to bootstrap a grammar from a set of example translations. Statistical Machine Translation MOLTO will develop and evaluate combination approaches to integrate grammar-based and SMT models in a hybrid, robust MT system. At least four variants will be studied: ! baseline: cascade of independent MT systems; ! hard integration: GF partial output is fixed in a regular SMT decoding; ! soft integration I: GF partial output, as phrase pairs, is integrated as a discriminative probability feature model in a phrase-based SMT system; ! soft integration II: GF partial output, as tree fragment pairs, is integrated as a discriminative probability model in a syntax-based SMT system. OWL Ontologies MOLTO sees ontologies as a way to formalize interlinguas in specific domains. Based on this observation, it will carry out research to develop two-way grammar-ontology interoperability that will bridge natural language and formal knowledge. The resulting MOLTO infrastructure will allow knowledge modeling, semantic indexing and retrieval using natural language. The engine will perform semi- automatic creation of abstract grammars from ontologies; derive ontologies from grammars, and retrieve instance level knowledge from/ in natural language by first transforming queries to semantic queries, and secondly by expressing the resulting knowledge in natural language. molto-project.eu twitter.com/moltoproject M LTOOnon multa, sed multum Multilingual On-line Translation Tools and Resources delivered by MOLTO will be distributed also through FP7- 247914 at a glance ICT-2009.2.2 - Language based interaction Duration: 36 months Start date: 1 March 2010 End date: 28 February 2013 EU funding: 2.3 Millions ! HowFar : Place -> Question; Zoo : PlaceKind; HowFar(Zoo) Zoo = mkPlace (mkN "zoo" masculine) dative; HowFar place = mkQS(mkQCl what_distance_IAdv place.name); À quelle distance est le zoo? Zoo = mkPlace (mkN "djurpark" "djurparker") "i"; HowFar place = mkQS(mkQCl far_IAdv(mkCl(mkVP place.to))); Hur långt är det till djurparken? parse linearize http://molto-project.eu
  • 6. Forthcoming During Summer 2011, the project will release the Mathematical Grammar Library for simple drill problems in mathematics. The GF Grammar IDE is planned to be ready by Autumn 2011 as well as the Translation tools API and a prototype of Grammar-Ontology Interoperability. Another prototype planned for September 2011 is the first result on Patent Machine Translation with Information Retrieval case study. In terms of events organized by MOLTO, the second “GF Summer School: Frontiers of Multilingual Technology” will be held in Barcelona in August 15-26, 2011. A tutorial on GF is planned for CADE 2011. The MOLTO partners have also agreed to organize in June 2012 the next conference on FreeRBMT in Gothenburg. The third MOLTO project meeting will take place at the beginning of September 2011 in Helsinki. Stay tuned by subscribing to the MOLTO RSS feed or follow us on Twitter. Ontotext AD Borislav Popov, <borislav.popov@ontotext.com> Goeteborgs universitet Aarne Ranta, <aarne@chelmers.se> Helsingin Yliopisto Lauri Carlson, <lauri.carlson@helsinki.fi> Universitat Politècnica de Catalunya Jordi Saludes, <jordi.saludes@upc.edu> Contact Points