Georg Rehm. Multilingual Europe in late 2016 – A Strategic Research and Innovation Agenda for the Multilingual Digital Single Market. Future and Emerging Trends in Language Technologies, Machine Learning and Big Data (FETLT 2016), Seville, Spain, November 2016. November 30, 2016.
1. META-NET has received funding from the EU’s Horizon 2020 research and innovation programme through the contract CRACKER
(grant agreement no.: 645357). Formerly co-funded by FP7 and ICT PSP through the contracts T4ME (grant agreement no.: 249119),
CESAR (grant agreement no.: 271022), METANET4U (grant agreement no.: 270893) and META-NORD (grant agreement no.: 270899).
Multilingual Europe in late 2016
A Strategic Research and Innovation Agenda
for the Multilingual Digital Single Market
Georg Rehm
Coordinator CRACKER, General Secretary META-NET
DFKI, Germany
georg.rehm@dfki.de
FETLT 2016 2nd International Workshop – Seville, Spain, 30th November 2016
2. Outline
• Initiatives for Multilingual Europe
• Towards the Multilingual Digital Single Market
• The MDSM SRIA V0.9
• Multilingual Europe in late 2016: where do we stand?
http://www.meta-net.eu – http://www.cracker-project.eu
3. META-NET
• 60 research centres in 34 countries (founded in 2010)
  Chair of Executive Board: Jan Hajic (CUNI)
  Deputies: J. van Genabith (DFKI), A. Vasiljevs (Tilde)
  General Secretary: Georg Rehm (DFKI)
• Multilingual Europe Technology Alliance: 826 members in 67 countries
• Strategic Research Agenda (published in 2013); white papers (31 volumes; published in 2012)
• Projects: T4ME (META-NET), CESAR, METANET4U, META-NORD
7. Strategic Research Agenda (2013)
• Addresses the problems we identified when preparing the white papers.
• Can put Europe ahead of its competitors in this technology area.
• 200 contributors; >2 years. 54% industry; 46% research; 4% (inter)national institutions.
• Presented and discussed at 90+ conferences and major workshops.
• Published in early 2013.
• http://www.meta-net.eu/sra
8. Priority Research Themes
• Three priority research themes:
  - Translingual Cloud
  - Social Intelligence and e-Participation
  - Socially-Aware Interactive Assistants
• Two additional themes:
  - European Service Platform for Language Technologies
  - Core Technologies for Language Analysis and Production
9. CRACKER: Cracking the Language Barrier
Coordination, Evaluation and Resources for European MT Research
Coordination and Support Action, H2020-ICT17, 2015–2017, 36 months – http://www.cracker-project.eu
No.  Partner     Country          Contact
1    DFKI        Germany          Georg Rehm
2    CUNI        Czech Republic   Jan Hajic
3    ELDA        France           Khalid Choukri
4    FBK         Italy            Marcello Federico
5    ATHENA RC   Greece           Stelios Piperidis
6    UEDIN       UK               Philipp Koehn
7    USFD        UK               Lucia Specia
THREE PRIORITY AREAS FOR ACHIEVING THE MULTILINGUAL DIGITAL SINGLE MARKET
1. Multilingual access to all digital goods and services across Europe
Geo-blocking:
due to nationality, location, or residence
customers
Language-blocking:
languages they do not speak
however, current online translation is insufficient
trying to conduct
common languages
Geo-blocking and language-blocking are barriers to access
Both geo-blocking and language-blocking are
daily problems for tens of millions of EU citizens.
Customers are six times more likely to buy from sites in their native language.
Most EU languages address less than 3% of the market, fundamentally limiting SMEs operating in countries where those
languages are spoken.
Lack of language technology support (automatic translation, tools to assist human translators, and multilingual support in
European businesses.
Language can be expensive for SMEs
Online businesses face around €5,000 in up-front costs for each
new language they translate their websites into, plus similar
and marketing costs.
Even when sites are translated, the vast majority of
SMEs cannot respond to support requests or
customer feedback in other languages. Such
responsiveness is needed to achieve customer
satisfaction and build brand loyalty.
English is not the answer
52% of EU customers do not purchase
Adding even a few languages to an SME’s website beyond English
can have a major impact on revenue. Large organizations today
to increase market share.
[Figure: likelihood of purchasing, site in buyer’s native language vs. site in foreign language: 6x more likely to purchase]
Communities
• META-NET incl. META-SHARE and META
• MT evaluation initiatives – WMT, IWSLT, MT Marathons
• MT and other LT industry
• Language resources – META-SHARE, ELRA
• HT/MT evaluation tools – translate5
• Translation industry, translation profession
• MT user communities
Strategic Agenda for the Multilingual Digital Single Market
• Version 0.5 presented at META-FORUM 2015 (Riga)
• Version 0.9 presented at META-FORUM 2016 (Lisbon)
Strategic Research and Innovation Agenda
Language as a Data Type and
Key Challenge for Big Data
Enabling the Multilingual Digital Single Market
through technologies for translating, analysing, processing
and curating natural language content
SRIA Editorial Team
Version 0.9 – July 2016
10. Selected Activities
[Timeline, 2015–2017 (project months M1, M12, M24, M36): kick-off meeting for all ICT-17 projects; translate5; WMT 2016 and 2017; IWSLT 2015, 2016 and 2017; QT Marathon 2015 and 2016; Roadmap for European MT Research; Survey on the State of HQMT in Industry and LSPs; SRIA (initial version, update, final); version 1, version 2]
• Production of resources (e.g., for WMT
2016 and 2017, IWSLT 2015-2017)
• Tools (quality control, evaluations)
• Strategies and roadmaps (SRIA,
Roadmap for European MT Research)
• Exchange and sharing facility for
resources (META-SHARE)
Recent or Upcoming Events
• LREC Workshop on MT Eval. (May 25)
• META-FORUM 2016 (July 4/5, Lisbon)
• WMT 2016 (Aug. 11/12, Berlin)
• IWSLT 2016 (Dec. 8/9, Seattle)
• Federation of organisations and
projects working on technologies
for multilingual Europe.
• 10 organisations; 24 projects.
• Areas of collaboration: data
management and repositories,
tools, shared tasks, evaluations.
• Goal: provide one umbrella
organisation for the whole
community.
http://www.cracking-the-language-barrier.eu
11. Cracking the Language Barrier federation
http://www.cracker-project.eu • http://www.meta-net.eu
• Riga Summit 2015 and Riga Declaration.
• Federation of European projects and
organisations working on technologies
for a multilingual Europe.
• Multi-lateral Memorandum of Understanding;
10 organisations and 24 projects on board.
• Getting new members on a regular basis.
• Selected areas of collaboration: data
management and repositories, tools,
shared tasks, evaluations, events.
• Goal: provide one umbrella organisation
for the whole community.
12. Digital Single Market
• Top priority in the European Union.
• Expected to add €400 billion to European GDP and hundreds of thousands of new jobs.
• Unfortunately, the language topic is not included in the EC’s Digital Single Market strategy (published in May 2015).
15. A. Ansip’s May 2016 Blog Post
• Posted on 27 May 2016.
• First public acknowledgment by the EC that the language topic is of very high relevance for the Digital Single Market.
• “Overcoming language barriers is vital for building the DSM, which is by definition multilingual. It is now time to reduce and remove the language barriers that are holding back its advance, and turn them into competitive advantages.”
17. MDSM SRIA
• Version 0.5 unveiled at META-FORUM 2015
• Version 0.9 unveiled at META-FORUM 2016
• Version 1.0 foreseen for early 2017
• Prepared and presented by the Cracking the Language Barrier federation (editorial team: 13 colleagues)
• SRIA addresses how the LT community is going to act united in order to make the DSM multilingual
• Aligned to three of the BDVA SRIA V2.0’s technical priorities: Data Management, Data Analysis, Data Processing.
Strategic Agenda for the
Multilingual Digital Single Market
Technologies for Overcoming Language Barriers towards
a truly integrated European Online Market
Version 0.5 – April 22, 2015
18. Strategic Research and Innovation Agenda
Language as a Data Type and
Key Challenge for Big Data
Enabling the Multilingual Digital Single Market
through technologies for translating, analysing, processing
and curating natural language content
SRIA Editorial Team
Version 0.9 – July 2016
http://www.cracker-project.eu
http://www.cracking-the-language-barrier.eu
19. MDSM: Goals and Needs
• Crosslingual communication for SMEs, public institutions, citizens
• Crosslingual SME presales communication and aftersales services
• Multilingual (big) data, language and knowledge value chains
• Multilingual websites, product catalogues, product descriptions
• Multilingual knowledge bases and knowledge graphs (and services)
• Multilingual conversational interfaces for connected devices (IoT)
• Crosslingual business intelligence (e.g., based on UGC)
• Crosslingual social media analytics for EU-wide societal issues
• Multilingual text and report generation (knowledge/data to text)
• All services must be domain-adaptable (no one size fits all)
• Translation Centre (Cloud): HQ automated translation for all
20. MLV Programme
• Multilingual Value Programme*
  - Three-year programme
  - Requires modest investment
• “Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content”
• Three components address the main needs of the Multilingual DSM (MDSM) and how to put them into practice:
  1. Multilingual Application Areas
  2. Multilingual Services
  3. Research
* SRIA V0.9 and MLV Programme devised
before re-organisation of DG CONNECT.
21. Multilingual Digital Single Market
[Diagram: structure of the MLV Programme, with three layers and their labels]
• Layers: Multilingual Applications; Multilingual Services; Research
• Application areas: E-Commerce; Content, Media, Verticals; Translation, Language, Knowledge, Data
• Further components: Automated Translation; Knowledge and Data Repositories; Multimodal Interaction; Conversational Technologies; Language Processing, Analysis and Production – Language Resources
• Research topics: Crosslingual Big Data Language Analytics; Meaning, Semantics, Knowledge; High-Quality Machine Translation
• Target groups: Citizens; Public; Business
• Actors: SMEs; CEF DSIs; IT Integrators; Research
• Funding labels: H2020 RIAs; H2020 CSAs, IAs, RIAs; H2020 CSAs, RAs, national funding
• Annotations: provide innovative applications; fills gaps; interoperable and standardised; collaboration with member states
22. Application Areas (Selection)
• Multilingual E-commerce
  - Customer-facing vs. back-office facing (after-market, after-sales)
  - Crosslingual search, CRM, helpdesks, processes, workflows
  - Semantic, crosslingual product descriptions and catalogues
  - Online dispute resolution
• Multilingual Content, Media, Verticals
  - Content analytics, curation, generation (incl. authoring support)
  - Multimodal communication (conversational, written, IoT)
  - Vertical domains: health, government, mobility, energy, legal.
• Translation, Language, Knowledge, Data
  - Translation Cloud: written/spoken, automatic/human
  - Crosslingual public and social intelligence, business intelligence
  - HQ resources, under-resourced languages, domain-specific LRs
23. Setup – Timeframe – Costs
• Close collaboration with EC, EP and all other stakeholders (including SMEs, research centres, universities, NGOs etc.).
• Mix of funding sources:
  - Horizon 2020 (WP 2018-2020) for EU projects (RA, RIA, CSA)
  - National/regional funding sources for work on monolingual LTs and LRs and also to support and grow SMEs in this area
  - Include, strengthen and broaden role of CEF AT (public services)
• Estimated costs for basic MLV implementation: ca. 175-200M€
  - Includes set of mission-critical services and applications
  - Timeframe: 2018, 2019, 2020
24. Context – Current Developments
• Multilingual Europe: danger of digital language extinction; all languages are equal; multilingual DSM; world-class LT research in Europe.
• Artificial Intelligence: Important breakthroughs and massive investments (USA, Asia) in AI R&D and applications (deep learning, DNNs).
• Need for LT: not only Multilingual DSM but also Translation, Internet of Things, Industrie 4.0, HCI, smart personal assistants etc.
• Need for European LT: US and other non-European technologies are not the solution! Europe must not make its crucial IT infrastructure dependent on non-European solutions (same reason why EU is building GALILEO).
• Digitalisation of our continent: SMEs, enterprises, public administrations are struggling to cope with the digital revolution (see Industrie 4.0, IoT etc.).
• Security and Privacy: Secure systems on European servers are essential for large-scale industry adoption.
• Growing need for Language Technologies made in Europe for Europe.
25. Multilingual Europe through Technology: Current Initiatives and Activities
[Diagram: current initiatives and activities for Multilingual Europe in late 2016, with annotations]
• Multilingual Strategy of the EU: more tech support for multilingualism
• Language Technologies for Europe's digital public services
• Technologies for the Multilingual Digital Single Market
• Language Technologies for Big Data text analytics
• The Human Language Project: long-term R&D&I, post-H2020
• Language Technologies R&D&I (H2020, WP 2018-20)
Annotations:
• Open calls and upcoming service contracts
• Dec. 2016: EC brainstorming meeting on future LT priorities in and post Horizon 2020. Maybe a new document is needed?
• Jan. 2017: STOA workshop and study on LT for Europe
• Dec. 2017: LT Session at BDVA Summit in Valencia
• Q1 2017: MDSM SRIA V1.0
• Policy change and initiative towards a European digital public sphere enabled by MT/LT
• WP 2018-20 (incl. IoT, I4.0, assistants, robots etc.)
• Shared programme between EU and MS
• MLV Programme
• Further labels: DG CONNECT; DGT and DG CONNECT; DG CONNECT; CEF AT; ELRC
26. Thank you for your attention.
georg.rehm@dfki.de
http://www.meta-net.eu
http://www.facebook.com/META.Alliance
http://www.cracker-project.eu
http://www.cracking-the-language-barrier.eu