Johannes Heinecke - 2017 - Multi-Model and Crosslingual Dependency Analysis

•

0 j'aime•103 vues

Association for Computational Linguistics

This document summarizes the CoNLL 2017 UD Shared Task on multi-model and crosslingual dependency parsing. It describes the BistParser modifications used, including training options like hidden layer size and word embeddings. Two crosslingual approaches were tested: training on a mix of 23 languages and training on a typologically close language. The second approach gave better results. The document reports the system's position in the shared task, achieving 10th place overall and 9th on content word labeling, as well as results on surprise languages.

Formation

CoNLL 2017 UD Shared Task
Mul -Model and Crosslingual
Dependency Analysis
Johannes Heinecke, Munshi Asadullah
Orange Labs, Lannion, France
Architecture
BistParser modiﬁca ons: One depencency tree per sentence
Training:
• hidden layer size 40, 50 or 100, depending on language
• other BistParser op ons used: --k 3 --lstmdims 125
--lstmlayers 2 --bibi-lstm --usehead --userl
• word embeddings for all languages (except Gothic)
– all words in lowercase (if applicable)
– punctua on separated from words
– word2vec standard op ons except -size {300,500}
and -window 10
Surprise Languages
Two crosslingual approaches: training (1) on a mix of 23 languages and (2) on a typologically close language
(hsb → cs, sme → fi, kmr → fa, bxr → hi), both without word embeddings: (2) gave much be er results.
Example of modiﬁed CONLL (cols. 1, 2 and 4) used for training (i.e.
cs, shown below le ) and predic on (in this case hsb, below right):
training data (cs)
1 Manažeři NOUN
2 rozhodují VERB
3 ADV ADV
4 ADP ADP
5 místě NOUN
6 PUNCT PUNCT
test data (hsb)
1 Njejsu VERB
2 DET DET
3 archeologiske ADJ
4 dokłady NOUN
5 ADP ADP
…
23 language mix
Upper Sorbian (1001
) 63.2%
Northern Sami (50) 49.2%
Kurmanji (100) 29.2%
Buryat (50) 26.3%
Upper Sorbian (hsb)
cs (100) 69.5%
cs (50) 67.5%
pl (50) 56.9%
pl (100) 51.9%
Northern Sami (sme)
fi (100) 52.9%
fiu2
(100) 51.7%
fiu (50) 49.7%
fi (50) 50.8%
Kurmanji (kmr)
fa (100) 36.7%
fa (50) 35.8%
hi (50) 22.2%
ur (50) 20.6%
Buryat (bxr)
hi (50) 32.0%
ur (50) 28.0%
tr (100) 27.6%
fi (100) 21.8%
ja (50) 18.0%
1
Number indicates hidden layer size. 2
Mix of Fenno-Ugric languages, here fi, et and hu.
Results
• 10th
posi on with LAS 68.61% (improved to 69.75% a er bug ﬁxes)
• 9th
posi on with Content Word LAS (CLAS) as evalua on metrics: 64.15%
• 8th
posi on on surprise languages: 38.72% (7th
posi on with CLAS: 34.28%)
Run me (on Tira VM, Ubuntu Xenial): 3 hours (all treebanks), using <16GB
Contact: johannes.heinecke@orange.com

Contenu connexe

Plus de Association for Computational Linguistics

Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...

Association for Computational Linguistics

Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...

Association for Computational Linguistics

Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop

Association for Computational Linguistics

Chenchen Ding - 2015 - NICT at WAT 2015

Association for Computational Linguistics

John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...

Association for Computational Linguistics

John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...

Association for Computational Linguistics

Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...

Association for Computational Linguistics

Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...

Association for Computational Linguistics

Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015

Association for Computational Linguistics

Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop

Association for Computational Linguistics

Chenchen Ding - 2015 - NICT at WAT 2015

Association for Computational Linguistics

Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...

Association for Computational Linguistics

Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...

Association for Computational Linguistics

Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...

Association for Computational Linguistics

Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...

Association for Computational Linguistics

Toshiaki Nakazawa - 2015 - Overview of the 2nd Workshop on Asian Translation

Association for Computational Linguistics

Hua Shan - 2015 - A Dependency-to-String Model for Chinese-Japanese SMT System

Association for Computational Linguistics

Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...

Association for Computational Linguistics

Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...

Association for Computational Linguistics

Katsuhito Sudoh - 2015 Chinese-to-Japanese Patent Machine Translation based o...

Association for Computational Linguistics

Plus de Association for Computational Linguistics (20)

Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...

Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...

Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop

Chenchen Ding - 2015 - NICT at WAT 2015

John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...

Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...

Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015

Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop

Chenchen Ding - 2015 - NICT at WAT 2015

Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...

Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...

Toshiaki Nakazawa - 2015 - Overview of the 2nd Workshop on Asian Translation

Hua Shan - 2015 - A Dependency-to-String Model for Chinese-Japanese SMT System

Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...

Katsuhito Sudoh - 2015 Chinese-to-Japanese Patent Machine Translation based o...

Dernier

Klinik_ Apotek Onlin 085657271886 Solusi Menggugurkan Masalah Kehamilan Anda Jual Obat Aborsi Asli KLINIK ABORSI TERPEECAYA _ Jual Obat Aborsi Cytotec Misoprostol Asli 100% Ampuh Hanya 3 Jam Langsung Gugur || OBAT PENGGUGUR KANDUNGAN AMPUH MANJUR OBAT ABORSI OLINE" APOTIK Jual Obat Cytotec, Gastrul, Gynecoside Asli Ampuh. JUAL ” Obat Aborsi Tuntas | Obat Aborsi Manjur | Obat Aborsi Ampuh | Obat Penggugur Janin | Obat Pencegah Kehamilan | Obat Pelancar Haid | Obat terlambat Bulan | Ciri Obat Aborsi Asli | Obat Telat Bulan | Pil Aborsi Asli | Cara Menggugurkan Konten | Cara Aborsi Tuntas | Harga Obat Aborsi Asli | Pil Aborsi | Jual Obat Aborsi Cytotec | Cara Aborsi Sendiri | Cara Aborsi Usia 1 Bulan | Cara Aborsi Usia 2 Tahun | Cara Aborsi Usia 3 Bulan | Obat Aborsi Usia 4 Bulan | Cara Abrasi Usia 5 Bulan | Cara Menggugurkan Konten | Kandungan Obat Penggugur | Cara Menghitung Usia Konten | Cara Mengatasi Terlambat Bulan | Penjual Obat Aborsi Asli | Obat Aborsi Garansi | Kandungan Obat Peluntur | Obat Telat Datang Bulan | Obat Telat Haid | Obat Aborsi Paling Murah | Klinik Jual Obat Aborsi | Jual Pil Cytotec | Apotik Jual Obat Aborsi | Kandungan Dokter Abrasi | Cara Aborsi Cepat | Jual Obat Aborsi Bergaransi | Jual Obat Cytotec Asli | Obat Aborsi Aman Manjur | Obat Misoprostol Cytotec Asli. "APA ITU ABORSI" “Aborsi Adalah dengan membendung hormon yang di perlukan untuk mempertahankan kehamilan yaitu hormon progesteron, karena hormon ini dibendung, maka jalur kehamilan mulai membuka dan leher rahim menjadi melunak,sehingga mengeluarkan darah yang merupakan tanda bahwa obat telah bekerja || maksimal 1 jam obat diminum || PENJELASAN OBAT ABORSI USIA 1 _7 BULAN Pada usia kandungan ini, pasien akan merasakan sakit yang sedikit tidak berlebihan || sekitar 1 jam ||. namun hanya akan terjadi pada saatdarah keluar merupakan pertanda menstruasi. Hal ini dikarenakan pada usiakandungan 3 bulan,janin sudah terbentuk sebesar kepalan tangan orang dewasa. Cara kerja obat aborsi : JUAL OBAT ABORSI AMPUH dosis 3 bulan secara umum sama dengan cara kerja || DOSIS OBAT ABORSI 2 bulan”, hanya berbedanya selain mengisolasijanin juga menghancurkan janin dengan formula methotrexate dikandungdidalamnya. Formula methotrexate ini sangat ampuh untuk menghancurkan janinmenjadi serpihan-serpihan kecil akan sangat berguna pada saat dikeluarkan nanti. APA ALASAN WANITA MELAKUKAN ABORSI? Aborsi di lakukan wanita hamil baik yang sudah menikah maupun belum menikah dengan berbagai alasan , akan tetapi alasan yang utama adalah alasan-alasan non medis (termasuk aborsi sendiri / di sengaja/ buatan] MELAYANI PEMESANAN OBAT ABORSI SETIAP HARI, SIAP KIRIM KESELURUH KOTA BESAR DI INDONESIA DAN LUAR NEGERI. HUBUNGI PEMESANAN LEBIH NYAMAN VIA WA/: 085657271886

Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...

ZurliaSoop

Wizards are very useful for creating a good user experience. In all businesses, interactive sessions are most beneficial. To improve the user experience, wizards in Odoo provide an interactive session. For creating wizards, we can use transient models or abstract models. This gives features of a model class except the data storing. Transient and abstract models have permanent database persistence. For them, database tables are made, and the records in such tables are kept until they are specifically erased.

How to Create and Manage Wizard in Odoo 17

Celine George

On National Teacher Day, meet the 2024-25 Kenan Fellows

Mebane Rash

This PowerPoint helps students to consider the concept of infinity.

christianmathematics

Micro-Scholarship, What it is, How can it help me.pdf

Poh-Sun Goh

COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx

annathomasp01

Jamworks pilot and AI at Jisc (20/03/2024)

Jisc

FSB Advising Checklist - Orientation 2024

Elizabeth Walsh

HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx

Esquimalt MFRC

Mehran University Newsletter Vol-X, Issue-I, 2024

Mehran University of Engineering & Technology, Jamshoro

ICT Role in 21st Century Education & its Challenges.pptx

AreebaZafar22

Food safety_Challenges food safety laboratories_.pdf

Sherif Taha

ICT role in 21st century education and it's challenges.

MaryamAhmad92

Accessible Digital Futures project (20/03/2024)

Jisc

SOC 101 Demonstration of Learning Presentation

camerronhm

Wellbeing inclusion and digital dystopias.pptx

Jisc

Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...

Pooja Bhuva

General Principles of Intellectual Property: Concepts of Intellectual Proper...

Poonam Aher Patil

Salient Features of India constitution especially power and functions

KarakKing

Basic Civil Engineering notes first year Notes Building notes Selection of site for Building Layout of a Building What is Burjis, Mutam Building Bye laws Basic Concept of sunlight ventilation in building National Building Code of India Set back or building line Types of Buildings Floor Space Index (F.S.I) Institutional Vs Educational Building Components & function Sills, Lintels, Cantilever Doors, Windows and Ventilators Types of Foundation AND THEIR USES Plinth Area Shallow and Deep Foundation Super Built-up & carpet area Floor Area Ratio (F.A.R) RCC Reinforced Cement Concrete RCC VS PCC

Basic Civil Engineering first year Notes- Chapter 4 Building.pptx

Denish Jangid

Dernier (20)

Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...

How to Create and Manage Wizard in Odoo 17

On National Teacher Day, meet the 2024-25 Kenan Fellows

This PowerPoint helps students to consider the concept of infinity.

Micro-Scholarship, What it is, How can it help me.pdf

COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx

Jamworks pilot and AI at Jisc (20/03/2024)

FSB Advising Checklist - Orientation 2024

HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx

Mehran University Newsletter Vol-X, Issue-I, 2024

ICT Role in 21st Century Education & its Challenges.pptx

Food safety_Challenges food safety laboratories_.pdf

ICT role in 21st century education and it's challenges.

Accessible Digital Futures project (20/03/2024)

SOC 101 Demonstration of Learning Presentation

Wellbeing inclusion and digital dystopias.pptx

Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...

General Principles of Intellectual Property: Concepts of Intellectual Proper...

Salient Features of India constitution especially power and functions

Basic Civil Engineering first year Notes- Chapter 4 Building.pptx

Johannes Heinecke - 2017 - Multi-Model and Crosslingual Dependency Analysis

1. CoNLL 2017 UD Shared Task Mul -Model and Crosslingual Dependency Analysis Johannes Heinecke, Munshi Asadullah Orange Labs, Lannion, France Architecture BistParser modifica ons: One depencency tree per sentence Training: • hidden layer size 40, 50 or 100, depending on language • other BistParser op ons used: --k 3 --lstmdims 125 --lstmlayers 2 --bibi-lstm --usehead --userl • word embeddings for all languages (except Gothic) – all words in lowercase (if applicable) – punctua on separated from words – word2vec standard op ons except -size {300,500} and -window 10 Surprise Languages Two crosslingual approaches: training (1) on a mix of 23 languages and (2) on a typologically close language (hsb → cs, sme → fi, kmr → fa, bxr → hi), both without word embeddings: (2) gave much be er results. Example of modified CONLL (cols. 1, 2 and 4) used for training (i.e. cs, shown below le ) and predic on (in this case hsb, below right): training data (cs) 1 Manažeři NOUN 2 rozhodují VERB 3 ADV ADV 4 ADP ADP 5 místě NOUN 6 PUNCT PUNCT test data (hsb) 1 Njejsu VERB 2 DET DET 3 archeologiske ADJ 4 dokłady NOUN 5 ADP ADP … 23 language mix Upper Sorbian (1001 ) 63.2% Northern Sami (50) 49.2% Kurmanji (100) 29.2% Buryat (50) 26.3% Upper Sorbian (hsb) cs (100) 69.5% cs (50) 67.5% pl (50) 56.9% pl (100) 51.9% Northern Sami (sme) fi (100) 52.9% fiu2 (100) 51.7% fiu (50) 49.7% fi (50) 50.8% Kurmanji (kmr) fa (100) 36.7% fa (50) 35.8% hi (50) 22.2% ur (50) 20.6% Buryat (bxr) hi (50) 32.0% ur (50) 28.0% tr (100) 27.6% fi (100) 21.8% ja (50) 18.0% 1 Number indicates hidden layer size. 2 Mix of Fenno-Ugric languages, here fi, et and hu. Results • 10th posi on with LAS 68.61% (improved to 69.75% a er bug fixes) • 9th posi on with Content Word LAS (CLAS) as evalua on metrics: 64.15% • 8th posi on on surprise languages: 38.72% (7th posi on with CLAS: 34.28%) Run me (on Tira VM, Ubuntu Xenial): 3 hours (all treebanks), using <16GB Contact: johannes.heinecke@orange.com

Johannes Heinecke - 2017 - Multi-Model and Crosslingual Dependency Analysis

Recommandé

Recommandé

Contenu connexe

Plus de Association for Computational Linguistics

Plus de Association for Computational Linguistics (20)

Dernier

Dernier (20)

Johannes Heinecke - 2017 - Multi-Model and Crosslingual Dependency Analysis