The UniMorph Project and Morphological Reinflection Task: Past, Present, and Future
1. UniMorph and Morphological Inflection Task: Past, Present, and Future
Ekaterina Vylomova@
@
University of Melbourne
ekaterina.vylomova@unimelb.edu.au
20 August 2021
Ekaterina Vylomova UniMorph and Morphological Inflection Task 20 August 2021 1 / 115
2. PART I: The UniMorph Project
4. Speech is Special
Charles F. Hockett on Essential Properties of Human Languages
Displacement
Ability to refer to things in space and time and communicate about things that are not present
Productivity
Ability to create new and unique meanings of utterances from previously existing utterances and
sounds
5. Speech is Special
Charles F. Hockett on Essential Properties of Human Languages
Duality of Patterning
Meaningless phonic segments (phonemes) are combined to make meaningful words, etc.
Learnability
A speaker of a language can learn another language
6. Linguistic Diversity
Roman Jakobson on Differences between Languages
“Languages differ essentially in what they must convey and not in what they may convey”
7. Languages differ in many ways!
(1) Chinese (Isolating)
wǒmen xué le zhè xiē shēngcí.
I.PL.AN learn PAST this PL new.word
“We learned these new words.”
(2) Russian (Synthetic)
My vyučili eti novyje slova.
We.NOM learn.PAST.PL this.ACC.PL new.ACC.PL word.ACC.PL
“We learned these new words.”
8. Languages differ in many ways!
An example of West Greenlandic taken from Fortescue (2017):
(3) West Greenlandic (Polysynthetic)
Nannu-n-niuti-kkuminar-tu-rujussu-u-vuq.
Polar.bear-catch-instrument.for.achieving-something.good.for-PART-big-be-3SG.INDIC
“It (a dog) is good for catching polar bears with.”
9. Languages differ in many ways!
An example of Kunwinjku taken from Evans (2003):
(4) Kunwinjku (Polysynthetic)
Aban-yawoith-warrgah-marne-ganj-ginje-ng.
1/3PL-again-wrong-BEN-meat-cook-PP
“I cooked the wrong meat for them again”
Discussion of what should be considered a word:
John Mansfield’s “The word as a unit of internal predictability”
11. Languages differ in many ways!
Some exhibit rich grammatical case systems (e.g., 12 in Erzya and 24 in Veps)
Some mark possessiveness
Others might have complex verbal morphology (e.g., Oto-Manguean languages)
Even “decline” nouns for tense (e.g., Tupi–Guarani languages)
12. Languages differ in many ways!
Let’s Discuss The Following Dimensions:
Fusion
Inflectional Synthesis
Position of Case Affixes
14. Fusion (WALS 20A)
From isolating to concatenative
Concatenative morphology is the most common system
Non-linearities such as ablaut or tonal morphology can also be present
Isolating languages: the Sahel Belt in West Africa, Southeast Asia and the Pacific
Tonal–concatenative morphology can be found in Mesoamerican languages
15. Inflectional Synthesis of the Verb (WALS 22A)
Analytic expressions are common in Eurasia
Synthetic expressions are used to a high degree in the Americas
17. Position of Case Affixes (WALS 51A)
Can variably surface as prefixes, suffixes, infixes, or circumfixes
Suffixation: Most Eurasian and Australian languages
to a lesser extent in South American and New Guinean languages
Prefixation: Mesoamerican languages and African languages spoken south of the Sahara
19. The Earliest Approach to Morphology (Sanskrit)
Pāṇini’s kārakas
Formalize regularities in the words
Inflectional Morphology is Paradigmatic
20. ..or Russian Morphology
Morphological Inflection
Formalize regularities in the words
Formalizations differ: the number of cases may vary from 6 to 11 (Zaliznyak, 1967)
21. Inflectional Morphology: Paradigms (nouns)
Morphological Inflection
беглец “runner” + pos=N,case=ACC,num=SG → беглеца
ru-noun-table | b | беглец | a=an
23. Inflectional Morphology: Classes (nouns); *Differs in En/Ru Editions of Wiktionary*
Morphological Inflection
беглец + pos=N,case=ACC,num=SG → беглеца
EN Wiktionary: ru-noun-table | b | беглец | a=an
RU Wiktionary:сущ ru m a 5b|основа=беглец|основа1=беглец|слоги=по-слогам|бег|лец
24. Inflectional Morphology: Wiktionary Annotation is Not Cross-linguistically Consistent
Other Languages
Hungarian
Wiktionary: Inconsistent annotation across languages
Within a single language: across different editions (en, ru, de, etc.)
Many language-specific features
25. Linguistic Diversity and Universals
Universal Grammar
Evans and Levinson, 2009: The Myth of Language Universals
“Diversity can be found at almost every level of linguistic organization”
Languages vary greatly on phonological, morphological, semantic, and
syntactic levels
Typology: describe the limits of cross-linguistic variation
Haspelmath, 2010
Descriptive categories (specific to languages) vs. comparative concepts.
28. UniMorph – Universal Annotation
Universal Annotation (by John Sylak-Glassman and David Yarowsky)
1) 23 dimensions of meaning (TAM, case, number, animacy), 212 features
2) A-morphous (word-based) morphology (Anderson, 1992)
3) Initial paradigms were mainly extracted from the English edition of Wiktionary (Kirov et al., 2016)
https://unimorph.github.io/
[Sylak-Glassman, 2016]
32. PART II: SIGMORPHON Shared Tasks on Morphological (Re-)inflection
33. Morphological (Re-)Inflection
SIGMORPHON Shared Task 2016–2019
Inflection: PLAY + PRESENT PARTICIPLE → playing
Reinflection: played + PRESENT PARTICIPLE → playing
Lemma Tag Form
RUN PAST ran
RUN PRES;1SG run
RUN PRES;2SG run
RUN PRES;3SG runs
RUN PRES;PL run
RUN PART running
2018: ~96% accuracy on average in the high-resource setting, but much lower in the low-resource setting
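The lemma–tag–form table above follows UniMorph's three-column, tab-separated format (lemma, inflected form, feature bundle). A minimal loader sketch, with illustrative rows and tags:

```python
# Parse UniMorph-style "lemma<TAB>form<TAB>tags" triples into a
# (lemma -> tag -> form) lookup, i.e. the gold table for inflection.
from collections import defaultdict

def load_paradigms(lines):
    paradigms = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        lemma, form, tags = line.split("\t")
        paradigms[lemma][tags] = form
    return paradigms

sample = [
    "run\tran\tV;PST",
    "run\truns\tV;PRS;3;SG",
    "run\trunning\tV;V.PTCP;PRS",
]
table = load_paradigms(sample)
print(table["run"]["V;PST"])  # → ran
```

The inflection task then asks a system to predict the third column given the other two; reinflection replaces the lemma in the input with another inflected form.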
35. SIGMORPHON 2016 Shared Task (Cotterell et al., 2016)
Morphological (Re-)Inflection (10 Languages): Neural encoder–decoders
1) Character-level input: <s> r u n OUT_POS=V OUT_NUM=SG OUT_TENSE=PRES </s>; output: <s> r u n s </s>
2) Ensembles of seq2seq (GRUs + soft attention (Bahdanau et al., 2015))
3) Enriching the data with combinations of other (non-lemma) forms
[Kann and Schuetze, 2016]
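The character-level input format in 1) can be sketched in a few lines; the OUT_* tag names follow the slide's example rather than any fixed scheme:

```python
# Build the source sequence fed to the character-level encoder:
# lemma characters plus target-tag symbols, wrapped in boundary markers.
def encode_input(lemma, tags):
    return ["<s>"] + list(lemma) + tags + ["</s>"]

src = encode_input("run", ["OUT_POS=V", "OUT_NUM=SG", "OUT_TENSE=PRES"])
# src == ["<s>", "r", "u", "n", "OUT_POS=V", "OUT_NUM=SG", "OUT_TENSE=PRES", "</s>"]
```

Treating each tag as a single symbol (rather than spelling it out character by character) keeps the source sequence short and lets the decoder attend to morphosyntactic features directly.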
39. SIGMORPHON 2016 Shared Task (Cotterell et al., 2016)
Morphological (Re-)Inflection (10 Languages): Neural encoder–decoders
1) Extract input–output string alignments; 2) train seq2seq (LSTM-based) models to learn a sequence of operations (hard monotonic attention)
[Aharoni and Goldberg, 2017]
41. SIGMORPHON 2016 Shared Task (Cotterell et al., 2016)
Morphological (Re-)Inflection (10 Languages): Neural
encoder–decoders
1) Extract input–output string alignments; 2) Train seq2seq (LSTM-based)
models to learn a sequence of operations (hard monotonic attention)
Errors
глядеть pos=V,tense=PRS,per=1,num=SG,aspect=IPFV gold: гляжу predicted: глядею
увлекаться pos=V,tense=PRS,per=1,num=SG,aspect=IPFV gold: увлекаюсь
predicted: увлеклюсь
звать pos=V,tense=PRS,per=3,num=SG,aspect=IPFV gold: зовёт predicted: звает
[Aharoni and Goldberg, 2017]
42. SIGMORPHON 2016 Shared Task (Cotterell et al., 2016)
Errors
зять pos=N,case=GEN,num=PL gold: зятьёв predicted: зятей
перстень pos=N,case=GEN,num=PL gold: перстней predicted: перстеее
телекамера pos=N,case=GEN,num=PL gold: телекамер predicted: телекаморо
[Aharoni and Goldberg, 2017]
43. SIGMORPHON 2016 Shared Task (Cotterell et al., 2016)
Errors
лоботряс pos=N,case=ACC,num=PL gold: лоботрясов predicted: лоботрясы
львица pos=N,case=ACC,num=PL gold: львиц predicted: львица
милиционер pos=N,case=ACC,num=PL gold: милиционеров predicted: милиционеры
светлячок pos=N,case=ACC,num=PL gold: светлячков predicted: светлячки
скот pos=N,case=ACC,num=PL gold: скотов predicted: скоты
счёт pos=N,case=ACC,num=PL gold: счета predicted: счеты
[Aharoni and Goldberg, 2017]
44. CoNLL–SIGMORPHON 2017 Shared Task (Cotterell et al., 2017)
Universal Morphological Reinflection (52 Languages)
Task1: Morphological Inflection
Task2: Paradigm Completion
45. CoNLL–SIGMORPHON 2017 Shared Task (Cotterell et al., 2017)
Universal Morphological Reinflection (52 Languages)
3 Settings: Low (100 samples), Medium (1000), High (10,000)
Samples drawn based on token frequency in a Wikipedia corpus (with resampling for syncretic slots)
[Cotterell et al., 2017]
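Frequency-based sampling can be sketched as below. This shows only the frequency-weighting idea; the shared task's actual procedure also resampled syncretic slots, and the entries and counts here are illustrative:

```python
import random

# Sample training entries proportionally to corpus token frequency,
# so frequent inflected forms are more likely to enter the split.
def sample_by_frequency(entries, freqs, k, seed=0):
    rng = random.Random(seed)
    weights = [freqs.get(form, 1) for (_, form, _) in entries]
    return rng.choices(entries, weights=weights, k=k)

entries = [("run", "ran", "V;PST"), ("run", "runs", "V;PRS;3;SG")]
freqs = {"ran": 500, "runs": 1500}
batch = sample_by_frequency(entries, freqs, k=100)
```

Weighting by corpus frequency biases the training data toward forms a speaker would actually encounter, which matters most in the Low (100-sample) setting.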
47. CoNLL–SIGMORPHON 2017 Shared Task (Cotterell et al., 2017)
Universal Morphological Reinflection (52 Languages): Neural encoder–decoders
(Align & Copy), based on Aharoni and Goldberg (2017):
1) Extract input–output string alignments (adding COPY/edit operations); 2) train seq2seq (LSTM-based) models to learn a sequence of operations (hard monotonic attention)
[Makarov et al., 2017]
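The COPY/edit-operation supervision can be illustrated with a crude character aligner. This is a simplification of the approach, not the authors' implementation; it uses difflib rather than a learned aligner:

```python
import difflib

# Derive a COPY/DEL/INS operation sequence transforming lemma -> form,
# the kind of action sequence a hard-monotonic transducer is trained on.
def edit_ops(lemma, form):
    ops = []
    sm = difflib.SequenceMatcher(a=lemma, b=form, autojunk=False)
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            ops += ["COPY"] * (i2 - i1)
        else:
            ops += [f"DEL({c})" for c in lemma[i1:i2]]
            ops += [f"INS({c})" for c in form[j1:j2]]
    return ops

print(edit_ops("run", "runs"))  # → ['COPY', 'COPY', 'COPY', 'INS(s)']
```

Because most of a paradigm cell is usually the unchanged stem, COPY dominates the action sequence, which is exactly what makes this supervision data-efficient in low-resource settings.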
49. CoNLL–SIGMORPHON 2017 Shared Task (Cotterell et al., 2017)
Error taxonomy
What are common errors that neural systems make?
[Gorman et al., 2019]
50. CoNLL–SIGMORPHON 2017 Shared Task (Cotterell et al., 2017)
Error taxonomy
What are common errors that neural systems make?
Types of Errors
Free variation error: more than one acceptable form exists
Extraction errors: flaws in UniMorph’s parsing of Wiktionary
Wiktionary errors: errors in the Wiktionary data itself
Silly errors: “bizarre” errors which defy any purely linguistic characterization (“*membled” instead of “mailed”, or loops such as “ynawemaylmyylmyylmyylmyylmyylmyym...” instead of “ysnewem”)
Allomorphy errors: misapplication of existing allomorphic patterns
Spelling errors: forms that do not follow language-specific orthographic conventions
[Gorman et al., 2019]
52. CoNLL–SIGMORPHON 2017 Shared Task (Cotterell et al., 2017)
Error taxonomy
What are common errors that neural systems make?
Allomorphy Errors
Stem-final vowels (*pohjanpystykorvojen) and consonant gradation (*ei kiemurda) in Finnish
Ablaut in Dutch and German (*pront; *saufte)
Umlaut (*Einwohnerzähle, *Förmer), plural suffixes, verbal prefixes in German (*umkehre)
Linking vowels in Hungarian (*masszázsakból instead of masszázsokból)
Yers (*klęsek instead of klęsk) and genitive singular suffixes (*izotopa) in Polish
Animacy in Polish and Russian (грузин vs. магазин in ACC.SG )
Aspect in Russian (*будешь сорвать)
Internal inflection in Russian compounds (*государствах-донорах, *лёгких промышленности (ACC.PL))
[Gorman et al., 2019]
53. CoNLL–SIGMORPHON 2018 Shared Task (Cotterell et al., 2018)
Universal Morphological Reinflection (103 Languages)
Task1: Morphological Inflection (Low, Medium, High)
Task2: Inflection in Context (Vylomova et al., 2019)
[Cotterell et al., 2018]
Track 1: With morphosynt. annotation
Track 2: Without morphosynt. annotation
Requires capturing agreement and inferring inherent vs. contextual categories (Vylomova et al., 2019)
57. SIGMORPHON 2019 Shared Task (McCarthy et al., 2019)
Morphological Analysis in Context and Cross-Lingual Transfer for Inflection (100 Language Pairs)
Task1: Cross-lingual Transfer for Morphological Inflection (10K high-resource + 100 low-resource samples)
Task2: Morphological Analysis in Context
[McCarthy et al., 2019]
60. SIGMORPHON 2019 Shared Task (McCarthy et al., 2019)
[Anastasopoulos and Neubig, 2019]
63. So...
SIGMORPHON Shared Tasks 2016–2019
PLAY + PRESENT PARTICIPLE → playing
played + PRESENT PARTICIPLE → playing
Lemma Tag Form
RUN PAST ran
RUN PRES;1SG run
RUN PRES;2SG run
RUN PRES;3SG runs
RUN PRES;PL run
RUN PART running
2018: ~96% accuracy on average in the high-resource setting, but much lower in the low-resource setting
Also see Ling Liu’s 2021 Overview
“Computational Morphology with Neural Network Approaches”
65. PART III: Scaling up and increasing UniMorph Collaboration!
From Wiktionary to more linguistic resources: Including grammar books, Apertium data,
text/glossed corpora.
66. Language-Specific Biases
As Bender (2009, 2016) notes, architectures and training and tuning algorithms still present language-specific biases
67. SIGMORPHON 2020 SHARED TASK 0 (Vylomova et al., 2020)
Let’s focus on typological diversity and aim to investigate systems’ ability to
generalize across typologically distinct languages!
If a model works well for a sample of IE languages, should the same model
also work well for Tupi–Guarani languages?
69. SIGMORPHON 2020 SHARED TASK 0 (Vylomova et al., 2020)
90 languages from 13 language families
70. Three Phases
Development
2 months; train & dev: 45 languages from 5 families (Austronesian, Niger-Congo, Oto-Manguean,
Uralic, IE)
Generalization
1 week; train & dev: 45 languages from 10 families (Afro-Asiatic, Algic, Dravidian, Indo-European, Niger-Congo, Sino-Tibetan, Siouan, Songhay, Southern Daly, Tungusic, Turkic, Uralic, and Uto-Aztecan)
Evaluation
1 week; test: all 90 languages
71. Data
Preparation
Manually converted language-specific features (tags) into the UniMorph format
Canonicalized the converted language data (https://github.com/unimorph/um-canonicalize)
Splitting
Used only noun, verb, and adjective forms to construct training, development, and evaluation
sets.
Randomly sampled 70%, 10%, and 20% for train, development, and test, respectively.
Zarma, Tajik, Lingala, Ludian, Māori, Sotho, Võro, Anglo-Norman, and Zulu contain fewer than 400 training samples
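The splitting procedure above can be sketched directly; the 70/10/20 proportions follow the slide, while the seed and the integer-truncation details are assumptions:

```python
import random

# Shuffle paradigm entries and split them 70/10/20 into
# train/dev/test, mirroring the slide's splitting procedure.
def split_data(entries, seed=0):
    entries = list(entries)
    random.Random(seed).shuffle(entries)
    n = len(entries)
    n_train, n_dev = int(0.7 * n), int(0.1 * n)
    train_set = entries[:n_train]
    dev_set = entries[n_train:n_train + n_dev]
    test_set = entries[n_train + n_dev:]
    return train_set, dev_set, test_set

train_set, dev_set, test_set = split_data(range(1000))
# (len(train_set), len(dev_set), len(test_set)) == (700, 100, 200)
```

Note that splitting by entry (rather than by lemma) means forms of the same paradigm can appear in both train and test, which is one source of the inconsistency issues discussed later.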
73. Systems: Baselines
Neural
Neural transducer (Wu et al., 2019), which is essentially a hard monotonic attention model (mono-*)
Transformer adapted for character-level tasks (Wu et al., 2020; trm-*), SoTA on ST 2017
+ data augmentation technique used by Anastasopoulos et al. (2019; -aug-)
+ family-wise shared parameters (*-shared)
Baseline systems (wu2019exact): mono-single, mono-aug-single, mono-shared, mono-aug-shared
Baseline systems (wu2020applying): trm-single, trm-aug-single, trm-shared, trm-aug-shared
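Several baselines and submissions rely on data hallucination (Anastasopoulos and Neubig, 2019). A toy version of the idea, replacing the material shared between lemma and form with random characters while keeping the affixes, might look like this (stem detection via longest common substring and the Latin alphabet are simplifications):

```python
import random
from difflib import SequenceMatcher

# Hallucinate a training pair: find the longest substring shared by
# lemma and form (a crude stem proxy) and swap it for a random string.
def hallucinate(lemma, form, alphabet="abcdefghijklmnopqrstuvwxyz", seed=0):
    rng = random.Random(seed)
    m = SequenceMatcher(a=lemma, b=form, autojunk=False).find_longest_match(
        0, len(lemma), 0, len(form))
    if m.size < 3:  # too little shared material to swap safely
        return lemma, form
    fake = "".join(rng.choice(alphabet) for _ in range(m.size))
    new_lemma = lemma[:m.a] + fake + lemma[m.a + m.size:]
    new_form = form[:m.b] + fake + form[m.b + m.size:]
    return new_lemma, new_form

print(hallucinate("walk", "walked"))
```

Because the affixal material is preserved, the hallucinated pairs teach the model the inflectional transformation without memorizing particular stems, which is why the technique helps most in the low-resource settings.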
74. Systems: Teams
10 teams submitted 22 systems in total, out of which 19 were neural
CMU Tartan (Jayarao et al., 2020): cmu_tartan_00-0, cmu_tartan_00-1, cmu_tartan_01-0, cmu_tartan_01-1, cmu_tartan_02-1
CU7565 (Beemer et al., 2020): CU7565-01-0, CU7565-02-0
CULing (Liu et al., 2020): CULing-01-0
DeepSpin (Peters et al., 2020): deepspin-01-1, deepspin-02-1
ETH Zurich (Forster et al., 2020): ETHZ-00-1, ETHZ-02-1
Flexica (Scherbakov, 2020): flexica-01-0, flexica-02-1, flexica-03-1
IMS (Yu et al., 2020): IMS-00-0
LTI (Murikinati et al., 2020): LTI-00-1
NYU-CUBoulder (Singer et al., 2020): NYU-CUBoulder-01-0, NYU-CUBoulder-02-0, NYU-CUBoulder-03-0, NYU-CUBoulder-04-0
UIUC (Canby et al., 2020): uiuc-01-0
75. Systems: Description (* – winning system)
Improving neural baselines
*UIUC: transformers with synchronous bidirectional decoding technique (Zhou et al.,2019)
and family-wise fine-tuning
ETH Zurich: exact decoding strategy that uses Dijkstra’s search algorithm
Improving previous years’ models: Hard Monotonic Attention
IMS: L2R+R2L models with a genetic algorithm for ensemble search and data hallucination
Flexica: multilingual (family-wise) model with improved alignment strategy
+ new data hallucination technique based on phonotactic modelling
76. Systems: Description (* – winning system)
Improving their 2019 models
LTI: multi-source encoder–decoder with two-step attention architecture + cross-lingual transfer + data hallucination + romanization of scripts
*DeepSpin: massively multilingual (all languages) gated sparse two-headed attention model with sparsemax + 1.5-entmax
Transformer vs. LSTMs
CMU Tartan: compared transformer- and LSTM-based encoder–decoders trained mono- and multilingually with data hallucination
77. Systems: Description (* – winning system)
Ensembles of Transformers
NYU-CUBoulder: compared vanilla and pointer-generator (monolingual) transformers + ensembles of three and five pointer-generator transformers + data hallucination (for languages with fewer than 1,000 samples)
*CULing: ensemble of three (monolingual) transformers + augmented the initial input (that
only used the lemma as a source form) with entries corresponding to other (non-lemma) slots
(reinflection) to improve learning of principal parts of paradigm
78. Systems: Description (* – winning system)
Non-neural systems
CU7565: manually developed finite-state grammars for 25 languages
+ hierarchical paradigm clustering (based on similarity of string transformation rules)
Flexica: a method similar to Hulden (2014) but with transformation rules treated
independently and assigned a score based on their frequency, specificity and diversity of
surrounding characters
79. Evaluation
Per-language accuracy
Per-language Levenshtein distance
Takes into account the statistical significance of differences between systems
Ranking
Any system statistically indistinguishable from the best-performing one is also ranked 1st for that language.
For genus/family:
We aggregate the systems’ ranks and re-rank them based on the number of times they ranked 1st, 2nd, etc.
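The two per-language metrics can be sketched as follows (the significance testing used for ranking is omitted here):

```python
# Per-language accuracy and mean Levenshtein distance, the two
# evaluation metrics used in the shared task.
def levenshtein(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def evaluate(gold, predicted):
    acc = sum(g == p for g, p in zip(gold, predicted)) / len(gold)
    dist = sum(levenshtein(g, p) for g, p in zip(gold, predicted)) / len(gold)
    return acc, dist

print(levenshtein("kitten", "sitting"))  # → 3
```

Reporting edit distance alongside exact-match accuracy is informative for morphology: a prediction wrong by one character (a near-miss on an affix) is penalized far less than a "silly" looping error.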
81. Results: 4 winning systems (outperform baselines)
Columns: system, mean rank, mean accuracy (%)
uiuc-01-0 2.4 90.5
deepspin-02-1 2.9 90.9
BASE: trm-single 2.8 90.1
CULing-01-0 3.2 91.2
deepspin-01-1 3.8 90.5
BASE: trm-aug-single 3.7 90.3
NYU-CUBoulder-04-0 7.1 88.8
NYU-CUBoulder-03-0 8.9 88.8
NYU-CUBoulder-02-0 8.9 88.7
IMS-00-0 10.6 89.2
NYU-CUBoulder-01-0 9.6 88.6
BASE: trm-shared 10.3 85.9
BASE: mono-aug-single 7.5 88.8
cmu_tartan_00-0 8.7 87.1
BASE: mono-single 7.9 85.8
cmu_tartan_01-1 9.0 87.1
BASE: trm-aug-shared 12.5 86.5
BASE: mono-shared 10.8 86.0
cmu_tartan_00-1 9.4 86.5
LTI-00-1 12.0 86.6
BASE: mono-aug-shared 12.8 86.8
cmu_tartan_02-1 10.6 86.1
cmu_tartan_01-0 10.9 86.6
flexica-03-1 16.7 79.6
ETHZ-00-1 20.1 75.6
*CU7565-01-0 24.1 90.7
flexica-02-1 17.1 78.5
*CU7565-02-0 19.2 83.6
ETHZ-02-1 17.0 80.9
flexica-01-0 24.4 70.8
Oracle (Baselines) 96.1
Oracle (Submissions) 97.7
Oracle (All) 97.9
The baselines and the submissions are complementary: combining them increases the oracle score
The largest gaps in oracle systems are observed in the Algic, Oto-Manguean, Sino-Tibetan, Southern Daly, Tungusic, and Uto-Aztecan families
82. Accuracy by language averaged across all submissions
83. Accuracy by language averaged across all submissions
A significant effect of dataset size was observed
Relatively easy: Austronesian and Niger-Congo
Difficult: some Uralic and Oto-Manguean languages
Challenging: Ludic, Norwegian Nynorsk, Middle Low German, Evenki, and O’odham
84. Accuracy by Language
Has morphological inflection become a solved problem in certain scenarios?
We classified the test examples into four categories:
Very Easy: predicted correctly by all submitted systems
Easy: predicted correctly by at least 80% of systems
Hard: predicted correctly by at most 20% of systems
Very Hard: predicted correctly by no submitted system
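These bins can be computed directly from per-system correctness flags. A sketch (examples falling between 20% and 80% are outside the four named bins; the "medium" label below is my own placeholder, not from the shared task):

```python
def difficulty(correct_flags):
    """Classify a test example by the fraction of systems that got it right."""
    share = sum(correct_flags) / len(correct_flags)
    if share == 1.0:
        return "very easy"   # all systems correct
    if share >= 0.8:
        return "easy"
    if share == 0.0:
        return "very hard"   # no system correct
    if share <= 0.2:
        return "hard"
    return "medium"          # placeholder for the unnamed middle range
```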
88. Questions Addressed in Papers
Is developing morphological grammars manually worthwhile?
CU7565 manually designed finite-state grammars for 25 languages
The paradigms of some languages were relatively easy to describe, but neural networks also
performed quite well on them
For Ingrian and Tagalog (low-resource), the grammars demonstrated superior performance, but
at the expense of a significant number of person-hours
89. Questions Addressed in Papers
What is the best training strategy for low-resource languages?
Data hallucination proved useful for low-resource languages
Augmenting the data with tuples in which lemmas are replaced with non-lemma forms and their
tags
Multilingual training
Ensembles
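The non-lemma augmentation idea above can be sketched as follows: from every observed paradigm, generate extra training tuples whose source side is an inflected form together with its tags (a simplified sketch of the general strategy, not any team's exact recipe):

```python
def repair_augment(paradigm):
    """Generate extra training tuples (src_form, src_tags, tgt_tags, tgt_form)
    whose source side is an inflected form rather than the lemma.

    `paradigm` maps a UniMorph tag string to its surface form.
    """
    items = list(paradigm.items())
    extra = []
    for tags_src, form_src in items:
        for tags_tgt, form_tgt in items:
            if tags_src != tags_tgt:  # skip the identity mapping
                extra.append((form_src, tags_src, tags_tgt, form_tgt))
    return extra
```

Each observed paradigm of n forms thus yields n(n-1) directed training pairs instead of n-1 lemma-to-form pairs.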
90. Error Analysis
Systematic Errors:
Data Inconsistency
The train, development, and test sets contain 2%, 0.3%, and 0.6% inconsistent entries, respectively
Highest rates: Azerbaijani, Old English, Cree, Danish, Middle Low German, Kannada,
Norwegian Bokmål, Chichimec, and Veps
Dialectal variations in Finno-Ugric and Tungusic
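Inconsistencies of this kind (the same lemma and tag set mapped to different target forms) can be detected mechanically. A sketch assuming UniMorph-style (lemma, form, tags) triples:

```python
from collections import defaultdict


def inconsistent_entries(triples):
    """Return (lemma, tags) keys that map to more than one target form."""
    forms = defaultdict(set)
    for lemma, form, tags in triples:
        forms[(lemma, tags)].add(form)
    return {key: sorted(vals) for key, vals in forms.items() if len(vals) > 1}
```

Note that some hits are genuine free variation (e.g., dialectal doublets) rather than annotation errors, so the output needs manual review.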
91. Language-Specific Errors
Algic (Cree)
Mean accuracy across systems was 65.1% (41.5% to 73%)
Systems struggled with the choice of preverbal auxiliary (‘kitta’ could refer to future, imperfective, or
imperative)
The paradigms were very large, and there were very few lemmas (28 impersonal verbs and 14
transitive verbs)
92. Language-Specific Errors
Austronesian
Mean accuracy across systems was 80.5% (39.5% to 100%)
Baseline: Cebuano (84%) and Hiligaynon (96%)
Cebuano only has partial reduplication while Hiligaynon has full reduplication
The prefix choice for Cebuano is more irregular, making it more difficult to predict the correct
conjugation of the verb
In Maori, passive voice endings are difficult to predict, as the language has undergone a loss of
word-final consonants and there is no clear link between a stem and the passive suffix it
employs
93. Language-Specific Errors
Niger-Congo
Mean accuracy across systems was very good at 96.4% (62.8% to 100%)
Most languages in this family are considered low resource, and the resources used for data
gathering may have been biased towards the languages’ regular forms
94. Language-Specific Errors
Sino–Tibetan (Tibetan)
Mean accuracy across systems was average at 82.1% (67.9% to 85.1%)
The majority of errors are related to allomorphy
Nonce words and impossible combinations of component units (Di et al., 2019)
95. Language-Specific Errors
Siouan (Dakota)
Mean accuracy across systems was average at 89.4% (0% to 95.7%)
Variable prefixing and infixing of person morphemes, along with some complexities related to
fortition processes
Determining the factor(s) that governed variation in affix position was difficult from a
linguist’s perspective, though many systems were largely successful
Issues with first and second person singular allomorphy
96. Language-Specific Errors
Tungusic (Evenki)
Mean accuracy across systems was average at 53.8% (43.5% to 59.0%)
The dataset was created from oral speech samples in various dialects of the language; there
was little attempt at any standardization in the oral speech transcription
Annotation: various past tense forms are all annotated simply as PST, and several comitative
suffixes are all annotated as COM
Annotation: some features are present in the word form but receive no annotation at all
97. Language-Specific Errors
Uto-Aztecan (O’odham)
Mean accuracy across systems was average at 76.4% (54.8% to 82.5%)
Systems with higher accuracy may have benefited from better recall of suppletive forms
relative to lower accuracy systems.
98. SIGMORPHON 2020 Shared Task 0 (Vylomova et al., 2020): Conclusion
Submissions were able to make productive use of multilingual training
Data augmentation techniques such as hallucination helped
Combined with architecture tweaks such as sparsemax, these yielded excellent overall performance
on many languages
Some morphology types and language families (Tungusic, Oto-Manguean, Southern Daly) are
still challenging
In some languages (Ingrian, Tajik, Tagalog, Zarma, and Lingala), hand-encoding linguistic
knowledge in finite-state grammars resulted in the best performance
99. A Case Study on Nen (Papua New Guinea); Muradoglu et al., 2020
100. A Case Study on Nen (Papua New Guinea); Muradoglu et al., 2020
Spoken in the village of Bimadbn in the Western Province of PNG by approximately 400 people
Verbs: prefixing, middle, and ambifixing
Distributed Exponence (DE): “morphosyntactic feature values can only be
determined after unification of multiple structural positions”
101. A Case Study on Nen (Papua New Guinea); Muradoglu et al., 2020
102. A Case Study on Nen (Papua New Guinea); Muradoglu et al., 2020
Low accuracy on a small number of samples (<1000)
103. A Case Study on Nen (Papua New Guinea); Muradoglu et al., 2020
Allomorphy: vowel harmony
Variation in forms/spelling
Looping: *ynawemaylmyylmyylmyylmy-ylmyylmyymayamawemyymamya (Shcherbakov et al., 2020)
104. A Case Study on Nen (Papua New Guinea); Muradoglu et al., 2020
How well do the models generalize?
Syncretism test: all TAM categories exhibit syncretism across the second- and third-person
singular actor, with one exception: the past perfective slot, where the forms differ
Having not observed the past perfective forms, systems tend to predict them as syncretic
(generalizing from observed slots), mispredicting the actual, exceptional forms
105. SIGMORPHON 2021 Shared Task 0 (Pimentel, Ryskina et al., 2021): More
under-resourced languages!
106. SIGMORPHON 2021 Shared Task 0 (Pimentel, Ryskina et al., 2021): More
under-resourced languages!
107. SIGMORPHON 2021 Shared Task 0 (Pimentel, Ryskina et al., 2021)
108. SIGMORPHON 2021 Shared Task 0 (Pimentel, Ryskina et al., 2021)
Allomorphy
Spelling errors
Multi-Word Lemmas
Complex transformation patterns
109. SIGMORPHON 2021 Shared Task 0 (Pimentel, Ryskina et al., 2021)
Allomorphy
Spelling errors
Most errors are due to limited data
Very sparse data without complete paradigms (e.g., Eibela)
Mispredictions on unseen lemmas (also see Goldman et al., 2021)
Multi-Word Lemmas
Complex transformation patterns
110. Language-Specific Errors
Russian
Mean accuracy across systems was 97.4% (94.31% to 98.06%)
Incorrect prediction of instrumental case forms (even when other parts of the same paradigm
were observed for the same lemma)
Incorrect prediction of accusative forms: the forms differ for animate and inanimate nouns, and
animacy must be inferred (from observing other slots of the same case, such as PL or SG)
Errors in the inflection of multi-word lemmas, which requires inferring dependency information;
similarly to the above cases, this information could be inferred from other slots of the same
paradigm
111. Language-Specific Errors
Kunwinjku
Accuracy across systems ranges from 14.75% to 63.93%
Due to the limited amount of data, augmentation significantly improved performance
Systems mispredict *ngurriborlbme instead of ngurriborle.
Looping effects (Shcherbakov et al., 2020) are observed in RNN-based architectures:
*ngar-rrrrrrrrrrrrrmbbbijj (should be karribelbmerrinj), ngadjarridarrkddrrdddrrmerri (should be
karriyawoyhdjarrkbidyikarrmerrimeninj)
112. PART IV: Current Challenges and Future Directions
113. Challenges in Data Conversion/Annotation
Case compounding and stacking (e.g., Kayardild)
I gave the book to my brother’s wife: ‘wife+DAT+ABL, my+GEN+DAT+ABL,
brother+GEN+DAT+ABL’
Clitics: exponential growth of paradigm tables
Polysynthetic languages and paradigms
Derivation-inflection continuum: some paradigms contain
derivations (participle formation, masdars, etc.) and require multi-step transformations
(PL: similar to ‘to run’ → ‘runners’)
Multi-word lemmas that might require dependency information
Which features should be added (not language-specific)?
114. Future Directions
Develop a framework for error analysis, e.g. measuring the percentage of allomorphy errors by
providing a set of tasks specifically targeting allomorphy (e.g., following Elsner and Sims, 2019; Malouf et al., 2020)
Increase interpretability of the models, design a methodology to extract the patterns learned by
the model
Create more typologically plausible language samples
A pipeline to augment UniMorph with new morphosyntactic features
An approach to estimating how representative a paradigm sample is for a specific language
(an estimate of language coverage)
... And ST0 Part 2: Human-like generalization and WUGS!
115. Thank you! Questions?
Please join us: https://groups.google.com/g/unimorph