Electronic dictionaries in writing tools: user needs and models for user interaction
1. Electronic dictionaries in writing tools:
user needs and models for user interaction
Ulrich Heid
Universität Hildesheim,
Institut für Informationswissenschaft und Sprachtechnologie,
Universitätsplatz 1, D-31141 Hildesheim, Germany
Santiago de Compostela: Multilex-2015,
October 2015
Heid (IwiSt/IMS) Text production dictionaries santiago15-fol 1 / 32
2. Overview
• Framework: Lexicographic Function Theory
and its implications for e-dictionary making
• User needs:
• General aspects
• Needs in text production –
and proposals from the literature to satisfy them:
• Needs resulting from linguistic complexity
• Needs resulting from different levels of knowledge of users
• Models of interaction:
• Information on demand
• (New) Ways of presenting lexicographic data
• Conclusion: lessons learnt
3. Context — and Warning
Projects – cooperation
• This presentation does not contain anything new:
it just re-arranges and re-interprets recent work:
rather practical state of the art than abstract visions
• Based on cooperation in
SeLA – Scientific e-Lexicography for Africa:
Project funded by BMBF (05-2012 – 12-2015) and organized by DAAD
• University of Pretoria Theo Bothma – Daan Prinsloo – Elsabé Taljard
• University of Stellenbosch Rufus H. Gouws
• UNISA, University of South Africa Sonja E. Bosch
• University of Namibia Herman Beyer
• University of Hildesheim Gertrud Faaß
4. Framework and reminder: Lexicographic Function Theory
Dictionaries as information tools Tarp 2008 etc.
• The dictionary provides data from which users can derive information
to satisfy a given need
• An “ideal” dictionary
provides the user with
exactly that
{ types | amount of... } data
which he/she needs
• Assumption in FT:
Lexicographers (should) know
what is best for a given user (type)
→ different types of (e-)dictionaries
→ different data offers
[Figure: a potential user in a user situation has a need for information; the dictionary offers lexicographical data; extraction of information leads to satisfaction of needs]
5. Framework and reminder: Lexicographic Function Theory
Parameters influencing the process of information derivation Tarp 2008 etc.
• Needs of users arising in different situations:
• Cognitive needs: learn about “things” or words
• Communicative needs:
• Text production vs. text reception
• Monolingual vs. bilingual
• etc.
• Users’ pre-existing knowledge
• Knowledge of the targeted language
• Knowledge of the targeted domain (e.g. in specialized dictionaries)
• Knowledge about using the (e-) dictionary,
or, more generally,
about using electronic information tools
• Awareness of the use situation and needs
6. Implications of user needs and pre-existing knowledge
A view on the scenario of lexicography
• To satisfy different user needs,
lexicographers will collect large amounts of lexicographic data
• For each type of need and/or for each type of user,
a specific subset of the data will be needed
• Thus a filtering approach is necessary,
where the filter is defined
according to
user types and needs
[Figure: lexicographic data passes through filters, defined by specifications, to yield dict-1, dict-2, dict-3 for user-1, user-2, ..., user-n]
7. Implications of user needs and pre-existing knowledge
Lexicographic scenario: need for well-defined dictionary specifications
Dictionary plan Gouws 2013
• Lexicographic data categories:
• Must be clearly distinguished, categorized and marked up
• Must be presentable in different forms, Spohr 2012
e.g. with different degrees of specialization, different metalanguage, etc.
• Filtering:
• By lexicographic function
• According to
pre-existing knowledge
→ Selection
of data categories
→ Selection
of presentation modes
[Figure: lexicographic data passes through filters, defined by specifications, to yield dict-1, dict-2, dict-3 for user-1, user-2, ..., user-n]
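The filtering idea on this slide can be illustrated with a minimal Python sketch. The category labels, the two lookup tables and the function `filter_entry` are hypothetical and only serve the illustration; the entry reuses the German collocation sein Veto einlegen from this talk, and the definition text is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    category: str  # lexicographic data category (hypothetical labels)
    content: str


# Hypothetical dictionary specification: data categories per lexicographic function.
CATEGORIES_BY_FUNCTION = {
    "text_production": {"collocation", "morphosyntax", "valency"},
    "text_reception": {"definition", "collocation"},
}

# Hypothetical assumption: categories that users at a given level do not need.
NOT_NEEDED_BY_LEVEL = {
    "beginner": set(),
    "advanced": {"morphosyntax"},
}


def filter_entry(items, function, level):
    """Select the subset of data items for one function and one knowledge level."""
    wanted = CATEGORIES_BY_FUNCTION[function] - NOT_NEEDED_BY_LEVEL[level]
    return [item for item in items if item.category in wanted]


# Illustrative entry; the definition text is a placeholder.
ENTRY = [
    DataItem("definition", "(definition text)"),
    DataItem("collocation", "sein Veto einlegen"),
    DataItem("morphosyntax", "determination: possessive determiner"),
]

for item in filter_entry(ENTRY, "text_production", "beginner"):
    print(item.category, "->", item.content)
```

The same stored entry yields different "dictionaries" depending on the two filter parameters, which is the point of keeping data categories clearly marked up.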
8. User needs: general aspects
Parameters relevant for data selection
• Lexicographic functions
• Text production ←→ text reception
• Elements of cognitive needs involved in a communicative situation:
learning while producing text – training for text production
• Properties of the targeted linguistic phenomena
• Lexicographic data categories needed for a given function:
words — word combinations — linguistic properties — ...
• Interaction of lexical objects with “grammar”
• Pre-existing knowledge in users
• Lexical items of the targeted language
• Linguistic properties of the targeted lexical items
• Grammatical knowledge of the targeted language
9. User needs in text production
Linguistic aspects
• Need to know a lexical object
• Access:
• From a “concept”
• From a source language item
• Choice among alternatives, based on properties of each
• Need to insert the lexical object into an upcoming context:
construction — sentence — discourse — text (type) ...
• Access to linguistic properties of lexical objects,
on different levels of linguistic description
• Some properties may act as constraints and rule out certain options
10. User needs in text production
Levels of interactivity – interaction models Prinsloo, Bothma and Heid 2015
• Mainly interactive tools:
with different amounts of user interaction required
• Step-wise build-up of a construction or a sentence
• Guidance through options of lexical or grammatical choice
• Guidance with cognitively oriented elements:
lexical or grammatical explanations
• Mainly automatic tools:
User input triggers automatic processing
• Checking tools: Verlinde 2014 and ILT online
grammar checkers — style checkers — collocation checkers ...
• (Automatic) translation functions
11. Phenomenon-related needs: collocations as a case in point
An example of criteria for the selection of lexicographic data categories
• Notion of collocation underlying:
In the tradition of pedagogical lexicography Hausmann 2006, Mel’čuk
• Lexically and/or pragmatically constrained,
language-specific: Bartsch 2004
FR prendre une douche ←→ IT fare la doccia
• Base plus collocate: {douche | doccia} ⊕ verb
• Syntactic relationship between base and collocate
• Lexicographic data needed: Gouws 2015
• Knowledge of the collocation:
preferred lexical combination
• Knowledge about the collocation:
properties relevant for its insertion into context
12. Phenomenon-related needs: collocations as a case in point
Types of knowledge about collocations relevant for text production – Examples
• Morphosyntax: e.g.
• Number preferences:
DE den Rechtsweg (sing.) einschlagen ([to] take legal action)
←→ IT adire le vie (plural) legali
• Determination: IT fare la doccia, ([to] take a shower)
DE sein Veto einlegen ([to] veto)
• Syntactic valency: e.g.
[to] be in a position (+ to +INF)
DE in der Lage sein (+ zu + INF)
• Collocational preferences: e.g.
DE {scharfe|heftige|massiv(e)...} Kritik üben ([to] criticize severely)
• Pragmatic preferences: e.g. by text type:
FR medical experts: X accroît le risque de Y (X increases the risk of Y)
FR medical lay persons: X augmente le risque de Y Wandji Tchami et al. 2015
13. Phenomenon-related needs: collocations as a case in point
Access to data on collocations
Different scenarios
• Text production: onomasiological access cf. Giacomini 2013
known                              searched for
base lemma + reading
meaning of word combination        typical collocation (lexical rendition)
maybe: syntactic environment       fit into text/sentence to be built
• Text reception: semasiological, form-based access
known                              searched for
(element of) word (combination)    meaning in context
                                   plus pragmatic properties
20. Phenomenon-related needs: collocations as a case in point
An example: different kinds of access Data from OCDSE
Production
Reading 1: forward movement [military]
• ADJ + advance
- [speed] rapid ∼
- [agent] German ∼, Allied ∼, etc.
• V + advance
- [make] make an ∼ on X
The regiment made an advance on the
enemy lines.
Reading 2: development (often in the plural)
• ADJ + advance
- [amount] considerable ∼; big ∼,
substantial ∼;
dramatic ∼, enormous ∼, great ∼,
spectacular ∼, tremendous ∼.
• V + advance
- [make] make ∼es (in/on) [plural!]
Reading 3: amount of money
• ADJ + advance
- [quantity] small ∼, large ∼
- [type] cash ∼
• V + advance
- [provide] give so. an ∼, pay so. an ∼
The university pays me an advance for this
business trip.
Reception
• Readings
(1) [military] forward movement
(2) development
(3) amount of money
• Typical adjectives
- Allied etc. (cf. German etc.) (1)
- big (=considerable) (2)
- cash (3)
- considerable (=big) (2)
- dramatic (2)
- German (cf. Allied, etc.) (1)
- great (2)
- important (1)
- large (3)
- notable (2)
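The two kinds of access shown on this slide can be read as two views over the same stored records. The following Python sketch illustrates this under stated assumptions: the record layout and the functions `production_view` and `reception_view` are hypothetical, and the records are a small subset of the adjective data listed above.

```python
from collections import defaultdict

# (reading number, pattern, semantic label, collocate) for the base "advance";
# a subset of the adjective data on the slide.
RECORDS = [
    (1, "ADJ + advance", "speed", "rapid"),
    (1, "ADJ + advance", "agent", "German"),
    (2, "ADJ + advance", "amount", "considerable"),
    (2, "ADJ + advance", "amount", "big"),
    (3, "ADJ + advance", "quantity", "large"),
    (3, "ADJ + advance", "type", "cash"),
]


def production_view(records, reading):
    """Onomasiological view: start from a reading, group collocates by pattern and label."""
    view = defaultdict(list)
    for rec_reading, pattern, label, collocate in records:
        if rec_reading == reading:
            view[(pattern, label)].append(collocate)
    return dict(view)


def reception_view(records):
    """Semasiological view: alphabetical collocates, each pointing to its reading."""
    pairs = [(collocate, reading) for reading, _, _, collocate in records]
    return sorted(pairs, key=lambda pair: pair[0].lower())


print(production_view(RECORDS, 2))
print(reception_view(RECORDS))
```

Storing one record per collocate and deriving both layouts from it avoids maintaining two separate dictionaries for production and reception.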
21. Phenomenon-related needs: collocations as a case in point
Access to collocational data for text production
Proposal for onomasiological access — example Giacomini 2011: 263
• Search:
Base            syntactic filter    semantic filter
paura (fear)    ⊕ PP (di)           ⊕ cause (= natural phenomenon)
• Result:
paura [...]
colloc:
paura ⊕ PP (di)
– causa:
elementi e fenomeni naturali:
paura del terremoto; paura del fuoco; ...
• Option for a comparison with collocations of quasi-synonyms:
paura del fuoco ↔ panico per il fuoco; *spavento, *ansia
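A search of this kind (base, then syntactic filter, then semantic filters) could be sketched in Python as follows. This is not Giacomini's implementation: the `Collocation` record, its field names and the `search` function are assumptions made for illustration, and the data are the Italian examples from this slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Collocation:
    base: str
    pattern: str    # syntactic relation between base and collocate
    role: str       # semantic role of the collocate
    sem_class: str  # semantic class of the collocate
    form: str


NATURAL = "elementi e fenomeni naturali"

DATA = [
    Collocation("paura", "PP (di)", "causa", NATURAL, "paura del terremoto"),
    Collocation("paura", "PP (di)", "causa", NATURAL, "paura del fuoco"),
    Collocation("panico", "PP (per)", "causa", NATURAL, "panico per il fuoco"),
]


def search(data, base, pattern=None, role=None, sem_class=None):
    """Return collocation forms for a base, narrowed by whichever filters are set."""
    hits = []
    for colloc in data:
        if colloc.base != base:
            continue
        if pattern is not None and colloc.pattern != pattern:
            continue
        if role is not None and colloc.role != role:
            continue
        if sem_class is not None and colloc.sem_class != sem_class:
            continue
        hits.append(colloc.form)
    return hits


print(search(DATA, "paura", pattern="PP (di)", role="causa", sem_class=NATURAL))
# Comparison with a quasi-synonym: same semantic filters, different base.
print(search(DATA, "panico", role="causa", sem_class=NATURAL))
```

Leaving a filter unset widens the result, so the user can start from the base alone and narrow down step by step.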
22. Phenomenon-related needs: collocations as a case in point
A wireframe prototype for a collocation dictionary (1/3)
Step 1: Enter base lemma possibly with reading, if it is polysemous
23. Phenomenon-related needs: collocations as a case in point
A wireframe prototype for a collocation dictionary (2/3)
Step 2: Semantic selection
24. Phenomenon-related needs: collocations as a case in point
A wireframe prototype for a collocation dictionary (3/3)
Step 3: Syntactic selection
25. Needs due to different levels of pre-existing knowledge
Lexical selection as a complex decision task — Bantu languages
Copulatives in Northern Sotho:
how to translate [to] be (1/3) Bothma et al. 2013
• Linguistic parameters of the lexico-grammatical selection task:
• Lexical semantics: *3
Identifying         Descriptive              Associative
this is a letter    this woman is clever     he is (together) with Sara
ke lengwalo         mosadi yo o bohlale      o na le Sara
• Aktionsart-like: stative ←→ inchoative *2
• Mood: indicative ←→ situative ←→ relative *3
• Person or noun class *(14+4)
• Positive ←→ negative *2
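The factors marked with * on this slide indicate how large the selection task is. As a rough worked example, and assuming (this is an assumption, not a claim from the slide) that all parameter values combine freely, the number of feature combinations is the product of the factors:

```python
from math import prod

# Factors as given on the slide; the product is an upper bound that assumes
# every value of one parameter can combine with every value of the others.
FACTORS = {
    "lexical semantics (identifying, descriptive, associative)": 3,
    "stative vs. inchoative": 2,
    "mood (indicative, situative, relative)": 3,
    "person or noun class": 14 + 4,
    "positive vs. negative": 2,
}

total = prod(FACTORS.values())
print(total)  # 3 * 2 * 3 * 18 * 2 = 648
```

A flat list of several hundred cells is hard to consult, which motivates breaking the choice down into a sequence of small decisions.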
26. Needs due to different levels of pre-existing knowledge
Copulatives in Northern Sotho (2/3)
Model for stepwise guidance:
Lexical selection as a decision tree
[Figure: decision tree with choice points A to G; at each point a question selects the next one (B or C, D or E, F or G)]
• Choice points: A, B, C, ...
• Provides only relevant choices,
depending on prior selection(s)
• Presence of cognitively relevant data at each choice point:
Grammatical hints about the choice at hand — examples
→ A combination of dictionary and grammar,
with on-demand support for text production
• Systematic path to the solution
• Decision-relevant information provided:
• Options at each choice point (minimal amount of data)
• Grammatical hints and examples only if needed by the user
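The stepwise guidance model can be sketched as a small tree walk in Python. This is a minimal illustration, not the tool described by Bothma et al. 2013: the class `ChoicePoint`, the function `walk`, the hint texts and the result strings are hypothetical placeholders, and only the first two choice points are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class ChoicePoint:
    question: str
    hint: str  # grammatical hint, shown only on demand
    options: dict = field(default_factory=dict)  # answer -> ChoicePoint or result string


def walk(node, answers, want_hints=False):
    """Follow the user's answers through the tree; return (result, transcript)."""
    transcript = []
    for answer in answers:
        # Only the options relevant at this choice point are offered.
        transcript.append(f"{node.question} [{' | '.join(node.options)}]")
        if want_hints:
            transcript.append(f"hint: {node.hint}")
        nxt = node.options[answer]
        if isinstance(nxt, str):
            return nxt, transcript
        node = nxt
    raise ValueError("more answers are needed to reach a result")


# Placeholder tree: results are feature descriptions, not Northern Sotho forms.
READING = ChoicePoint(
    "Which reading?",
    "(placeholder hint on the three readings)",
    {
        "identifying": "stative identifying: continue with mood",
        "descriptive": "stative descriptive: continue with mood",
        "associative": "stative associative: continue with mood",
    },
)
TREE = ChoicePoint(
    "Stative or inchoative?",
    "(placeholder hint on the distinction)",
    {"stative": READING, "inchoative": "inchoative: not modelled in this sketch"},
)

result, transcript = walk(TREE, ["stative", "descriptive"], want_hints=True)
print(result)
print("\n".join(transcript))
```

With `want_hints=False` the transcript contains only the options at each choice point, which corresponds to the minimal amount of data; the hints are added only when the user asks for them.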
27. Needs due to different levels of pre-existing knowledge
Copulatives in Northern Sotho – sample steps (3a/3)
• Selecting stative vs. inchoative copulative
28. Needs due to different levels of pre-existing knowledge
Copulatives in Northern Sotho – sample steps (3b/3)
• Selecting one of the readings of the copulative:
identifying – descriptive – associative
29. Needs due to different levels of pre-existing knowledge
Copulatives in Northern Sotho – sample steps (3c/3)
• Stative descriptive copulative selected,
selection among moods: indicative – situative – relative
30. Needs due to different levels of pre-existing knowledge
Copulatives in Northern Sotho – sample steps (3d/3)
• Almost all features selected –
remains noun class
31. Needs due to different levels of pre-existing knowledge
Copulatives in Northern Sotho – sample steps (3e/3)
• For noun class:
select positive vs. negated
32. Needs due to different levels of pre-existing knowledge
Combining data for communicative and cognitive needs
Learner-oriented tools for text production: Bosch/Faaß 2014
e-Zulu (and e-Xhosa) dictionary and grammar trainer Sanasi 2015
• Focus on the Zulu possessive construction:
• Lexical choice of nominals for possessor and possession
• Noun classes of possessor and possession
• Noun-class-dependent connector (expressing the possessive relation)
• Morphophonological adaptation rules
• Stepwise guidance on demand:
• Nominal lexemes can be input in Zulu or English
• Data about the noun class and the connector:
input by user or provided by system
• etc.
• Reminder of rules on demand
→ From stepwise guidance to full translation
33. Needs due to different levels of pre-existing knowledge
e-Zulu dictionary (1/2) Bosch/Faaß 2014
• Input in English: rooms of hotel
• Choice options:
• Translation only
• Stepwise explanation of Zulu rules applied
34. Needs due to different levels of pre-existing knowledge
e-Zulu dictionary (2/2) Bosch/Faaß 2014
35. Needs due to different levels of pre-existing knowledge
Combining data for communicative and cognitive needs
Learner-oriented tools for text production: Prinsloo et al. 2014, 2015
Sepedi (= Northern Sotho) sentence builder for speakers of English
• Phenomena:
• Lexical selection: nominals, verbs
• Noun class system of Sepedi — concords and pronouns
• Grammatical rules for valency constructions, relative clauses, etc.
• Same principles as with Zulu possessives:
• At each step in text production, the user may decide
whether and how much help to get from the tool (individualization: Tarp 2011)
• User input may be in either English or Sepedi,
with option open at each step of the sentence construction
• Integratable with a large English → Sepedi dictionary
• Grammatical information on demand
36. Models for user interaction
Information on demand Bothma 2011
• Basic amount of data is available by default
• Additional data may be accessed via unfoldable items:
• Grammatical explanations in decision trees Bothma et al. 2013
• “Info” button in Sepedi sentence builder Prinsloo et al. 2015
• Option to see explanations in learning tools Sanasi 2015
⇒ Open questions:
• Deciding beforehand about the amount of data required (profile-based dictionaries),
or deciding at each step in the text production process?
• How much use do users make of the extra data on offer? Trap-Jensen 2010
37. Models for user interaction
Linguistic complexity ←→ interactional simplicity
• Dilemma:
• Complex linguistic decision processes
may require complex descriptions (Bantu languages, collocation selection)
• But:
Many users want simple tools, easy to use:
• Few clicks
• Short explanations
• Little effort before getting to the result Heid/Zimmermann 2012
• Proposal:
• Providing guidance tools only on demand,
in addition to “standard” dictionary entries
• Maybe adding non-linear guidance devices, especially for learners:
• Graphical elements Runte 2015
• Interactive elements, for learners to explore linguistic phenomena
38. Models for user interaction
Graphical display of lexical relations Runte 2015
• Display of relationships between lexemes:
• Paradigmatic:
• Synonyms, Antonyms
• Hyp(er)onyms
• Syntagmatic:
• Typical adjectives
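• Typical verbs, ...
[Figure: graph of lexical relations around Arbeitnehmer; related nouns: Angestellter, Arbeiter, Erwerbstätiger, Arbeitskraft; typical adjectives (<Qualifikation>): qualifiziert, hochqualifiziert; typical verbs: einstellen, beschäftigen, kündigen, arbeiten]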
• Analyzed
in eye-tracking studies:
presentation
works well for learners
39. Conclusion
Lessons learnt from overview of recent work
• Parameters relevant for
the design of dictionaries in writing support tools:
• Properties of targeted lexical objects:
Addressing linguistic complexity
• Pre-existing knowledge of users:
On lexical objects and their insertion into text
• Flexibility wrt interaction models:
Combining automatic and interactive use
• Current approaches (mainly from SeLA)
• Constraint-based selection in collocation dictionaries
• Stepwise guidance in decision trees
• Learners’ bilingual dictionaries with explanations
• Stepwise sentence builder:
Flexible amounts of support
• Graphical presentation
40. Future work
• User testing of prototypes,
to understand which approaches work best
• From mock-ups and prototypes
to tools with sizeable lexical resources:
• e-Zulu: several hundreds of items
• Sepedi sentence builder: work towards large grammatical coverage