This presentation was given at the 28th Annual Conference of the Cognitive Science Society, 2006.
It presents a new model of early-stage language acquisition, from the emergence of first words to syntactic rules.
Staged Computational Modelling of Linguistic Development
1. From Syllables to Syntax:
Investigating Staged Linguistic Development through
Computational Modelling
Kris Jack, Chris Reed, and Annalu Waller
[kjack|chris|awaller]@computing.dundee.ac.uk
Applied Computing, University of Dundee,
Dundee, DD1 4HN, Scotland
2. Staged Language Acquisition
• Language acquisition is consistently described in stages
• Lexical and syntactic acquisition strategies must operate
within a unified model
• The Model
– Training Data
– Initial Assumptions
– Lexical Acquisition
– Syntactic Acquisition
– Comprehension
• Results
[Timeline figure: Pre-linguistic Stage, Holophrastic Stage, Early Multi-word Stage, Late Multi-word Stage, Abstract Stage; axis from 0 to 42 months in 6-month steps]
4. Lexical Acquisition
• Siskind
• Steels
• Regier
Siskind (1996)
• Cross-situational analysis
– Relationship between the appearance
of words and their referents
5. Lexical Acquisition
• Siskind
• Steels
• Regier
Steels (2001)
• Language games
– Social pressure to communicate
within a community of agents can
lead to an emergent and shared
vocabulary
7. Syntactic Acquisition
• Roy
• Elman
• Kirby
Roy (2002)
• Trained a grounded robot to play a
‘show-and-tell’ game
– Training data were divided into
simple and complex descriptions
[Diagram: three Data sets feed one Model in successive time windows, labelled 0>t>x, x>t>y, and y>t>z]
8. Syntactic Acquisition
• Roy
• Elman
• Kirby
Elman (1993)
• Incremental Learning
– Mechanistic changes can lead to
changes in behaviour
[Diagram: Data feeds a Model that contains three Modules, labelled t>0, t>x, and t>y]
9. Syntactic Acquisition
• Roy
• Elman
• Kirby
Kirby (2002)
• Iterated Learning
– Languages with increasing
complexity can emerge across
generations of agents
[Diagram: three Models linked by two Data sets]
10. Question
Can we develop a unified model that performs
staged language acquisition where:
1. The learning mechanisms are constant AND
2. Exposure to training data is constant?
11. Bridging the Gap
between Words and Syntax
• Jack, Reed, and Waller (2004)
– Shift from holophrastic to syntactic language
– The shift was unrealistic as it appeared very early
• A form of substitution was employed (similar to Harris (1966); Wolff
(1988); Kirby (2002); van Zaanen (2002))
• If the model encountered A B and A C then B and C were considered
substitutable for one another
– Given the two rules:
» S/eats(john, cake) → johneatscake
» S/eats(mary, cake) → maryeatscake
– Three rules were derived:
» S/eats(x, cake) → N/x eatscake
» N/john → john
» N/mary → mary
• This is a reasonable, yet powerful, form of syntactic learning
– The target language was unrealistically simple (two-word sentences)
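The substitution step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the rule representation `(predicate, args, string)`, the function name `generalise`, and the character-level alignment of the two strings are all assumptions made for the example.

```python
from os.path import commonprefix  # character-wise common prefix of strings


def generalise(rule_a, rule_b):
    """Derive one general rule and two noun rules from two holophrastic rules.

    Each rule is a hypothetical triple (predicate, args, string).
    Returns None when the rules do not differ in exactly one argument.
    """
    pred_a, args_a, str_a = rule_a
    pred_b, args_b, str_b = rule_b
    if pred_a != pred_b or len(args_a) != len(args_b):
        return None
    differing = [i for i in range(len(args_a)) if args_a[i] != args_b[i]]
    if len(differing) != 1:
        return None
    slot = differing[0]

    # Split both strings into shared prefix, differing middle, shared suffix.
    prefix = commonprefix([str_a, str_b])
    rest_a, rest_b = str_a[len(prefix):], str_b[len(prefix):]
    suffix = commonprefix([rest_a[::-1], rest_b[::-1]])[::-1]
    mid_a = rest_a[:len(rest_a) - len(suffix)]
    mid_b = rest_b[:len(rest_b) - len(suffix)]
    if not mid_a or not mid_b:
        return None

    new_args = tuple("x" if i == slot else a for i, a in enumerate(args_a))
    body = [part for part in (prefix, "N/x", suffix) if part]
    return [
        (f"S/{pred_a}({', '.join(new_args)})", body),
        (f"N/{args_a[slot]}", [mid_a]),
        (f"N/{args_b[slot]}", [mid_b]),
    ]


rules = generalise(
    ("eats", ("john", "cake"), "johneatscake"),
    ("eats", ("mary", "cake"), "maryeatscake"),
)
for head, body in rules:
    print(head, "→", " ".join(body))
```

Applied to the two rules on the slide, the sketch yields the same three derived rules: one sentence rule with a variable slot and one noun rule per substituted string.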
12. Training Data
• Played the Scene Building Game
– Based on the Miniature Language Acquisition
Problem (Feldman et al., 1990)
– Aim: describe a visual event so that someone else
can recreate the event based on the description
[Figure: a scene shown at four time steps, t=1 → t=2 → t=3 → t=4]
13. Training Data
• Played the Scene Building Game
– Based on the Miniature Language Acquisition
Problem (Feldman et al., 1990)
– Aim: describe a visual event so that someone else
can recreate the event based on the description
“a red square has appeared”
[Figure: a scene shown at four time steps, t=1 → t=2 → t=3 → t=4]
14. Training Data
• Played the Scene Building Game
– Based on the Miniature Language Acquisition
Problem (Feldman et al., 1990)
– Aim: describe a visual event so that someone else
can recreate the event based on the description
“a pink cross to the upper right of the red circle”
[Figure: a scene shown at four time steps, t=1 → t=2 → t=3 → t=4]
15. Training Data
• Played the Scene Building Game
– Based on the Miniature Language Acquisition
Problem (Feldman et al., 1990)
– Aim: describe a visual event so that someone else
can recreate the event based on the description
“a blue cross on the other side of the red circle”
[Figure: a scene shown at four time steps, t=1 → t=2 → t=3 → t=4]
16. Training Data
• Played the Scene Building Game
– Based on the Miniature Language Acquisition
Problem (Feldman et al., 1990)
– Aim: describe a visual event so that someone else
can recreate the event based on the description
“another red circle under the pink cross”
[Figure: a scene shown at four time steps, t=1 → t=2 → t=3 → t=4]
17. Training Data
• The task was surprisingly complex
– Linguistically
– Conceptually
• An artificial language was constructed based on a simplified problem
– Describes the appearance of the second object in a scene
– Retained the determiner distinction
– Can create sentences such as “a red square a bove the green cir cle” and “a blue
tri ang gle to the low er left of the pink star”
S = NP1 REL NP2
NP1 = a NP
NP2 = the NP
NP = COLOUR SHAPE
REL = REL1 | REL2
REL1 = a bove | be low | to the REL4
REL2 = REL3 REL4
REL3 = to the low er | to the u pper
REL4 = left of | right of
COLOUR = black | blue | grey | green | pink | black | red | white
SHAPE = cir cle | cross | dia mond | heart | rec tang gle | star | square | tri ang gle
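To make the grammar concrete, here is a minimal sketch of a random sentence generator for it. The dictionary layout and the function name `generate` are assumptions for illustration; the rules themselves are copied from the slide, with syllables as terminals and the duplicated colour "black" listed once.

```python
import random

# Grammar from the slide; every terminal is a single syllable.
GRAMMAR = {
    "S": [["NP1", "REL", "NP2"]],
    "NP1": [["a", "NP"]],
    "NP2": [["the", "NP"]],
    "NP": [["COLOUR", "SHAPE"]],
    "REL": [["REL1"], ["REL2"]],
    "REL1": [["a", "bove"], ["be", "low"], ["to", "the", "REL4"]],
    "REL2": [["REL3", "REL4"]],
    "REL3": [["to", "the", "low", "er"], ["to", "the", "u", "pper"]],
    "REL4": [["left", "of"], ["right", "of"]],
    "COLOUR": [[c] for c in
               ("black", "blue", "grey", "green", "pink", "red", "white")],
    "SHAPE": [s.split() for s in
              ("cir cle", "cross", "dia mond", "heart", "rec tang gle",
               "star", "square", "tri ang gle")],
}


def generate(symbol="S", rng=random):
    """Expand a symbol into a list of syllables by random rule choice."""
    if symbol not in GRAMMAR:  # terminal syllable
        return [symbol]
    syllables = []
    for part in rng.choice(GRAMMAR[symbol]):
        syllables.extend(generate(part, rng))
    return syllables


rng = random.Random(0)  # seeded so that runs are repeatable
for _ in range(3):
    print(" ".join(generate("S", rng)))
```

Each generated sentence starts with "a" (NP1) and contains "the" (NP2), which is how the grammar retains the determiner distinction.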
18. Initial Assumptions
• Joint attention is established at around one year of age
(Tomasello, 1995)
• The model receives <event, description> pairs
– An event is a set of six feature tuples
– A description is a string
{<red, (1)>, <circle, (1)>, <pink, (2)>, <cross, (2)>, <above, (0)>, <right, (0)>}
[Figure: the scene at t=1 and at t=2]
“a pink cross to the u pper right of the red cir cle”
19. Initial Assumptions
• Sensitivity to data
– Children can identify objects through displacement during
motion (Kellman et al., 1987).
– Children can use shape and colour to differentiate between
objects (e.g. Landau et al., 1988)
20. Initial Assumptions
• Sensitivity to data
– Children show sensitivity to the relative spatial
relationships between objects, making distinctions between
left and right, and above and below (Quinn, 2003)
21. Initial Assumptions
• Sensitivity to data
– Children can perform analogies (Gentner and Medina,
1998)
22. Initial Assumptions
• Sensitivity to data
– Children can determine transitional probabilities between
syllables (Saffran, Aslin, and Newport, 1996)
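For reference, the transitional probability of a syllable pair XY is the frequency of XY divided by the frequency of X. The sketch below computes it over syllable-separated descriptions; the function name and the use of whitespace as the syllable boundary are assumptions, and the slides do not state how the model itself uses these statistics.

```python
from collections import Counter


def transitional_probabilities(utterances):
    """Return P(next | current) for every adjacent syllable pair."""
    pair_counts = Counter()
    first_counts = Counter()
    for utterance in utterances:
        syllables = utterance.split()
        for current, following in zip(syllables, syllables[1:]):
            pair_counts[(current, following)] += 1
            first_counts[current] += 1
    return {pair: n / first_counts[pair[0]]
            for pair, n in pair_counts.items()}


tp = transitional_probabilities([
    "a pink cross to the u pper right of the red cir cle",
    "a red dia mond to the right of the green cir cle",
])
print(tp[("cir", "cle")], tp[("a", "pink")], tp[("the", "red")])
```

Within-word transitions such as "cir" → "cle" score higher than transitions across word boundaries such as "the" → "red", which is the cue the cited study describes.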
23. The Model
• Training the model
– The Lexical Analysis Unit
• Discovers string-meaning associations
– The Syntactic Analysis Unit
• Discovers compositional relationships
24. The Lexical Analysis Unit
<event, description> pairs are compared through
a form of cross-situational analysis
<event, description>#1
{<red, (1)>, <circle, (1)>, <pink, (2)>, <cross, (2)>, <above, (0)>, <right, (0)>}
“a pink cross to the u pper right of the red cir cle”
<event, description>#2
{<green, (1)>, <circle, (1)>, <red, (2)>, <diamond, (2)>, <even_vertical, (0)>, <right, (0)>}
“a red dia mond to the right of the green cir cle”
25. The Lexical Analysis Unit
Feature tuple comparisons are value sensitive and object
identifier insensitive. Two feature tuples, <v1, (o1)>
and <v2, (o2)>, are equivalent iff v1 = v2
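The equivalence test stated here is small enough to write out directly. The `FeatureTuple` type and the function name `equivalent` are hypothetical names chosen for this sketch.

```python
from typing import NamedTuple, Tuple


class FeatureTuple(NamedTuple):
    value: str                 # e.g. "red", "circle", "above"
    objects: Tuple[int, ...]   # object identifiers, e.g. (1,) or (0,)


def equivalent(a: FeatureTuple, b: FeatureTuple) -> bool:
    """Value sensitive, object identifier insensitive comparison."""
    return a.value == b.value


print(equivalent(FeatureTuple("red", (1,)), FeatureTuple("red", (2,))))
print(equivalent(FeatureTuple("red", (1,)), FeatureTuple("pink", (1,))))
```

A separate function is used because the default `==` on a named tuple would also compare the object identifiers.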
26. The Lexical Analysis Unit
Co-occurring syllable sequences are found
27. The Lexical Analysis Unit
New <feature tuple set, description> pairs are
derived
<event, description>#1
{<red, (1)>, <circle, (1)>, <pink, (2)>, <cross, (2)>, <above, (0)>, <right, (0)>}
“a pink cross to the u pper right of the red cir cle”
<event, description>#2
{<green, (1)>, <circle, (1)>, <red, (2)>, <diamond, (2)>, <even_vertical, (0)>, <right, (0)>}
“a red dia mond to the right of the green cir cle”
Derived from <event, description>#1:
<{<red, (1)>, <circle, (1)>, <right, (0)>}, “a”>
<{<red, (1)>, <circle, (1)>, <right, (0)>}, “to the”>
<{<red, (1)>, <circle, (1)>, <right, (0)>}, “right of the”>
<{<red, (1)>, <circle, (1)>, <right, (0)>}, “red”>
<{<red, (1)>, <circle, (1)>, <right, (0)>}, “cir cle”>
Derived from <event, description>#2:
<{<circle, (1)>, <red, (2)>, <right, (0)>}, “a”>
<{<circle, (1)>, <red, (2)>, <right, (0)>}, “red”>
<{<circle, (1)>, <red, (2)>, <right, (0)>}, “to the”>
<{<circle, (1)>, <red, (2)>, <right, (0)>}, “right of the”>
<{<circle, (1)>, <red, (2)>, <right, (0)>}, “cir cle”>
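The comparison of the two example pairs can be reproduced with a short sketch. The function names are hypothetical, events are modelled as sets of `(value, ids)` tuples, and the sketch assumes that only maximal shared syllable sequences are kept (so "the" alone is dropped because it sits inside "to the"); that assumption is what yields the five strings listed on the slide.

```python
def shared_features(event_a, event_b):
    """Feature tuples of event_a whose value also occurs in event_b."""
    values_b = {value for value, _ in event_b}
    return {(value, ids) for value, ids in event_a if value in values_b}


def shared_sequences(desc_a, desc_b):
    """Maximal contiguous syllable sequences found in both descriptions."""
    a, b = desc_a.split(), desc_b.split()
    found = set()
    for i in range(len(a)):
        for j in range(len(b)):
            if a[i] != b[j]:
                continue
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue  # this alignment started earlier
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            found.add(tuple(a[i:i + k]))

    def inside(short, long):
        return any(long[s:s + len(short)] == short
                   for s in range(len(long) - len(short) + 1))

    # Assumption: drop sequences contained in a longer shared sequence.
    return {" ".join(seq) for seq in found
            if not any(seq != other and inside(seq, other) for other in found)}


event_1 = {("red", (1,)), ("circle", (1,)), ("pink", (2,)),
           ("cross", (2,)), ("above", (0,)), ("right", (0,))}
event_2 = {("green", (1,)), ("circle", (1,)), ("red", (2,)),
           ("diamond", (2,)), ("even_vertical", (0,)), ("right", (0,))}
desc_1 = "a pink cross to the u pper right of the red cir cle"
desc_2 = "a red dia mond to the right of the green cir cle"

strings = shared_sequences(desc_1, desc_2)
features_1 = shared_features(event_1, event_2)
features_2 = shared_features(event_2, event_1)
print(sorted(strings))
print(sorted(features_1), sorted(features_2))
```

Pairing each shared string with `features_1` gives the pairs derived from the first example, and pairing it with `features_2` gives those derived from the second.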
28. The Lexical Analysis Unit
• Cross-situational analysis can produce pairs that share
the same strings (homonyms) or the same feature sets
(synonyms)
• Homonyms and synonyms are removed, following a
principle of mutual exclusivity (Markman and
Wachtel, 1988)
• When pairs are equal, with insensitivity to object
identifiers, they are merged. Merging produces a new
pair, that expresses both of the relationships
<{<red, (1)>}, “red”> is merged with
<{<red, (2)>}, “red”> to produce
<{<red, (1, 2)>}, “red”>
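A minimal sketch of the merge step, assuming pairs are represented as `(feature_set, string)` with each feature a `(value, ids)` tuple; the function name `merge_pairs` is hypothetical. Pairs with the same string and the same feature values are collapsed, and their object identifiers are pooled.

```python
def merge_pairs(pairs):
    """Merge pairs that are equal when object identifiers are ignored."""
    merged = {}
    for features, string in pairs:
        key = (frozenset(value for value, _ in features), string)
        ids_by_value = merged.setdefault(key, {})
        for value, ids in features:
            ids_by_value.setdefault(value, set()).update(ids)
    return {
        (frozenset((value, tuple(sorted(ids)))
                   for value, ids in ids_by_value.items()), string)
        for (_, string), ids_by_value in merged.items()
    }


result = merge_pairs([
    (frozenset({("red", (1,))}), "red"),
    (frozenset({("red", (2,))}), "red"),
])
print(result)
```

For the example on the slide, the two "red" pairs collapse into a single pair whose feature carries both identifiers, (1, 2).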
29. The Lexical Analysis Unit
• From all merged pairs, homonyms are removed by
selecting the most probable feature set for each
string,
\[ P(F_i \mid S_j) = \frac{\text{Frequency of } (S_j \mid F_i)}{\text{Frequency of } S_j} \]
where Frequency of (Sj | Fi) is the number of times that Sj has been
observed with Fi and the Frequency of Sj is the number of times that Sj
has been observed
• Then synonyms are removed by selecting the most
probable string for each feature set, P(Sj | Fi), and
erasing the remaining pairs’ feature sets
• A set of lexical items is derived
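The two selection steps can be sketched as follows. This is an illustration under assumptions: observations are a flat list of `(feature_set, string)` co-occurrences, P(Sj | Fi) is computed symmetrically as the pair frequency divided by the frequency of Fi (the slide defines only P(Fi | Sj) explicitly), ties go to the first candidate seen, and the observation counts are invented for the example.

```python
from collections import Counter


def select_lexical_items(observations):
    """Return {string: feature_set} after homonym and synonym removal."""
    pair_freq = Counter(observations)
    string_freq = Counter(string for _, string in observations)
    feature_freq = Counter(features for features, _ in observations)

    # Homonyms: keep the most probable feature set for each string.
    best_features = {}
    for (features, string), n in pair_freq.items():
        p = n / string_freq[string]  # P(Fi | Sj)
        if string not in best_features or p > best_features[string][1]:
            best_features[string] = (features, p)

    # Synonyms: keep the most probable string for each feature set.
    best_string = {}
    for string, (features, _) in best_features.items():
        p = pair_freq[(features, string)] / feature_freq[features]  # P(Sj | Fi)
        if features not in best_string or p > best_string[features][1]:
            best_string[features] = (string, p)

    # Losing strings stay in the lexicon with an erased (empty) feature set.
    return {
        string: features if best_string[features][0] == string else frozenset()
        for string, (features, _) in best_features.items()
    }


RED, CIRCLE = frozenset({"red"}), frozenset({"circle"})
lexicon = select_lexical_items(
    [(RED, "red")] * 3 + [(CIRCLE, "red")]
    + [(RED, "a")] * 2 + [(CIRCLE, "a")]
    + [(CIRCLE, "cir cle")] * 2
)
print(lexicon)
```

In this made-up data "a" loses the competition for the feature set {red} and ends up with an empty feature set, the same shape as the item <{}, “a”> that appears in the later rule examples.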
30. The Syntactic Analysis Unit
• Compositional relationships are found by
combining and comparing lexical items
• Lexical items are combined by set union and
string concatenation
<f1, s1> combined with <f2, s2> = <f1 ∪ f2, s1 + s2>
• The lexical item triple <<f1, s1>, <f2, s2>, <f3, s3>>
expresses a compositional relationship iff
<f1, s1> = <f2, s2> combined with <f3, s3>
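Combination and the compositional test can be sketched as below. Two assumptions are made: strings are concatenated with a single space between syllables, and feature sets are compared by value only, ignoring object identifiers, so that identifier changes can be handled separately by transformations. The function names are hypothetical.

```python
def combine(item_a, item_b):
    """Set union of the feature sets, concatenation of the strings."""
    (features_a, string_a), (features_b, string_b) = item_a, item_b
    return (features_a | features_b, f"{string_a} {string_b}")


def values(features):
    return {value for value, _ in features}


def is_compositional(whole, left, right):
    """True iff whole equals left combined with right (identifiers ignored)."""
    features, string = combine(left, right)
    return whole[1] == string and values(whole[0]) == values(features)


red = (frozenset({("red", (1, 2))}), "red")
square = (frozenset({("square", (1, 2))}), "square")
red_square = (frozenset({("red", (1, 2)), ("square", (1, 2))}), "red square")

print(is_compositional(red_square, red, square))   # order matters
print(is_compositional(red_square, square, red))
```

Searching a lexicon for compositional relationships then amounts to testing every ordered pair of items against every candidate whole.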
31. The Syntactic Analysis Unit
A lexical item triple can be made to express a
rule by:
1. Converting lexical items into phrasal categories
2. Constructing transformations
32. The Syntactic Analysis Unit
<<{<red, (1, 2)>, <square, (1, 2)>}, “red square”>,
<{<red, (1, 2)>}, “red”>, <{<square, (1, 2)>}, “square”>>
33. The Syntactic Analysis Unit
<{<red, (1, 2)>, <square, (1, 2)>}, “red square”>
<{<red, (1, 2)>}, “red”> <{<square, (1, 2)>}, “square”>
34. The Syntactic Analysis Unit
<{<red, (1, 2)>, <square, (1, 2)>}, “red square”>
(<(1, 2) → (1, 2)>) (<(1, 2) → (1, 2)>)
<{<red, (1, 2)>}, “red”> <{<square, (1, 2)>}, “square”>
35. The Syntactic Analysis Unit
Rules that modify object identifiers can be
constructed
<{<red, (2)>}, “a red”>
() (<(1, 2) → (2)>)
<{}, “a”> <{<red, (1, 2)>}, “red”>
<{<blue, (1)>}, “the blue”>
() (<(1, 2) → (1)>)
<{}, “the”> <{<blue, (1, 2)>}, “blue”>
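One way to read the transformations in these examples is as a mapping from a child's object identifiers to the identifiers the same feature value carries in the parent. The sketch below follows that reading; the function names are hypothetical, and it assumes each feature value occurs at most once in the parent.

```python
def construct_transformations(parent, child):
    """Map each child identifier tuple to the parent's identifiers."""
    parent_ids = dict(parent[0])  # value -> identifiers in the parent
    return {(ids, parent_ids[value]) for value, ids in child[0]}


def apply_transformations(features, transformations):
    """Rewrite identifiers in a feature set; unmapped ones are kept."""
    mapping = dict(transformations)
    return frozenset((value, mapping.get(ids, ids)) for value, ids in features)


a_red = (frozenset({("red", (2,))}), "a red")
a = (frozenset(), "a")
red = (frozenset({("red", (1, 2))}), "red")
the_blue = (frozenset({("blue", (1,))}), "the blue")
blue = (frozenset({("blue", (1, 2))}), "blue")

print(construct_transformations(a_red, a))       # empty, written () on the slide
print(construct_transformations(a_red, red))     # (1, 2) → (2)
print(construct_transformations(the_blue, blue)) # (1, 2) → (1)
```

Applying the transformation built for "red" under "a red" back to the child's feature set restores the parent's feature set, which is a quick check that the mapping is consistent.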
39. Comprehension
• The model is tested for evidence of language
acquisition through comprehension tasks
• The model can comprehend a string by:
– Finding it in a phrasal category (lexical)
– Or creating it through applying a rule (syntactic)
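The two routes to comprehension can be sketched as a lookup followed by rule application. This is a simplified illustration: phrasal categories are modelled as dictionaries from strings to feature values, a rule is a pair of category names, object identifiers are omitted, and the category names `COLOUR` and `SHAPE` are hypothetical labels rather than names the model is known to use.

```python
def comprehend(string, categories, rules):
    """Return (features, route) for a string, or None if not understood."""
    # Lexical route: the string is already stored in a phrasal category.
    for category in categories.values():
        if string in category:
            return category[string], "lexical"

    # Syntactic route: build the string from two categories via a rule.
    syllables = string.split()
    for left_name, right_name in rules:
        left, right = categories[left_name], categories[right_name]
        for cut in range(1, len(syllables)):
            head = " ".join(syllables[:cut])
            tail = " ".join(syllables[cut:])
            if head in left and tail in right:
                return left[head] | right[tail], "syntactic"
    return None


categories = {
    "COLOUR": {"red": frozenset({"red"}), "blue": frozenset({"blue"})},
    "SHAPE": {"square": frozenset({"square"}),
              "cir cle": frozenset({"circle"})},
}
rules = [("COLOUR", "SHAPE")]

print(comprehend("red", categories, rules))
print(comprehend("blue cir cle", categories, rules))
print(comprehend("green star", categories, rules))
```

With two colours and two shapes stored lexically, one rule is enough to comprehend all four colour shape combinations, which is the kind of gain in expressivity the comprehension tests measure.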
54. Results
• Observing the developmental shift from lexical to syntactic comprehension
– Tested for comprehension of colours (10), shapes (10), and colour shape combinations
(100) during training. Results are averaged over 10 sessions.
[Figure: “Developmental Shift”. x-axis: No. <event, description>s entered (0 to 30); y-axis: No. strings comprehended (0 to 25); series: Lexical and Syntactic; regions labelled Pre-linguistic, Holophrastic, and Early Multi-word]
55. Results
• Comprehension of colours and shapes compared to colour shape
combinations
– Tested for lexical comprehension of colours (10), and shapes (10), and syntactic
comprehension of colour shape combinations (100) during training. Results are averaged
over 10 sessions.
[Figure: “Expressivity”. x-axis: No. <event, description>s entered (0 to 65); y-axis: % string set comprehended (0 to 100); series: Lexical and Syntactic]
56. Conclusions
The model demonstrates staged linguistic
acquisition
– No maturational triggers are employed
– Training data are kept constant
– Lexical items are required before compositions can
be derived
57. Conclusions
Can this work be extended into further stage
transitions?