Effectiveness of Gamesourcing Expert Painting Annotations
Are there features of images or subject types that can predict high or low agreement?
Start to play!
Can a simplified version of an expert annotation task be carried out by non-experts?
[Figure: number of annotations (bars) and percentage of correct annotations (dots) per user, for the baseline and imperfect conditions.]
[Figure: number of annotations (bars) and percentage of correct annotations (lines) by number of repetitions, for the baseline and imperfect conditions.]
Do users learn to correctly label subject types of paintings?
Can they apply what they have learned to new paintings of known subject types?
[Figure: baseline condition, aggregated annotations. Matrix of subject types assigned by non-experts versus experts (othe, figu, land, full, port, alle, half, genr, hist, kach, city, seas, stil, anim, town, flow, mari, maes); cells are shaded by percent (0 to 100).]
[Figure: imperfect condition, aggregated annotations. Matrix of subject types assigned by non-experts versus experts (othe, figu, land, full, port, alle, half, genr, hist, kach, city, seas, stil, anim, town, flow, mari, maes); cells are shaded by percent (0 to 100).]
[Figure: baseline condition, individual annotations. Matrix of subject types assigned by non-experts versus experts (othe, figu, land, full, port, alle, half, genr, hist, kach, city, seas, stil, anim, town, flow, mari, maes); cells are shaded by percent (0 to 75).]
[Figure: imperfect condition, individual annotations. Matrix of subject types assigned by non-experts versus experts (othe, figu, land, full, port, alle, half, genr, hist, kach, city, seas, stil, anim, town, flow, mari, maes); cells are shaded by percent (0 to 75).]
How do they compare with experts, both individually and as a crowd?
Top players:
1. Myriam C. Traub
2. Jacco van Ossenbruggen
3. Jiyin He
4. Lynda Hardman
Label paintings with subject types from the Art and Architecture Thesaurus!
Game over! Congratulations!
You found out that our results show a notable agreement between experts and
non-experts, that users improve when playing on “perfect” data, and that
aggregating annotations increases their precision. Future research will focus on
peer feedback and on using judgements to improve the selection of candidates.
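
To make the aggregation claim concrete, here is a minimal sketch of how individual and aggregated annotations could be compared against expert labels. The poster does not state which aggregation method was used, so the majority vote below is an assumption, and the painting ids, player annotations, and helper names (`aggregate_by_majority`, `agreement_with_experts`) are hypothetical. Only the subject-type abbreviations are taken from the figures.

```python
from collections import Counter


def aggregate_by_majority(annotations):
    """Map each painting id to the label that most players chose.

    `annotations` is a list of (painting_id, label) pairs. Ties go to the
    label that was seen first, an arbitrary choice made for this sketch.
    """
    votes = {}
    for painting_id, label in annotations:
        votes.setdefault(painting_id, Counter())[label] += 1
    return {pid: counts.most_common(1)[0][0] for pid, counts in votes.items()}


def agreement_with_experts(labels, expert_labels):
    """Fraction of (painting_id, label) pairs that match the expert label."""
    matches = [label == expert_labels[pid] for pid, label in labels]
    return sum(matches) / len(matches)


# Hypothetical toy data: three paintings, three player annotations each.
expert = {"p1": "land", "p2": "port", "p3": "stil"}
player_annotations = [
    ("p1", "land"), ("p1", "land"), ("p1", "city"),
    ("p2", "port"), ("p2", "half"), ("p2", "port"),
    ("p3", "stil"), ("p3", "flow"), ("p3", "stil"),
]

# Agreement of single annotations versus agreement of the majority label.
individual = agreement_with_experts(player_annotations, expert)
aggregated = agreement_with_experts(
    list(aggregate_by_majority(player_annotations).items()), expert
)
print(f"individual: {individual:.2f}, aggregated: {aggregated:.2f}")
```

In this toy data each painting has one dissenting player, so single annotations agree with the expert two times out of three, while the majority label agrees every time. Real data need not behave this way: when most players share the same confusion between two subject types, the majority label inherits it.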
[Figure: number of annotations (bars) and percentage of correct annotations (lines) by sequence number of new images, in bins of 20 from [1,20] to (360,380], for the baseline and imperfect conditions.]