CrowdTruth: Machine-Human Computation for Harnessing Disagreement in Semantic Interpretation
1. CrowdTruth.org
Machine-Human Computation for Harnessing Disagreement in Semantic Interpretation
Oana Inel, Khalid Khamkham, Tatiana Cristea, Anca Dumitrache, Arne Rutjes, Jelle v.d Ploeg, Lukasz Romaszko, Lora Aroyo, Robert-Jan Sips
2. Importance of Human Annotation
• Semantic interpretation of data is needed in all sciences
• Humans analyze examples and annotate them for the “correct” interpretation
• Machines learn & are evaluated from those examples
Lora Aroyo @laroyo
4. Disagreement can reflect the degree of clarity in a sentence
Does each sentence express the TREAT relation?
ANTIBIOTICS are the first line treatment for indications of TYPHUS. → 95%
Patients with TYPHUS who were given ANTIBIOTICS exhibited side-effects. → 80%
With ANTIBIOTICS in short supply, DDT was used during WWII to control the insect vectors of TYPHUS. → 50%
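The slide does not say how these percentages are computed. One simple reading is the share of workers who judged that the sentence expresses the relation. The sketch below works under that assumption; `relation_score` and the worker answers are hypothetical and are not taken from the CrowdTruth software.

```python
def relation_score(answers, relation="TREAT"):
    """Return the fraction of worker answers that selected `relation`."""
    if not answers:
        raise ValueError("need at least one answer")
    return sum(1 for a in answers if a == relation) / len(answers)

# Hypothetical judgments from 20 workers per sentence (not data from the slides).
clear = ["TREAT"] * 19 + ["NONE"]          # almost everyone agrees
unclear = ["TREAT"] * 10 + ["NONE"] * 10   # workers are split

print(relation_score(clear))    # 0.95
print(relation_score(unclear))  # 0.5
```

Under this reading, a score near 1.0 suggests a clearly worded sentence, while a score near 0.5 suggests the sentence itself is unclear rather than that half of the workers are wrong.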
5. Disagreement can indicate ambiguity of the relation
What is the RELATION between the highlighted terms?
GADOLINIUM agents are useful for patients with renal impairment, but in patients with severe renal failure requiring dialysis it presents a risk of nephrogenic systemic FIBROSIS.
CAUSE? (70%) or SIDE EFFECT? (45%)
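The two percentages add up to more than 100%. One reading that would explain this is that each worker may select more than one relation for the same sentence. The sketch below works under that assumption; `relation_scores` and the worker selections are hypothetical, chosen only to reproduce the numbers on the slide.

```python
def relation_scores(selections):
    """Map each relation to the fraction of workers who selected it.

    `selections` holds one set of relation labels per worker, so a
    worker who picks two relations counts toward both scores.
    """
    n = len(selections)
    if n == 0:
        raise ValueError("need at least one worker")
    counts = {}
    for chosen in selections:
        for relation in chosen:
            counts[relation] = counts.get(relation, 0) + 1
    return {relation: c / n for relation, c in counts.items()}

# Hypothetical selections from 20 workers (not data from the slides).
selections = (
    [{"CAUSE"}] * 11
    + [{"CAUSE", "SIDE EFFECT"}] * 3
    + [{"SIDE EFFECT"}] * 6
)
scores = relation_scores(selections)
print(scores["CAUSE"], scores["SIDE EFFECT"])  # 0.7 0.45
```

When two relations both receive substantial scores for the same sentence, the overlap points at the relation definitions rather than at the sentence or the workers.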
6. Disagreement can indicate low-quality workers
Does each sentence express the TREAT relation?
• S1: ANTIBIOTICS are the first line treatment for indications of TYPHUS.
• S2: QUININE is not a reliable cure for MALARIA.
Worker: S1, S2
Worker 1: yes, no
Worker 2: yes, no
Worker 3: yes
Worker 4: no
Worker 5: no, yes
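A worker who consistently disagrees with everyone else can be flagged with a simple agreement score. The sketch below is a majority-agreement stand-in, not the disagreement-based metrics of CrowdTruth, which the slides do not specify. It uses only Workers 1, 2 and 5, whose answers for both sentences are unambiguous in the table; the function names are hypothetical.

```python
from collections import Counter

def majority_answers(judgments):
    """Most common answer per sentence.

    `judgments` maps worker -> {sentence: answer}.
    """
    per_sentence = {}
    for answers in judgments.values():
        for sentence, answer in answers.items():
            per_sentence.setdefault(sentence, Counter())[answer] += 1
    return {s: c.most_common(1)[0][0] for s, c in per_sentence.items()}

def worker_agreement(judgments):
    """Fraction of each worker's answers that match the majority answer.

    The majority includes the worker's own vote; workers with no
    answers are skipped.
    """
    majority = majority_answers(judgments)
    return {
        worker: sum(answers[s] == majority[s] for s in answers) / len(answers)
        for worker, answers in judgments.items()
        if answers
    }

judgments = {
    "Worker 1": {"S1": "yes", "S2": "no"},
    "Worker 2": {"S1": "yes", "S2": "no"},
    "Worker 5": {"S1": "no", "S2": "yes"},
}
agreement = worker_agreement(judgments)
print(agreement)  # {'Worker 1': 1.0, 'Worker 2': 1.0, 'Worker 5': 0.0}
```

Both sentences are clear-cut (S1 expresses TREAT, S2 negates it), so Worker 5's score of 0.0 is more plausibly a sign of a low-quality worker than of ambiguity in the sentences.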
7. CrowdTruth Software Components: Machines & Crowds Workflow
• Machine Pre-processing: optimizing crowdsourcing
• Micro-task Template Library: reuse & optimization
• CrowdTruth Analytics: disagreement-based metrics
• Novel approach to ground truth data collection & evaluation
• PROV for tracking versions of data and processing steps
• Reusability in variety of annotation tasks & domains with text, image, video (thinking about sound)
8. CrowdTruth Software
• Open source: https://github.com/CrowdTruth
• Web service: http://stable.crowdtruth.org
9. CrowdTruth Software: Crowdsourcing Job Analytics
• Open source: https://github.com/CrowdTruth
• Web service: http://stable.crowdtruth.org
10. CrowdTruth Software: Worker Analytics
• Open source: https://github.com/CrowdTruth
• Web service: http://stable.crowdtruth.org