Crowds & Niches Teaching Machines to Diagnose
Lora Aroyo
•  Watson: an open-domain question-answering machine that takes
– rich natural-language questions
– over a broad domain of knowledge
•  Won a 2-game Jeopardy! match against the all-time winners
–  viewed by over 50,000,000 people
Watson MD
•  Adapt Watson to Medical QA
•  Mainly an NLP task
•  Cognitive computing systems need human-annotated data for training, testing, and evaluation
•  The human annotation task is one of semantic interpretation
Now answering medical questions!
NLP Tasks
Gadolinium agents are useful for patients with renal impairment, but in patients with severe renal failure requiring dialysis there is a risk of nephrogenic systemic fibrosis.
Mention detection: find the spans (begin, end) of relevant medical terms (factors) in a passage.
Factor typing: find the type of each mention.
Mention types labeled on the slide (NER): substance, disorder, disorder, disorder, treatment
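Mention detection as described above can be sketched in a few lines of Python. This is only an illustration: the list of mention strings is an assumption (the slide highlights the mentions graphically, and the highlights did not survive in the text), and a simple string search stands in for whatever detector is actually used.

```python
SENTENCE = (
    "Gadolinium agents are useful for patients with renal impairment, "
    "but in patients with severe renal failure requiring dialysis "
    "there is a risk of nephrogenic systemic fibrosis."
)

# Assumption: these five strings are the mentions an annotator would mark.
MENTIONS = [
    "Gadolinium agents",
    "renal impairment",
    "severe renal failure",
    "dialysis",
    "nephrogenic systemic fibrosis",
]


def find_spans(text, mentions):
    """Return (begin, end, mention) triples, scanning left to right."""
    spans = []
    cursor = 0
    for mention in mentions:
        begin = text.find(mention, cursor)
        if begin == -1:
            raise ValueError(f"mention not found: {mention!r}")
        end = begin + len(mention)  # end offset is exclusive
        spans.append((begin, end, mention))
        cursor = end
    return spans


spans = find_spans(SENTENCE, MENTIONS)
for begin, end, mention in spans:
    print(begin, end, mention)
```

Each span uses an exclusive end offset, so `SENTENCE[begin:end]` returns the mention text.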
Factor (entity) identification: find the corresponding IDs for a mentioned factor in a knowledge base.
Knowledge-base IDs shown on the slide: C0016911, C1408325, C0035078, C1619692, C0019004
Relation detection: find the relations expressed in a passage between factors.
Relation labels shown on the slide: cause, treats, treats, contraindicates
Coreference: Find the mentions in a sentence that refer to the same factor.
Gold Standard
Assumption
•  Cognitive systems need to be told what is right & what is wrong
•  A gold standard or ground truth
•  Performance is measured on test sets vetted by human experts → never perfect, always improving against test data
•  Historically, gold standards are created assuming that for each annotated instance there is a single right answer
•  Gold standard quality is measured in inter-annotator agreement → does not account for perspectives, for reasonable alternative interpretations
but people don’t always agree…
Disagreement
Gadolinium agents are useful for patients with renal impairment, but in patients with severe renal failure requiring dialysis there is a risk of nephrogenic systemic fibrosis.
cause
side-effect
The human annotation task is one of semantic interpretation
Position
maybe this disagreement is a signal and not noise?
can we harness it?
Key Question
How do we represent & measure disagreement in a way that it can be harnessed?
Crowd Truth
Annotator disagreement is signal,
not noise.
It is indicative of the variation in
human semantic interpretation of
signs, and can indicate ambiguity,
vagueness, over-generality, etc.
http://www.freefoto.com/preview/01-47-44/Flock-of-Birds
Position
symbiosis between humans & machines
machines learn from humans & machines help humans
Crowd Truth Framework
Human-Machine Workflows
Relation Extraction
Crowdsourcing Ground Truth Data: CrowdTruth
Relations overlap in meaning
Sentences are vague and ambiguous
Experts have different interpretations
Representation
Worker Vector
Gadolinium agents are useful for patients with renal impairment, but in patients with severe renal failure requiring dialysis there is a risk of nephrogenic systemic fibrosis.
[Figure: one worker's vector, with a 1 for each of the three relations that worker selected]
Representation
Sentence Vector
[Figure: the workers' vectors, fifteen 1-entries in total, stacked above their sum]
Sentence vector: 0 1 1 0 0 4 3 0 0 5 1 0
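The step from worker vectors to a sentence vector can be sketched as a component-wise sum. The worker data below is hypothetical toy data (3 workers, 4 candidate relations), because the per-worker rows on the slide did not survive extraction. The function name is also an illustrative choice.

```python
from typing import List


def sentence_vector(worker_vectors: List[List[int]]) -> List[int]:
    """Sum binary worker vectors component-wise, one count per relation."""
    if not worker_vectors:
        raise ValueError("need at least one worker vector")
    width = len(worker_vectors[0])
    if any(len(v) != width for v in worker_vectors):
        raise ValueError("all worker vectors must have the same length")
    return [sum(column) for column in zip(*worker_vectors)]


# Hypothetical toy data: each row is one worker, each column one relation.
workers = [
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 1, 1],
]

vec = sentence_vector(workers)
print(vec)  # [2, 0, 3, 1]
```

Keeping the counts, rather than collapsing them to a single majority label, is what lets the later metrics use the spread of votes as a signal.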
Disagreement for Sentence Clarity
Feeling the way the CHEST expands (PALPATION), can identify areas of the lung that are full of fluid.
Is CHEST related to PALPATION?
Relation labels visible on the slide: diagnose, location, associated with, is_a, part_of, other
Crowd votes per relation: 0 0 0 2 3 0 0 0 1 0 0 4 4 1
Unclear relationship between the two arguments reflected in the disagreement
Disagreement for Sentence Clarity
Redness (HYPERAEMIA), irritation (chemosis) and watering (epiphora) of the eyes are symptoms common to all forms of CONJUNCTIVITIS.
Is HYPERAEMIA related to CONJUNCTIVITIS?
Relation labels visible on the slide: symptom, cause
Crowd votes per relation: 0 0 0 1 0 0 0 0 13 0 0 0 0 0
Clearly expressed relation between the two arguments reflected in the agreement
Sentence-Relation Score
Measures how clearly a sentence expresses a relation
Sentence vector: 0 1 1 0 0 4 3 0 0 5 1 0
Unit vector for relation R6
Cosine = .55
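The score on this slide can be reproduced with a plain cosine computation. The sketch below assumes that R6 is the sixth component of the sentence vector (index 5 when counting from zero); that reading is consistent with the .55 shown. The function names are illustrative and do not come from the CrowdTruth code.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def sentence_relation_score(sentence_vec, relation_index):
    """Cosine between the sentence vector and the unit vector of one relation."""
    unit = [0] * len(sentence_vec)
    unit[relation_index] = 1
    return cosine(sentence_vec, unit)


SENTENCE_VEC = [0, 1, 1, 0, 0, 4, 3, 0, 0, 5, 1, 0]  # from the slide
R6 = 5  # assumption: R6 is the sixth relation, zero-based index 5

score = sentence_relation_score(SENTENCE_VEC, R6)
print(round(score, 2))  # 0.55
```

Because the unit vector has a single 1, the score reduces to the relation's vote count divided by the length of the sentence vector, here 4 / sqrt(53).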
Worker Disagreement
Measured per worker
Worker-sentence disagreement
Sentence vector: 0 1 1 0 0 4 3 0 0 5 1 0
Worker’s sentence vector
AVG (Cosine)
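A minimal sketch of the per-worker measure follows. The slide only says "AVG (Cosine)", so two details here are assumptions: the worker's own votes are subtracted from the sentence vector before comparing (otherwise a worker partly agrees with themselves), and the average runs over all sentences the worker annotated. All data is hypothetical toy data.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def worker_sentence_score(worker_vecs, sentence_vecs):
    """Average cosine between a worker's vector and the rest of the crowd.

    worker_vecs:   {sentence_id: this worker's vector for that sentence}
    sentence_vecs: {sentence_id: summed vector of all workers, this one included}
    """
    scores = []
    for sentence_id, worker_vec in worker_vecs.items():
        total = sentence_vecs[sentence_id]
        # Assumption: remove the worker's own votes before comparing.
        rest = [t - w for t, w in zip(total, worker_vec)]
        scores.append(cosine(worker_vec, rest))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two sentences, two candidate relations.
worker = {"s1": [1, 0], "s2": [1, 0]}
crowd = {"s1": [3, 0], "s2": [1, 2]}

avg = worker_sentence_score(worker, crowd)
print(avg)  # 0.5
```

A low average cosine marks a worker who consistently disagrees with the other annotators. A high one marks a worker whose choices follow the crowd.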
Crowd Truth Metrics
Relation Extraction
Three parts to understand human interpretations:
•  Sentence
–  How good is a sentence for the relation extraction task?
•  Workers
–  How well does a worker understand the sentence?
•  Relations
–  Is the meaning of the relation clear?
–  How ambiguous/confusable is it?
Human-Machine Workflows
Crowdtruth.org
Provenance of Crowdsourcing
Watson MD
Crowd vs. Experts
•  Not every task is suitable for the lay crowd; some require domain expertise
•  Domain experts are busy
•  How to get them motivated to perform annotation tasks?
•  How to make it efficient for them and effective for annotations?
Dr. Watson Experts Game
Domain Independent
•  Experimenting with:
–  different domains, e.g. art, history, news
–  different formats, e.g. text, images, videos
–  different annotation tasks, e.g. medical factors, relations, synonyms, negation; events, event types, participants, locations; flowers, birds
•  Integrating crowds from mTurk and CrowdFlower with domain experts from Dr. Detective, Waisda? and Accurator
The Crew
•  Lora Aroyo (VU)
•  Chris Welty (IBM)
•  Robert-Jan Sips (IBM)
•  Anca Dumitrache (VU)
•  Oana Inel (VU)
•  Khalid Khamkham (VU)
•  Tatiana Cristea (VU)
• Rens v. Honschooten (VU)
• Benjamin Timmermans (VU)
• Harriëtte Smook (VU)
• Arne Rutjes (IBM)
• Jelle van der Ploeg (IBM)
http://crowdtruth.org

Contenu connexe

Tendances

Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...CrowdTruth
 
CCCT University of Amsterdam Seminars 2013: Crowdsourcing Session
CCCT University of Amsterdam Seminars 2013: Crowdsourcing SessionCCCT University of Amsterdam Seminars 2013: Crowdsourcing Session
CCCT University of Amsterdam Seminars 2013: Crowdsourcing SessionLora Aroyo
 
CrowdTruth @VU Faculty Colloquium (June 2015)
CrowdTruth @VU Faculty Colloquium (June 2015)CrowdTruth @VU Faculty Colloquium (June 2015)
CrowdTruth @VU Faculty Colloquium (June 2015)Lora Aroyo
 
CrowdTruth: Machine-Human Computation for Harnessing Disagreement in Semantic...
CrowdTruth: Machine-Human Computation for Harnessing Disagreement in Semantic...CrowdTruth: Machine-Human Computation for Harnessing Disagreement in Semantic...
CrowdTruth: Machine-Human Computation for Harnessing Disagreement in Semantic...Lora Aroyo
 
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)Lora Aroyo
 
Introduction to Bayesian Truth Serum
Introduction to Bayesian Truth SerumIntroduction to Bayesian Truth Serum
Introduction to Bayesian Truth SerumFuming Shih
 
Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...Diana Maynard
 
Language of Politics on Twitter - 03 Analysis
Language of Politics on Twitter - 03 AnalysisLanguage of Politics on Twitter - 03 Analysis
Language of Politics on Twitter - 03 AnalysisYelena Mejova
 
06 Network Study Design: Ethical Considerations and Safeguards
06 Network Study Design: Ethical Considerations and Safeguards06 Network Study Design: Ethical Considerations and Safeguards
06 Network Study Design: Ethical Considerations and Safeguardsdnac
 
09 Respondent Driven Sampling and Network Sampling with Memory
09 Respondent Driven Sampling and Network Sampling with Memory09 Respondent Driven Sampling and Network Sampling with Memory
09 Respondent Driven Sampling and Network Sampling with Memorydnac
 
02 Network Data Collection
02 Network Data Collection02 Network Data Collection
02 Network Data Collectiondnac
 
Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...
Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...
Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...Farida Vis
 
ARM'08 - Keynote Talk
ARM'08 - Keynote TalkARM'08 - Keynote Talk
ARM'08 - Keynote TalkLicia Capra
 
03 Ego Network Analysis
03 Ego Network Analysis03 Ego Network Analysis
03 Ego Network Analysisdnac
 
Trust and Recommender Systems
Trust and  Recommender SystemsTrust and  Recommender Systems
Trust and Recommender Systemszhayefei
 
Twitter Data Analytics
Twitter Data AnalyticsTwitter Data Analytics
Twitter Data Analyticsrupika08
 

Tendances (20)

Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...
Utilizing Social Health Websites for Cognitive Computing and Clinical Decisio...
 
CCCT University of Amsterdam Seminars 2013: Crowdsourcing Session
CCCT University of Amsterdam Seminars 2013: Crowdsourcing SessionCCCT University of Amsterdam Seminars 2013: Crowdsourcing Session
CCCT University of Amsterdam Seminars 2013: Crowdsourcing Session
 
CrowdTruth @VU Faculty Colloquium (June 2015)
CrowdTruth @VU Faculty Colloquium (June 2015)CrowdTruth @VU Faculty Colloquium (June 2015)
CrowdTruth @VU Faculty Colloquium (June 2015)
 
CrowdTruth: Machine-Human Computation for Harnessing Disagreement in Semantic...
CrowdTruth: Machine-Human Computation for Harnessing Disagreement in Semantic...CrowdTruth: Machine-Human Computation for Harnessing Disagreement in Semantic...
CrowdTruth: Machine-Human Computation for Harnessing Disagreement in Semantic...
 
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
Truth is a Lie: Rules & Semantics from Crowd Perspectives (RR'2015 Keynote)
 
Introduction to Bayesian Truth Serum
Introduction to Bayesian Truth SerumIntroduction to Bayesian Truth Serum
Introduction to Bayesian Truth Serum
 
Observational studies in social media
Observational studies in social mediaObservational studies in social media
Observational studies in social media
 
Natural experiments
Natural experimentsNatural experiments
Natural experiments
 
User and Usage of Reddit
User and Usage of RedditUser and Usage of Reddit
User and Usage of Reddit
 
Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...
 
Language of Politics on Twitter - 03 Analysis
Language of Politics on Twitter - 03 AnalysisLanguage of Politics on Twitter - 03 Analysis
Language of Politics on Twitter - 03 Analysis
 
06 Network Study Design: Ethical Considerations and Safeguards
06 Network Study Design: Ethical Considerations and Safeguards06 Network Study Design: Ethical Considerations and Safeguards
06 Network Study Design: Ethical Considerations and Safeguards
 
09 Respondent Driven Sampling and Network Sampling with Memory
09 Respondent Driven Sampling and Network Sampling with Memory09 Respondent Driven Sampling and Network Sampling with Memory
09 Respondent Driven Sampling and Network Sampling with Memory
 
02 Network Data Collection
02 Network Data Collection02 Network Data Collection
02 Network Data Collection
 
Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...
Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...
Twitter analytics: some thoughts on sampling, tools, data, ethics and user re...
 
ARM'08 - Keynote Talk
ARM'08 - Keynote TalkARM'08 - Keynote Talk
ARM'08 - Keynote Talk
 
03 Ego Network Analysis
03 Ego Network Analysis03 Ego Network Analysis
03 Ego Network Analysis
 
Trust and Recommender Systems
Trust and  Recommender SystemsTrust and  Recommender Systems
Trust and Recommender Systems
 
Pydata Taipei 2020
Pydata Taipei 2020Pydata Taipei 2020
Pydata Taipei 2020
 
Twitter Data Analytics
Twitter Data AnalyticsTwitter Data Analytics
Twitter Data Analytics
 

Similaire à Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities projects 2014

Statistical Research And Inferential Research Methods Paper
Statistical Research And Inferential Research Methods PaperStatistical Research And Inferential Research Methods Paper
Statistical Research And Inferential Research Methods PaperToya Shamberger
 
Write A Definition Essay.pdf
Write A Definition Essay.pdfWrite A Definition Essay.pdf
Write A Definition Essay.pdfJenny Jones
 
Pin On Sample Sop For Masters In Engineering Ma
Pin On Sample Sop For Masters In Engineering MaPin On Sample Sop For Masters In Engineering Ma
Pin On Sample Sop For Masters In Engineering MaCarla Potier
 
Lecture3-AI-Applications-In-Medicine-January23-2023.pptx
Lecture3-AI-Applications-In-Medicine-January23-2023.pptxLecture3-AI-Applications-In-Medicine-January23-2023.pptx
Lecture3-AI-Applications-In-Medicine-January23-2023.pptxUmayKulsoom2
 
Attachment Anxiety Essay
Attachment Anxiety EssayAttachment Anxiety Essay
Attachment Anxiety EssayPatricia Adams
 
Essay Writing Types And Format. Essay Writing. 202
Essay Writing Types And Format. Essay Writing. 202Essay Writing Types And Format. Essay Writing. 202
Essay Writing Types And Format. Essay Writing. 202Stephen Faucher
 
ROJoson PEP Talk: Clinical Diagnostic Algorithm for Common Surface Lumps
ROJoson PEP Talk: Clinical Diagnostic Algorithm for Common Surface LumpsROJoson PEP Talk: Clinical Diagnostic Algorithm for Common Surface Lumps
ROJoson PEP Talk: Clinical Diagnostic Algorithm for Common Surface LumpsReynaldo Joson
 
Mark2Cure: a crowdsourcing platform for biomedical literature annotation
Mark2Cure: a crowdsourcing platform for biomedical literature annotationMark2Cure: a crowdsourcing platform for biomedical literature annotation
Mark2Cure: a crowdsourcing platform for biomedical literature annotationBenjamin Good
 
Natural Language Understanding in Healthcare
Natural Language Understanding in HealthcareNatural Language Understanding in Healthcare
Natural Language Understanding in HealthcareDavid Talby
 
Mind Palace Research Paper
Mind Palace Research PaperMind Palace Research Paper
Mind Palace Research PaperNicolle Dammann
 
Puzzle Of Dreams Research Paper
Puzzle Of Dreams Research PaperPuzzle Of Dreams Research Paper
Puzzle Of Dreams Research PaperJennifer Lopez
 
Essay On Concentration Camps. . An Open Letter to the Director of the US Holo...
Essay On Concentration Camps. . An Open Letter to the Director of the US Holo...Essay On Concentration Camps. . An Open Letter to the Director of the US Holo...
Essay On Concentration Camps. . An Open Letter to the Director of the US Holo...Elizabeth Montes
 
Facial Expressions, Personal Appearance, Detecting Deception
Facial Expressions, Personal Appearance, Detecting DeceptionFacial Expressions, Personal Appearance, Detecting Deception
Facial Expressions, Personal Appearance, Detecting DeceptionMelanie Erickson
 
Week 1 Flow And Error 10
Week 1 Flow And Error 10Week 1 Flow And Error 10
Week 1 Flow And Error 10mtfinn
 
Week 1 Flow And Error 10
Week 1 Flow And Error 10Week 1 Flow And Error 10
Week 1 Flow And Error 10mtfinn
 
Chapter 1 (writing sample)
Chapter 1 (writing sample)Chapter 1 (writing sample)
Chapter 1 (writing sample)Casey Rutten
 
Pallidal Striatal Pathway
Pallidal Striatal PathwayPallidal Striatal Pathway
Pallidal Striatal PathwaySue Jones
 
Substantive Scientific Testimony
Substantive Scientific TestimonySubstantive Scientific Testimony
Substantive Scientific TestimonySusan Kennedy
 

Similaire à Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities projects 2014 (20)

Statistical Research And Inferential Research Methods Paper
Statistical Research And Inferential Research Methods PaperStatistical Research And Inferential Research Methods Paper
Statistical Research And Inferential Research Methods Paper
 
Write A Definition Essay.pdf
Write A Definition Essay.pdfWrite A Definition Essay.pdf
Write A Definition Essay.pdf
 
Pin On Sample Sop For Masters In Engineering Ma
Pin On Sample Sop For Masters In Engineering MaPin On Sample Sop For Masters In Engineering Ma
Pin On Sample Sop For Masters In Engineering Ma
 
Lecture3-AI-Applications-In-Medicine-January23-2023.pptx
Lecture3-AI-Applications-In-Medicine-January23-2023.pptxLecture3-AI-Applications-In-Medicine-January23-2023.pptx
Lecture3-AI-Applications-In-Medicine-January23-2023.pptx
 
Attachment Anxiety Essay
Attachment Anxiety EssayAttachment Anxiety Essay
Attachment Anxiety Essay
 
Essay Writing Types And Format. Essay Writing. 202
Essay Writing Types And Format. Essay Writing. 202Essay Writing Types And Format. Essay Writing. 202
Essay Writing Types And Format. Essay Writing. 202
 
ROJoson PEP Talk: Clinical Diagnostic Algorithm for Common Surface Lumps
ROJoson PEP Talk: Clinical Diagnostic Algorithm for Common Surface LumpsROJoson PEP Talk: Clinical Diagnostic Algorithm for Common Surface Lumps
ROJoson PEP Talk: Clinical Diagnostic Algorithm for Common Surface Lumps
 
Mark2Cure: a crowdsourcing platform for biomedical literature annotation
Mark2Cure: a crowdsourcing platform for biomedical literature annotationMark2Cure: a crowdsourcing platform for biomedical literature annotation
Mark2Cure: a crowdsourcing platform for biomedical literature annotation
 
Cybernetic occlusion
Cybernetic occlusionCybernetic occlusion
Cybernetic occlusion
 
Natural Language Understanding in Healthcare
Natural Language Understanding in HealthcareNatural Language Understanding in Healthcare
Natural Language Understanding in Healthcare
 
Mind Palace Research Paper
Mind Palace Research PaperMind Palace Research Paper
Mind Palace Research Paper
 
Lesson 20
Lesson 20Lesson 20
Lesson 20
 
Puzzle Of Dreams Research Paper
Puzzle Of Dreams Research PaperPuzzle Of Dreams Research Paper
Puzzle Of Dreams Research Paper
 
Essay On Concentration Camps. . An Open Letter to the Director of the US Holo...
Essay On Concentration Camps. . An Open Letter to the Director of the US Holo...Essay On Concentration Camps. . An Open Letter to the Director of the US Holo...
Essay On Concentration Camps. . An Open Letter to the Director of the US Holo...
 
Facial Expressions, Personal Appearance, Detecting Deception
Facial Expressions, Personal Appearance, Detecting DeceptionFacial Expressions, Personal Appearance, Detecting Deception
Facial Expressions, Personal Appearance, Detecting Deception
 
Week 1 Flow And Error 10
Week 1 Flow And Error 10Week 1 Flow And Error 10
Week 1 Flow And Error 10
 
Week 1 Flow And Error 10
Week 1 Flow And Error 10Week 1 Flow And Error 10
Week 1 Flow And Error 10
 
Chapter 1 (writing sample)
Chapter 1 (writing sample)Chapter 1 (writing sample)
Chapter 1 (writing sample)
 
Pallidal Striatal Pathway
Pallidal Striatal PathwayPallidal Striatal Pathway
Pallidal Striatal Pathway
 
Substantive Scientific Testimony
Substantive Scientific TestimonySubstantive Scientific Testimony
Substantive Scientific Testimony
 

Plus de Lora Aroyo

NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdfNeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdfLora Aroyo
 
CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning
CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine LearningCATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning
CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine LearningLora Aroyo
 
Harnessing Human Semantics at Scale (updated)
Harnessing Human Semantics at Scale (updated)Harnessing Human Semantics at Scale (updated)
Harnessing Human Semantics at Scale (updated)Lora Aroyo
 
Data excellence: Better data for better AI
Data excellence: Better data for better AIData excellence: Better data for better AI
Data excellence: Better data for better AILora Aroyo
 
CHIP Demonstrator presentation @ CATCH Symposium
CHIP Demonstrator presentation @ CATCH SymposiumCHIP Demonstrator presentation @ CATCH Symposium
CHIP Demonstrator presentation @ CATCH SymposiumLora Aroyo
 
Semantic Web Challenge: CHIP Demonstrator
Semantic Web Challenge: CHIP DemonstratorSemantic Web Challenge: CHIP Demonstrator
Semantic Web Challenge: CHIP DemonstratorLora Aroyo
 
The Rijksmuseum Collection as Linked Data
The Rijksmuseum Collection as Linked DataThe Rijksmuseum Collection as Linked Data
The Rijksmuseum Collection as Linked DataLora Aroyo
 
Keynote at International Conference of Art Libraries 2018 @Rijksmuseum
Keynote at International Conference of Art Libraries 2018 @RijksmuseumKeynote at International Conference of Art Libraries 2018 @Rijksmuseum
Keynote at International Conference of Art Libraries 2018 @RijksmuseumLora Aroyo
 
FAIRview: Responsible Video Summarization @NYCML'18
FAIRview: Responsible Video Summarization @NYCML'18FAIRview: Responsible Video Summarization @NYCML'18
FAIRview: Responsible Video Summarization @NYCML'18Lora Aroyo
 
Understanding bias in video news & news filtering algorithms
Understanding bias in video news & news filtering algorithmsUnderstanding bias in video news & news filtering algorithms
Understanding bias in video news & news filtering algorithmsLora Aroyo
 
StorySourcing: Telling Stories with Humans & Machines
StorySourcing: Telling Stories with Humans & MachinesStorySourcing: Telling Stories with Humans & Machines
StorySourcing: Telling Stories with Humans & MachinesLora Aroyo
 
Data Science with Humans in the Loop
Data Science with Humans in the LoopData Science with Humans in the Loop
Data Science with Humans in the LoopLora Aroyo
 
Digital Humanities Benelux 2017: Keynote Lora Aroyo
Digital Humanities Benelux 2017: Keynote Lora AroyoDigital Humanities Benelux 2017: Keynote Lora Aroyo
Digital Humanities Benelux 2017: Keynote Lora AroyoLora Aroyo
 
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...Lora Aroyo
 
Data Science with Human in the Loop @Faculty of Science #Leiden University
Data Science with Human in the Loop @Faculty of Science #Leiden UniversityData Science with Human in the Loop @Faculty of Science #Leiden University
Data Science with Human in the Loop @Faculty of Science #Leiden UniversityLora Aroyo
 
SXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New SearchSXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New SearchLora Aroyo
 
Europeana GA 2016: Harnessing Crowds, Niches & Professionals in the Digital Age
Europeana GA 2016: Harnessing Crowds, Niches & Professionals  in the Digital AgeEuropeana GA 2016: Harnessing Crowds, Niches & Professionals  in the Digital Age
Europeana GA 2016: Harnessing Crowds, Niches & Professionals in the Digital AgeLora Aroyo
 
"Video Killed the Radio Star": From MTV to Snapchat
"Video Killed the Radio Star": From MTV to Snapchat"Video Killed the Radio Star": From MTV to Snapchat
"Video Killed the Radio Star": From MTV to SnapchatLora Aroyo
 
UMAP 2016 Opening Ceremony
UMAP 2016 Opening CeremonyUMAP 2016 Opening Ceremony
UMAP 2016 Opening CeremonyLora Aroyo
 
Crowdsourcing & Nichesourcing: Enriching Cultural Heritage with Experts & Cr...
Crowdsourcing & Nichesourcing: Enriching Cultural Heritagewith Experts & Cr...Crowdsourcing & Nichesourcing: Enriching Cultural Heritagewith Experts & Cr...
Crowdsourcing & Nichesourcing: Enriching Cultural Heritage with Experts & Cr...Lora Aroyo
 

Plus de Lora Aroyo (20)

NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdfNeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
 
CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning
CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine LearningCATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning
CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning
 
Harnessing Human Semantics at Scale (updated)
Harnessing Human Semantics at Scale (updated)Harnessing Human Semantics at Scale (updated)
Harnessing Human Semantics at Scale (updated)
 
Data excellence: Better data for better AI
Data excellence: Better data for better AIData excellence: Better data for better AI
Data excellence: Better data for better AI
 
CHIP Demonstrator presentation @ CATCH Symposium
CHIP Demonstrator presentation @ CATCH SymposiumCHIP Demonstrator presentation @ CATCH Symposium
CHIP Demonstrator presentation @ CATCH Symposium
 
Semantic Web Challenge: CHIP Demonstrator
Semantic Web Challenge: CHIP DemonstratorSemantic Web Challenge: CHIP Demonstrator
Semantic Web Challenge: CHIP Demonstrator
 
The Rijksmuseum Collection as Linked Data
The Rijksmuseum Collection as Linked DataThe Rijksmuseum Collection as Linked Data
The Rijksmuseum Collection as Linked Data
 
Keynote at International Conference of Art Libraries 2018 @Rijksmuseum
Keynote at International Conference of Art Libraries 2018 @RijksmuseumKeynote at International Conference of Art Libraries 2018 @Rijksmuseum
Keynote at International Conference of Art Libraries 2018 @Rijksmuseum
 
FAIRview: Responsible Video Summarization @NYCML'18
Crowds & Niches Teaching Machines to Diagnose: NLeSC Kick off eHumanities projects 2014

  • 1. Crowds & Niches: Teaching Machines to Diagnose. Lora Aroyo
  • 2. IBM Confidential. Open-domain question-answering machine that is given rich natural-language questions over a broad domain of knowledge. Won a 2-game Jeopardy match against the all-time winners, viewed by over 50,000,000.
  • 3. Watson MD. Adapt Watson to Medical QA. Mainly an NLP task. Cognitive computing systems need human-annotated data for training, testing, and evaluation; the human annotation task is one of semantic interpretation. Now answering medical questions!
  • 4. NLP Tasks. Example passage: "Gadolinium agents are useful for patients with renal impairment, but in patients with severe renal failure requiring dialysis it presents a risk of nephrogenic systemic fibrosis." Mention detection: find the spans (begin, end) of relevant medical terms (factors) in a passage. Factor Typing: find the type of each mention. Type labels shown (NER): substance, disorder, disorder, disorder, treatment.
  • 5. NLP Tasks (same passage). Factor (Entity) Identification: find the corresponding ids for a mentioned factor in a knowledge-base. Ids shown: C0016911, C1408325, C0035078, C1619692, C0019004.
  • 6. NLP Tasks (same passage). Relation detection: find relations that are expressed in a passage between factors. Relations shown: cause, treats, treats, contra-indicates.
  • 7. NLP Tasks (same passage). Coreference: find the mentions in a sentence that refer to the same factor.
  • 8. Gold Standard Assumption. Cognitive systems need to be told what is right and what is wrong: a gold standard, or ground truth. Performance is measured on test sets vetted by human experts → never perfect, always improving against test data. Historically, gold standards are created assuming that for each annotated instance there is a single right answer. Gold standard quality is measured in inter-annotator agreement → does not account for perspectives or for reasonable alternative interpretations.
  • 9. But people don’t always agree…
  • 10. Disagreement. "Gadolinium agents are useful for patients with renal impairment, but in patients with severe renal failure requiring dialysis there is a risk of nephrogenic systemic fibrosis." Relation chosen: cause.
  • 11. Disagreement (same passage). Relation chosen: side-effect. The human annotation task is one of semantic interpretation.
  • 12. Position: maybe this disagreement is signal and not noise? Can we harness it?
  • 13. Key Question: how do we represent & measure disagreement in a way that it can be harnessed?
  • 14. Crowd Truth. Annotator disagreement is signal, not noise. It is indicative of the variation in human semantic interpretation of signs, and can indicate ambiguity, vagueness, over-generality, etc. Image: http://www.freefoto.com/preview/01-47-44/Flock-of-Birds
  • 15. Position: symbiosis between humans & machines. Machines learn from humans; machines help humans.
  • 16. Crowd Truth Framework
  • 17. Human-Machine Workflows
  • 18. Relation Extraction. Crowdsourcing Ground Truth Data: CrowdTruth. Relations overlap in meaning. Sentences are vague and ambiguous. Experts have different interpretations.
  • 19. [slide without text]
  • 20. Representation: Worker Vector. [Figure: a worker vector with three entries set to 1] "Gadolinium agents are useful for patients with renal impairment, but in patients with severe renal failure requiring dialysis there is a risk of nephrogenic systemic fibrosis."
  • 21. Representation: Sentence Vector. [Figure: worker annotations (fifteen 1s in total) summed into the sentence vector 0 1 1 0 0 4 3 0 0 5 1 0]
  • 22. Disagreement for Sentence Clarity. "Feeling the way the CHEST expands (PALPATION), can identify areas of the lung that are full of fluid." Is CHEST related to PALPATION? Relation labels shown: diagnose, location, associated with, is_a, part_of, other. Sentence vector: 0 0 0 2 3 0 0 0 1 0 0 4 4 1. Unclear relationship between the two arguments, reflected in the disagreement.
  • 23. Disagreement for Sentence Clarity. "Redness (HYPERAEMIA), irritation (chemosis) and watering (epiphora) of the eyes are symptoms common to all forms of CONJUNCTIVITIS." Is HYPERAEMIA related to CONJUNCTIVITIS? Relation labels shown: symptom, cause. Sentence vector: 0 0 0 1 0 0 0 0 13 0 0 0 0 0. Clearly expressed relation between the two arguments, reflected in the agreement.
  • 24. Sentence-Relation Score. Measures how clearly a sentence expresses a relation. Sentence vector: 0 1 1 0 0 4 3 0 0 5 1 0. Cosine with the unit vector for relation R6 = .55.
  • 25. Worker Disagreement. Measured per worker. Worker-sentence disagreement: AVG (Cosine) between the worker’s sentence vector and the sentence vector (0 1 1 0 0 4 3 0 0 5 1 0).
  • 26. Crowd Truth Metrics: Relation Extraction. Three parts to understand human interpretations: (1) Sentence: how good is a sentence for the relation extraction task? (2) Workers: how well does a worker understand the sentence? (3) Relations: is the meaning of the relation clear? How ambiguous/confusable is it?
  • 27. Human-Machine Workflows
  • 28. Crowdtruth.org
  • 29. Crowdtruth.org
  • 30. Provenance of Crowdsourcing
  • 31. Crowd vs. Experts (Watson MD). Not every task is suitable for the lay crowd; some require domain expertise. Domain experts are busy. How to get them motivated to perform annotation tasks? How to make it efficient for them and effective for annotations?
  • 32. Dr. Watson Experts Game
  • 33. Dr. Watson Experts Game
  • 34. Dr. Watson Experts Game
  • 35. Dr. Watson Experts Game
  • 36. Dr. Watson Experts Game
  • 37. Dr. Watson Experts Game
  • 38. Domain Independent. Experimenting with: different domains, e.g. art, history, news; different formats, e.g. text, images, videos; different annotation tasks, e.g. medical factors, relations, synonyms, negation; events, event types, participants, locations; flowers, birds. Integrating crowds from mTurk and CrowdFlower with domain experts from Dr. Detective, Waisda? and Accurator.
  • 39. The Crew: Lora Aroyo (VU), Chris Welty (IBM), Robert-Jan Sips (IBM), Anca Dumitrache (VU), Oana Inel (VU), Khalid Khamkham (VU), Tatiana Cristea (VU), Rens v. Honschooten (VU), Benjamin Timmermans (VU), Harriëtte Smook (VU), Arne Rutjes (IBM), Jelle van der Ploeg (IBM)
  • 40. http://crowdtruth.org
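
The vector representation and the two cosine-based measures on slides 20 to 25 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the CrowdTruth implementation: all function names are hypothetical, relation indices are 0-based (so R6 is index 5), and reporting worker disagreement as 1 minus the average cosine is an assumption, since the slides only name "AVG (Cosine)".

```python
import math


def sentence_vector(worker_vectors):
    """Sum the per-worker annotation vectors of one sentence, relation by relation."""
    return [sum(column) for column in zip(*worker_vectors)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def sentence_relation_score(sent_vec, relation_index):
    """Cosine between the sentence vector and the unit vector of one relation."""
    unit = [0] * len(sent_vec)
    unit[relation_index] = 1
    return cosine(sent_vec, unit)


def worker_disagreement(pairs):
    """Average over (worker_vector, sentence_vector) pairs.

    Assumption: disagreement is reported as 1 - average cosine, so that
    higher values mean the worker deviates more from the crowd.
    """
    if not pairs:
        return 0.0
    similarities = [cosine(worker, sentence) for worker, sentence in pairs]
    return 1.0 - sum(similarities) / len(similarities)


# Sentence vector shown on slides 21, 24 and 25.
SLIDE_SENTENCE_VECTOR = [0, 1, 1, 0, 0, 4, 3, 0, 0, 5, 1, 0]

# Relation R6 is the sixth entry (index 5), which received 4 annotations.
print(round(sentence_relation_score(SLIDE_SENTENCE_VECTOR, 5), 2))  # 0.55
```

The printed value reproduces the cosine of .55 on slide 24: the R6 entry (4) divided by the length of the sentence vector (the square root of 53). A real pipeline might also exclude a worker's own annotations from the sentence vector before comparing; the slides do not say, so the sketch leaves that out.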