Web Science & Technologies
University of Koblenz ▪ Landau, Germany

Exploring the challenge of
linking scientific publications
and studies with crowd workers
instead of domain experts
Cristina Sarasua
csarasua@uni-koblenz.de
Computational Social Science workshop
Köln, 16.12.2013
Ideal workflow
1. Read publications
2. Access data
3. Reuse data

- Peter Schumacher (social scientist) would like to analyse the voting patterns of Germans in the last 20 years
- Past observations
- New analysis, new findings

WeST

Cristina Sarasua

Exploring the challenge of linking scientific publications and
studies with crowd workers instead of domain experts
Reality

- Publications and research data (coming from surveys and studies) are published independently
- The link between them is missing
- Researchers cannot easily access the research data
Scenario
[Figure: publications and research data (studies)]

We need a method to process publications and studies in order to be able to
1. Find references to studies inside publications
2. Identify which publication is connected to which study
3. Identify the type of relation between publication and study
Problem

- Computers cannot perform these 3 tasks perfectly on their own

[Figure: an incorrect link between a publication and a study]

- We need human intervention
- Domain experts are often not available for this kind of task
Solution: Crowdsourcing

“The process of outsourcing a task to a (potentially) large and undefined group of people in an open call” (Jeff Howe, 2006)
Microtask crowdsourcing
- Simple and independent tasks
- Paid crowdsourcing
- Online labor marketplaces (e.g. MTurk)

Amazon Mechanical Turk

Crowdsourced interlinking: the GESIS case study

[Diagram: Researcher; SSOAR web portal (publications); da|ra web portal (research data); InfoLink (links); CrowdLINK (corrected links); steps 1 to 3]

Hybrid solution
1) Automatically process publications and studies
2) Ask crowd workers to review links
- Correct errors
- Identify primary literature / secondary literature
3) Generate Linked Data
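
The last step of the hybrid solution can be illustrated with a minimal sketch that serialises one crowd-reviewed link as an N-Triples statement. The slides do not name the vocabulary or the URIs used, so the `http://example.org/` namespace, the predicate names, and the `ReviewedLink` type below are all hypothetical placeholders.

```python
from dataclasses import dataclass

# Hypothetical namespace, for illustration only (not taken from the slides).
EX = "http://example.org/vocab#"


@dataclass(frozen=True)
class ReviewedLink:
    """A publication-study link after crowd review (assumed data model)."""
    publication_uri: str
    study_uri: str
    is_primary_literature: bool


def to_ntriples(link: ReviewedLink) -> str:
    """Serialise one reviewed link as a single N-Triples statement."""
    predicate = (
        "isPrimaryLiteratureOf"
        if link.is_primary_literature
        else "isSecondaryLiteratureOf"
    )
    return f"<{link.publication_uri}> <{EX}{predicate}> <{link.study_uri}> ."


example_link = ReviewedLink(
    publication_uri="http://example.org/publication/42",
    study_uri="http://example.org/study/7",
    is_primary_literature=True,
)
print(to_ntriples(example_link))
```

A real deployment would presumably reuse an established vocabulary and the identifiers of the two portals instead of these placeholders.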
How is this related to CSS?

On the one hand …

The GESIS case study
In collaboration with GESIS colleagues
Katarina Boland, Daniel Hienert et al.

On the other hand …

How can we manage such a group of people to maximize their efficiency and keep them happy?

Open call

- We can impose some restrictions (e.g. language, country, reputation gained)
- Different background
- Different motivations
- Different behaviour
- Spam

Chart: Ipeirotis, 2010
Charts: Ross et al., 2010; CrowdFlower, 11.12.2013
The tasks at hand

- They are not the “most exciting tasks” in the world
- The data is in German
- The domain is very specific

First experiments of the GESIS case study

Adopted measures

- Used majority voting
- Included verification questions (e.g. “please type the date shown for the publication”)
- Defined gold standard links to check who could be trusted
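
How two of these measures fit together can be sketched as follows: workers are scored against the gold standard links, and only the answers of workers above a trust threshold enter the majority vote. This is a minimal sketch under assumptions, not the procedure used in the experiments. The tuple layout of a judgment, the function names, and the 0.7 threshold are all hypothetical.

```python
from collections import Counter, defaultdict


def worker_accuracy(judgments, gold):
    """Share of gold standard links each worker answered correctly.

    judgments: iterable of (worker_id, link_id, answer) tuples (assumed layout).
    gold: dict mapping link_id to the known correct answer.
    """
    correct = defaultdict(int)
    seen = defaultdict(int)
    for worker, link_id, answer in judgments:
        if link_id in gold:
            seen[worker] += 1
            correct[worker] += int(answer == gold[link_id])
    return {worker: correct[worker] / seen[worker] for worker in seen}


def majority_vote(judgments, gold, min_accuracy=0.7):
    """Majority answer per non-gold link, counting trusted workers only.

    min_accuracy is an assumed threshold. Workers who never answered a
    gold link have no accuracy score and are treated as untrusted.
    """
    accuracy = worker_accuracy(judgments, gold)
    trusted = {w for w, acc in accuracy.items() if acc >= min_accuracy}
    votes = defaultdict(Counter)
    for worker, link_id, answer in judgments:
        if worker in trusted and link_id not in gold:
            votes[link_id][answer] += 1
    # On a tie, the answer counted first wins; a real system needs a tie rule.
    return {link: counts.most_common(1)[0][0] for link, counts in votes.items()}


# Toy data: g1 and g2 are gold links, l1 and l2 are links under review.
gold_links = {"g1": True, "g2": False}
toy_judgments = [
    ("w1", "g1", True), ("w1", "g2", False), ("w1", "l1", True), ("w1", "l2", False),
    ("w2", "g1", True), ("w2", "g2", False), ("w2", "l1", True), ("w2", "l2", True),
    ("w3", "g1", False), ("w3", "g2", True), ("w3", "l1", False), ("w3", "l2", False),
    ("w4", "g1", True), ("w4", "g2", False), ("w4", "l1", False), ("w4", "l2", False),
]
print(majority_vote(toy_judgments, gold_links))
```

In the toy data, worker `w3` fails both gold links and is ignored, so each remaining link is decided by the other three workers.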

Highlights of findings

- We obtained trusted workers quite quickly (e.g. 490 links reviewed in ~24 hours) and were able to improve the precision of the automatic software without losing considerable recall
- The cases which required background knowledge showed worse results
- The task of “relating publication and study” was solved with much better recall than the task of deciding “whether a publication is primaryLiterature or not of a study”. The precision was very high, though.
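
For readers less familiar with the two measures, the sketch below computes precision and recall of a set of links against a reference set. The link pairs are invented toy data chosen only to show the trade-off. They are not results from the GESIS experiments, and the function name is an assumption.

```python
def precision_recall(predicted, reference):
    """Precision and recall of predicted links against reference links.

    Both arguments are collections of (publication_id, study_id) pairs.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Invented toy data, for illustration only.
reference_links = {("p1", "s1"), ("p2", "s2"), ("p3", "s3"), ("p4", "s4")}

# The automatic step finds every correct link but also four wrong ones.
automatic_links = reference_links | {("p1", "s9"), ("p2", "s8"), ("p3", "s7"), ("p4", "s6")}

# After review, three wrong links are removed and one correct link is lost.
reviewed_links = {("p1", "s1"), ("p2", "s2"), ("p3", "s3"), ("p1", "s9")}

print(precision_recall(automatic_links, reference_links))
print(precision_recall(reviewed_links, reference_links))
```

Removing wrong links raises precision, while every correct link that reviewers discard by mistake lowers recall.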

Ongoing research work


- Can we improve their results by including mixed incentives? Not only money, but also competition at a microtask level, e.g. “there are only X links left, be quick!” or “there are three workers who were faster in reviewing links!”
- How can we better instruct crowd workers in 1) the type of tasks we are running and 2) the domain we are working with?

Take-home message

We can employ crowd workers to connect scientific publications and studies in the social sciences. Their reviews can improve automatically generated links.
How can we transfer the knowledge of domain experts to the crowd?

Call for discussion

- Who?
1. Psychologists
2. Social scientists
3. Computer scientists
- Possible topics
  - Any feedback about the aforementioned ideas
  - Well-established methodologies in psychology to instruct or train a large group of people
  - Any suggestions on how to analyse crowd workers (i.e. criteria)

Thank you.
Vielen Dank.

