never trust a

scientist

datajournalist

dataset

	
Tilburg University - data journalism
missing data, no value stored	
“I need to solve this”	
missing data, no value stored	
“I need to write a story about this”	
forreporters.com/andrew-lehren/	
scientist to journalist: “You twist everything”	
journalist to scientist: “Your articles are useless”	
“I am right”	
can I trust (and use) this dataset?	
“Trustworthiness and data
management are vital to the success of
qualitative studies … There is a lack of
scientific literature regarding the
structures and processes for managing
large qualitative data sets.”	
	
(White, Oelke, Friesen, 2012)
	
	
“A simple answer to objective reporting
is the kind of reporting that uses relevant
and reliable sources which is not bias or
slanted to a certain party.”	
	
Ibrahim, Pawanteh, Kee (2011)	
	
	
question:	
how to validate	
a dataset?	
check the data source	
	
what are his/her/its intentions?	
what is the citation index	
of the data owner?	
	
	
do other journalists	
cite the data owner?	
	
	
check the data	
	
benefit	
	
do I need this?	
	
	
	
do I need to use it?	
	
	
  
check	
	
data gathering? 	
is this correct?	
	
	
clarification of the data?
do I understand?	
	
	
missing data	
	
what is wrong? 	
I need to solve	
	
	
what is the story?	
I need to write	
	
  
trouble?	
	
TEST!	
	
	
	
CALL!	
	
  
I need more sources! (do I?)	
	
give me data	
check consistency	
	
	
give me humans	
check my story	
	
  
same steps	
different interpretation	
	
  
“Dear datajournalist,	
	
Please take a look at the
research method yourself
and act a bit more like a
scientist.”	
“Dear scientist,	
	
Try to avoid intellectual
arrogance. There are
other people who are just
as smart.”	
	
“practice what you preach”	
	
  
scientists
  1. check the source (citation)
  2. check the data
  3. check benefit
  4. check data gathering
  5. TEST!
  6. more data sources

data journalists
  1. check the source (citation)
  2. check the data
  3. check benefit
  4. check clarification
  5. CALL!
  6. more human sources
@Hillevanderkaa	
Tilburg University

How to validate a dataset? Six steps.

Editor's notes

  1. Name. I work at a university and I work as a writer / data journalist. I am somewhere in between: I do some research with a scientific goal and some with a journalistic aim.
  2. If you are in between, it is interesting that the worlds of social science and data journalism are sometimes really different in practice, but sometimes not. Take for example this dataset, which Andrew Lehren of the New York Times used in a Pulitzer Prize-winning story about the New York Marathon: you can see a blind spot.
  3. If a scientist sees this, in general his first response is that the dataset is technically not right. There is some missing data: a problem that needs to be solved.
  4. If a journalist sees a white spot, however, he is really interested in the story behind the missing data. Why is the data missing?
  5. In this case, both approaches were right: some runners missed a checkpoint, but there were also some technical flaws.
  6. If I talk about journalists with scientists, they are not always as enthusiastic as they could be: journalists can't deal with data, and they use data in a superficial way.
  7. Journalists, in turn, say that scientists are really egocentric and that their stories are not useful for the real world. They just do research to please themselves and their colleagues at university.
  8. On at least one thing they agree: they both assume they are right.
  9. Because I live in both worlds, I am interested to see whether the differences are real or not. One of those possible differences is how scientists as well as data journalists decide whether to trust and use a dataset. What I would like to discuss today is really just a starting point for this topic.
  10. If you dig into the literature on the trustworthiness of data from the perspective of a scientist, you will find a broad variety of articles in different scientific fields. It is not easy to detect a specific line in the variety of articles across all these fields, and there is a lack of specific guidelines on how scientists determine the trustworthiness of a dataset.
  11. And if you read scientific articles about what makes a dataset trustworthy for journalists, you will find nothing. You will only find general readings about the trustworthiness of a news source in general, like the main principles of Gans. A dataset could simply be one of these news sources, but at the level of the literature it is hard to compare.
  12. So, with no clear starting point, it seemed right to start with a very general question. And that's what I did. I asked ten of my scientific as well a…
  13. Do the intentions have any influence on the dataset?
  14. So they both use their colleagues as peers.
  15. Using a dataset from another source is not really common in social science.
  16. Experiments, case study.