SlideShare une entreprise Scribd logo
1  sur  54
Télécharger pour lire hors ligne
1
Geographies of
crowdsourced information
and their implications
@a_ballatore
VGI-Alive, AGILE
June 2018, Lund, Sweden
Andrea Ballatore*
Department of Geography
Birkbeck, University of London
Stefano De Sabbata
School of Geography, Geology and the Env.,
University of Leicester
aballatore.space
2
Outline
1. Where are we?
2. Core research questions
3. Paradigm limitations
4. Beyond the usual suspects
5. Case studies: Search behaviour
6. [Stefano will do the rest]
3
Where are we?
4
The success of
crowdsourcing
"85% of the top global brands have used
crowdsourcing in the last ten years. […] According
to Gartner, 75% of the world’s high performing
enterprises will be using crowdsourcing by 2018."
(Deloitte, 2016)
It is when new, successful technologies withdraw
into the "woodwork of everyday banality" that
their effects become real and profound.
(Vincent Mosco, 2004)
https://ideascale.com
5
The success of
crowdsourcing
"85% of the top global brands have used
crowdsourcing in the last ten years. […] According
to Gartner, 75% of the world’s high performing
enterprises will be using crowdsourcing by 2018."
(Deloitte, 2016)
It is when new, successful technologies withdraw
into the "woodwork of everyday banality" that
their effects become real and profound.
(Vincent Mosco, 2004)
https://ideascale.com
6
Crowdsourcing + geolocation:
A mature field
(See et al, 2016)
7
Volunteered or Crowdsourced GI?
8
Placing the crowds
• Crowdsourced
geographic information
(CGI)
• From experimental phase
to oligopoly (e.g.,
Google, Facebook)
• From cyberoptimism to
cyberpessimism
https://www.cagle.com
9
Placing the crowds
• Crowdsourced
geographic information
(CGI)
• From experimental phase
to oligopoly (e.g.,
Google, Facebook)
• From cyberoptimism to
cyberpessimism
(2016)
10
Researching CGI
as domain
Using CGI for
other domains
11
Core CGI research
1. Who are the contributors and why do they engage
in spatial information production, and what
incentives work or do not work? How do they
collaborate and organise? How do we include
marginalised communities?
(Budhathoki and Haythornthwaite 2013)
2. How can we calculate the quality and fitness for
purpose of crowdsourced data in a reliable,
preferably intrinsic way? (Goodchild and Li 2012)
3. What are the limitations of such models and what
are their spatial, epistemic and cultural biases?
(Dodge and Kitchin 2013)
12
Core CGI research
1. Who are the contributors and why do they engage
in spatial information production, and what
incentives work or do not work? How do they
collaborate and organise? How do we include
marginalised communities?
(Budhathoki and Haythornthwaite 2013)
2. How can we calculate the quality and fitness for
purpose of crowdsourced data in a reliable,
preferably intrinsic way? (Goodchild and Li 2012)
3. What are the limitations of such models and what
are their spatial, epistemic and cultural biases?
(Dodge and Kitchin 2013)
13
CGI for other domains
Natural sciences: biology,
climate science, Earth
sensing
Social sciences: urban planning,
transportation,
public health, economics,
human geography
Humanities: digital humanities,
history, cultural analytics
14
Limitations of
crowdsourcing
15
• "Ignorance of crowds"
(Carr, 2007)
• Conditions for wisdom
• Menial work, no real
innovation/creativity
• Undermining paid work
• Variable quality
Limitations of the paradigm
2004
16
• "Ignorance of crowds"
(Carr, 2007)
• Conditions for wisdom
• Menial work, no real
innovation/creativity
• Undermining paid work
• Variable quality
Limitations of the paradigm
2007
17
• Large volumes of information
do not imply usefulness or
fitness for purpose
• We need representative
samples, not large samples
(e.g., random sample of
1,000 > 1M non-random)
Limitations of the paradigm
18
Each CGI source is a particular viewpoint
and will return a different image of the
social and natural world.
suifaijohnmak.wordpress.com
19
Diversity/biases of CGI
• Thematic: e.g., tourism, outdoors,
typical/atypical behaviour, sharing bias
• Demographic: Western Educated Industrialised
Rich Democratic (WEIRD) (not always!)
• Social: 90%-9%-1%, hyperactive minorities of
contributors
• Geographic: urban/rural,
developed/developing, central/peripheral,
human/natural
20
Diversity/biases of CGI
• Thematic: e.g., tourism, outdoors,
typical/atypical behaviour, sharing bias
• Demographic: Western Educated Industrialised
Rich Democratic (WEIRD) (not always!)
• Social: 90%-9%-1%, hyperactive minorities of
contributors
• Geographic: urban/rural,
developed/developing, central/peripheral,
human/natural
21
• Without centralised planning and protocols,
data quality remains uneven (coverage!)
• Wikipedia replaced Britannica, but
OpenStreetMap is not replacing Google
Maps
• CGI cannot replace established data
collection protocols and sources, but can
provide useful ancillary data
CGI strictures
22
• Without centralised planning and protocols,
data quality remains uneven (coverage!)
• Wikipedia replaced Britannica, but
OpenStreetMap is not replacing Google
Maps
• CGI cannot replace established data
collection protocols and sources, but can
provide useful ancillary data
CGI strictures
23
Reinventing wheels
• Some CGI replicates work
that has been done better
by professionals
• More useful to focus on
"missing" data:
bookscrounger.com
24
Open data vs CGI
Authoritative datasets
are becoming
cheaper/free
25
Broadening our
horizons
26
Most studies
on OSM,
Wikipedia,
Twitter, Flickr.
There's
more out
there!
https://fineartamerica.com
The usual suspects
27
28
• Google Maps
• From 5M to 50M
contributors in
2017
• 700K new places
monthly
• Gamification
29
• Hundreds of millions of users, billions
of reviews
• Measurable effects on spatial and
economic behaviour
• Sentiment about points of interest,
cities, and neighbourhoods
30
• Micro-economic data (e.g. price of onions
in India, new shops in Ghana)
• For profit, contributors are paid
• Applications: International Development,
Government, Global Security, and Business
31
Case studies
32
33
Online visibility of CGI projects
• Search engines are the key entry point
to discover new information
• Feedback loop between Wikipedia and
Google Search to attract new
contributors
• Making CGI findable and consumable
for search engines and social media
• Study on CGI on Google Search (2018)
34
Online visibility of CGI projects
• Search engines are the key entry point
to discover new information
• Feedback loop between Wikipedia and
Google Search to attract new
contributors
• Making CGI findable and consumable
for search engines and social media
• Study on CGI on Google Search (2018)
35
Interest in CGI projects
(Ballatore & Jokar Arsanjani, 2018)
(Ballatore & Jokar Arsanjani, 2018)
Search Interest in
Wikimapia/OSM combined
37
Search Interest in
Wikimapia vs OSM
(Ballatore & Jokar Arsanjani, 2018)
Andrea Ballatore
Department of Geography,
Birkbeck, University of London
Stefano De Sabbata*
School of Geography, Geology and the Env.,
University of Leicester
S c h o o l o f
G e o g r a p h y ,
G e o l o g y a n d t h e
E n v i r o n m e n t
Geographies of
crowdsourced information
and their implications
AGILE 2018 – Lund, Sweden
Case studies
• Operationalising quality in health studies
– with J Bright, S Lee, B Ganesh, DK Humphreys
– OpenStreetMap
– impact of availability on alcohol-related harm
• Analysis of global gazetteers
– with E Acheson and RS Purves
– GeoNames and Getty TGN
– patterns of coverage
• Urban Information Geographies
– with A Ballatore
– with N Tate
– quality and representativeness of VGI
Operationalising
VGI in health
studies
• with J Bright, S Lee, B
Ganesh, DK Humphreys
• OpenStreetMap
• impact of availability on
alcohol-related harm
• https://www.sciencedirect.com/scie
nce/article/pii/S1353829217305804
Quality indicator
Values Transformation
Num. of map features per square kilometre Log transformed, scaled
Average num. of edits per map feature Log transformed, scaled
Num. of users who made at least one edit Log transformed, scaled
Num. of features edited by more than one user Log transformed, scaled
Num. of days with at least one edit Log transformed, scaled
Days since the last edit in the area Inversed, scaled
Proportion of buildings with full address -1 if 0%, 0 if no buildings
• Intrinsic quality indicator based on Barron et al. (2014)
• Calculated as a linear combination of:
Replicating CRESH study
The overall conclusions are essentially identical
• significant positive correlation between density of
alcohol outlets and incidence of alcohol related harms
• an effect size which is comparable (though smaller)
The correlation
between OSM-based
estimates and CRESH
is stronger where the
quality index is higher
Analysis of global
gazetteers
• with E Acheson and RS
Purves
• GeoNames and
Getty TGN
• patterns of coverage
• https://www.sciencedirect.com/scie
nce/article/pii/S0198971516302496
Comparing GeoNames and TGN
Two distinct levels of coverages
• in “high coverage” countries, TGN same order of
magnitude as counts in GeoNames
• in “low coverage” countries, GeoNames has over 60 times
as much content as TGN
GeoNames and TGN in Great Britain
Gazetteer Features Pop place
GeoNames 54,701 16,475
Getty TGN 24,003 16,816
Ordnance Survey 50k 248,626
Ordnance Survey OpenNames 41,490
Feature
types
Urban
Information
Geographies
• with A Ballatore
• with N Tate
• quality and
representativeness of
VGI
Findings
• Overall confirmed (some) hypothesis
– representation similar biases as participation
– Higher qualifications strongest factor
– Wealth (house prices) strong factor in both, more so for Wikipedia
– Twitter strongly influenced by perc. of ppl. aged 30-44 (positively) and
households with de-pendent children (negatively)
• However
– models account only for about 44–55% of variability
– need for more explanatory factors, e.g., tourism-related activities
• …or can these difference be used as indicators?
– ethnic composition is not a factor in the UK
• Twitter and Wikipedia similar but distinct geographies
– only representative of themselves
Comparing Twitter and Wikipedia
Number of tweets
(951 obs.)
~ Number of articles
Adj. R2 = 0.490
De Sabbata
and Tate
(forthcoming)
A geodemographic approach
Conclusions
• More interaction between research on CGI
core and domain applications
– beyond technology, towards social domains
– beyond the usual suspects (OSM, etc)
• Access to data: sampling and scraping
• Systematic work on limitations is needed
– Urban Information Geographies
• Impact of GDPR on access and use of data
Thank you for your attention.
Any questions?
Stefano De Sabbata
School of Geography, Geology and the Env.,
University of Leicester
Andrea Ballatore
Department of Geography,
Birkbeck, University of London
a.ballatore@bbk.ac.uk
aballatore.space
@a_ballatore
Thanks!

Contenu connexe

Tendances

Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...
Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...
Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...
crooksAndrew
 

Tendances (8)

GIS and Agent-based modeling: Part 2
GIS and Agent-based modeling: Part 2GIS and Agent-based modeling: Part 2
GIS and Agent-based modeling: Part 2
 
2016 Annual Report - UN Global Pulse
2016 Annual Report - UN Global Pulse 2016 Annual Report - UN Global Pulse
2016 Annual Report - UN Global Pulse
 
David Cowen UW-Madison Geospatial Summit 2015
David Cowen UW-Madison Geospatial Summit 2015David Cowen UW-Madison Geospatial Summit 2015
David Cowen UW-Madison Geospatial Summit 2015
 
Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...
Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...
Leveraging Crowdsourced data for Agent-based modeling: Opportunities, Example...
 
Not the Geography You Remember
Not the Geography You RememberNot the Geography You Remember
Not the Geography You Remember
 
Data Innovation: Generating Climate Solutions Event
Data Innovation: Generating Climate Solutions EventData Innovation: Generating Climate Solutions Event
Data Innovation: Generating Climate Solutions Event
 
Lecture 8: Volunteered Geographic Information (VGI) for Disaster Emergency Re...
Lecture 8: Volunteered Geographic Information (VGI) for Disaster Emergency Re...Lecture 8: Volunteered Geographic Information (VGI) for Disaster Emergency Re...
Lecture 8: Volunteered Geographic Information (VGI) for Disaster Emergency Re...
 
GSDI Liaison report on Earth Observation-related activities for the CEOS WGISS
GSDI Liaison report on Earth Observation-related activities for the CEOS WGISSGSDI Liaison report on Earth Observation-related activities for the CEOS WGISS
GSDI Liaison report on Earth Observation-related activities for the CEOS WGISS
 

Similaire à Geographies of crowdsourced information and their implications (VGI-Alive Keynote), AGILE 2018, Lund

Current Disruptions in Media: Earthquakes or New Openings? Stanford as Catalyst
Current Disruptions in Media: Earthquakes or New Openings? Stanford as CatalystCurrent Disruptions in Media: Earthquakes or New Openings? Stanford as Catalyst
Current Disruptions in Media: Earthquakes or New Openings? Stanford as Catalyst
Martha Russell
 
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
michellep
 
AGIForesight _2020_Full Article
AGIForesight _2020_Full ArticleAGIForesight _2020_Full Article
AGIForesight _2020_Full Article
Martin Penney
 

Similaire à Geographies of crowdsourced information and their implications (VGI-Alive Keynote), AGILE 2018, Lund (20)

Understanding the Volunteer in VGI
Understanding the Volunteer in VGIUnderstanding the Volunteer in VGI
Understanding the Volunteer in VGI
 
Smart Citizens
Smart CitizensSmart Citizens
Smart Citizens
 
Smart Citizens
Smart CitizensSmart Citizens
Smart Citizens
 
From crowdsourced geographic information to participatory citizen science - e...
From crowdsourced geographic information to participatory citizen science - e...From crowdsourced geographic information to participatory citizen science - e...
From crowdsourced geographic information to participatory citizen science - e...
 
Current Disruptions in Media: Earthquakes or New Openings? Stanford as Catalyst
Current Disruptions in Media: Earthquakes or New Openings? Stanford as CatalystCurrent Disruptions in Media: Earthquakes or New Openings? Stanford as Catalyst
Current Disruptions in Media: Earthquakes or New Openings? Stanford as Catalyst
 
Digitization and Landscape-Research Nexus
Digitization and Landscape-Research NexusDigitization and Landscape-Research Nexus
Digitization and Landscape-Research Nexus
 
The Global Land Programme
The Global Land ProgrammeThe Global Land Programme
The Global Land Programme
 
Geographical information system : GIS and Social Media
Geographical information system : GIS and Social Media Geographical information system : GIS and Social Media
Geographical information system : GIS and Social Media
 
Open Data and Open Software Geospatial Applications
Open Data and Open Software Geospatial ApplicationsOpen Data and Open Software Geospatial Applications
Open Data and Open Software Geospatial Applications
 
EEO/AGI-Scotland 2015: Citizen Science and GIScience - background and common ...
EEO/AGI-Scotland 2015: Citizen Science and GIScience - background and common ...EEO/AGI-Scotland 2015: Citizen Science and GIScience - background and common ...
EEO/AGI-Scotland 2015: Citizen Science and GIScience - background and common ...
 
Extreme Citizen Science: the socio-political potential of citizen science
Extreme Citizen Science: the socio-political potential of citizen scienceExtreme Citizen Science: the socio-political potential of citizen science
Extreme Citizen Science: the socio-political potential of citizen science
 
SoBigData - Exploring human mobility and migration with BigData @ NTTS2017
SoBigData - Exploring human mobility and migration with BigData @ NTTS2017SoBigData - Exploring human mobility and migration with BigData @ NTTS2017
SoBigData - Exploring human mobility and migration with BigData @ NTTS2017
 
Modeling of future transport systems from different point of views: data, tec...
Modeling of future transport systems from different point of views: data, tec...Modeling of future transport systems from different point of views: data, tec...
Modeling of future transport systems from different point of views: data, tec...
 
Free as in Puppies: Compensating for ICT Constraints in Citizen Science
Free as in Puppies: Compensating for ICT Constraints in Citizen ScienceFree as in Puppies: Compensating for ICT Constraints in Citizen Science
Free as in Puppies: Compensating for ICT Constraints in Citizen Science
 
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
 
Citizen Science in Open Science context: measuring & understanding impacts of...
Citizen Science in Open Science context: measuring & understanding impacts of...Citizen Science in Open Science context: measuring & understanding impacts of...
Citizen Science in Open Science context: measuring & understanding impacts of...
 
From Telling Stories with Data to Telling Stories with Data Infrastructures: ...
From Telling Stories with Data to Telling Stories with Data Infrastructures: ...From Telling Stories with Data to Telling Stories with Data Infrastructures: ...
From Telling Stories with Data to Telling Stories with Data Infrastructures: ...
 
Crowdsourcing and Participation in Cartography (G572 Guest Lecture)
Crowdsourcing and Participation in Cartography (G572 Guest Lecture)Crowdsourcing and Participation in Cartography (G572 Guest Lecture)
Crowdsourcing and Participation in Cartography (G572 Guest Lecture)
 
AGIForesight _2020_Full Article
AGIForesight _2020_Full ArticleAGIForesight _2020_Full Article
AGIForesight _2020_Full Article
 
COST Actions: ENERGIC, Mapping and the citizen sensor.
COST Actions: ENERGIC,  Mapping and the citizen sensor.COST Actions: ENERGIC,  Mapping and the citizen sensor.
COST Actions: ENERGIC, Mapping and the citizen sensor.
 

Dernier

Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
gajnagarg
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
chadhar227
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
nirzagarg
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
vexqp
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
gajnagarg
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
gajnagarg
 
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
vexqp
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
nirzagarg
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
wsppdmt
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
wsppdmt
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
nirzagarg
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
ranjankumarbehera14
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Klinik kandungan
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
ptikerjasaptiker
 
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit RiyadhCytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Abortion pills in Riyadh +966572737505 get cytotec
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Bertram Ludäscher
 

Dernier (20)

Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
 
Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
 
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit RiyadhCytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 

Geographies of crowdsourced information and their implications (VGI-Alive Keynote), AGILE 2018, Lund

  • 1. 1 Geographies of crowdsourced information and their implications @a_ballatore VGI-Alive, AGILE June 2018, Lund, Sweden Andrea Ballatore* Department of Geography Birkbeck, University of London Stefano De Sabbata School of Geography, Geology and the Env., University of Leicester aballatore.space
  • 2. 2 Outline 1. Where are we? 2. Core research questions 3. Paradigm limitations 4. Beyond the usual suspects 5. Case studies: Search behaviour 6. [Stefano will do the rest]
  • 4. 4 The success of crowdsourcing "85% of the top global brands have used crowdsourcing in the last ten years. […] According to Gartner, 75% of the world’s high performing enterprises will be using crowdsourcing by 2018." (Deloitte, 2016) It is when new, successful technologies withdraw into the "woodwork of everyday banality" that their effects become real and profound. (Vincent Mosco, 2004) https://ideascale.com
  • 5. 5 The success of crowdsourcing "85% of the top global brands have used crowdsourcing in the last ten years. […] According to Gartner, 75% of the world’s high performing enterprises will be using crowdsourcing by 2018." (Deloitte, 2016) It is when new, successful technologies withdraw into the "woodwork of everyday banality" that their effects become real and profound. (Vincent Mosco, 2004) https://ideascale.com
  • 6. 6 Crowdsourcing + geolocation: A mature field (See et al, 2016)
  • 8. 8 Placing the crowds • Crowdsourced geographic information (CGI) • From experimental phase to oligopoly (e.g., Google, Facebook) • From cyberoptimism to cyberpessimism https://www.cagle.com
  • 9. 9 Placing the crowds • Crowdsourced geographic information (CGI) • From experimental phase to oligopoly (e.g., Google, Facebook) • From cyberoptimism to cyberpessimism (2016)
  • 10. 10 Researching CGI as domain Using CGI for other domains
  • 11. 11 Core CGI research 1. Who are the contributors and why do they engage in spatial information production, and what incentives work or do not work? How do they collaborate and organise? How do we include marginalised communities? (Budhathoki and Haythornthwaite 2013) 2. How can we calculate the quality and fitness for purpose of crowdsourced data in a reliable, preferably intrinsic way? (Goodchild and Li 2012) 3. What are the limitations of such models and what are their spatial, epistemic and cultural biases? (Dodge and Kitchin 2013)
  • 12. 12 Core CGI research 1. Who are the contributors and why do they engage in spatial information production, and what incentives work or do not work? How do they collaborate and organise? How do we include marginalised communities? (Budhathoki and Haythornthwaite 2013) 2. How can we calculate the quality and fitness for purpose of crowdsourced data in a reliable, preferably intrinsic way? (Goodchild and Li 2012) 3. What are the limitations of such models and what are their spatial, epistemic and cultural biases? (Dodge and Kitchin 2013)
  • 13. 13 CGI for other domains Natural sciences: biology, climate science, Earth sensing Social sciences: urban planning, transportation, public health, economics, human geography Humanities: digital humanities, history, cultural analytics
  • 15. 15 • "Ignorance of crowds" (Carr, 2007) • Conditions for wisdom • Menial work, no real innovation/creativity • Undermining paid work • Variable quality Limitations of the paradigm 2004
  • 16. 16 • "Ignorance of crowds" (Carr, 2007) • Conditions for wisdom • Menial work, no real innovation/creativity • Undermining paid work • Variable quality Limitations of the paradigm 2007
  • 17. 17 • Large volumes of information do not imply usefulness or fitness for purpose • We need representative samples, not large samples (e.g., random sample of 1,000 > 1M non-random) Limitations of the paradigm
  • 18. 18 Each CGI source is a particular viewpoint and will return a different image of the social and natural world. suifaijohnmak.wordpress.com
  • 19. 19 Diversity/biases of CGI • Thematic: e.g., tourism, outdoors, typical/atypical behaviour, sharing bias • Demographic: Western Educated Industrialised Rich Democratic (WEIRD) (not always!) • Social: 90%-9%-1%, hyperactive minorities of contributors • Geographic: urban/rural, developed/developing, central/peripheral, human/natural
  • 20. 20 Diversity/biases of CGI • Thematic: e.g., tourism, outdoors, typical/atypical behaviour, sharing bias • Demographic: Western Educated Industrialised Rich Democratic (WEIRD) (not always!) • Social: 90%-9%-1%, hyperactive minorities of contributors • Geographic: urban/rural, developed/developing, central/peripheral, human/natural
  • 21. 21 • Without centralised planning and protocols, data quality remains uneven (coverage!) • Wikipedia replaced Britannica, but OpenStreetMap is not replacing Google Maps • CGI cannot replace established data collection protocols and sources, but can provide useful ancillary data CGI strictures
  • 22. 22 • Without centralised planning and protocols, data quality remains uneven (coverage!) • Wikipedia replaced Britannica, but OpenStreetMap is not replacing Google Maps • CGI cannot replace established data collection protocols and sources, but can provide useful ancillary data CGI strictures
  • 23. 23 Reinventing wheels • Some CGI replicates work that has been done better by professionals • More useful to focus on "missing" data: bookscrounger.com
  • 24. 24 Open data vs CGI Authoritative datasets are becoming cheaper/free
  • 26. 26 Most studies on OSM, Wikipedia, Twitter, Flickr. There's more out there! https://fineartamerica.com The usual suspects
  • 27. 27
  • 28. 28 • Google Maps • From 5M to 50M contributors in 2017 • 700K new places monthly • Gamification
  • 29. 29 • Hundreds of millions of users, billions of reviews • Measurable effects on spatial and economic behaviour • Sentiment about points of interest, cities, and neighbourhoods
  • 30. 30 • Micro-economic data (e.g. price of onions in India, new shops in Ghana) • For profit, contributors are paid • Applications: International Development, Government, Global Security, and Business
  • 32. 32
  • 33. 33 Online visibility of CGI projects • Search engines are the key entry point to discover new information • Feedback loop between Wikipedia and Google Search to attract new contributors • Making CGI findable and consumable for search engines and social media • Study on CGI on Google Search (2018)
  • 34. 34 Online visibility of CGI projects • Search engines are the key entry point to discover new information • Feedback loop between Wikipedia and Google Search to attract new contributors • Making CGI findable and consumable for search engines and social media • Study on CGI on Google Search (2018)
  • 35. 35 Interest in CGI projects (Ballatore & Jokar Arsanjani, 2018)
  • 36. (Ballatore & Jokar Arsanjani, 2018) Search Interest in Wikimapia/OSM combined
  • 37. 37 Search Interest in Wikimapia vs OSM (Ballatore & Jokar Arsanjani, 2018)
  • 38. Andrea Ballatore Department of Geography, Birkbeck, University of London Stefano De Sabbata* School of Geography, Geology and the Env., University of Leicester S c h o o l o f G e o g r a p h y , G e o l o g y a n d t h e E n v i r o n m e n t Geographies of crowdsourced information and their implications AGILE 2018 – Lund, Sweden
  • 39. Case studies • Operationalising quality in health studies – with J Bright, S Lee, B Ganesh, DK Humphreys – OpenStreetMap – impact of availability on alcohol-related harm • Analysis of global gazetteers – with E Acheson and RS Purves – GeoNames and Getty TGN – patterns of coverage • Urban Information Geographies – with A Ballatore – with N Tate – quality and representativeness of VGI
  • 40. Operationalising VGI in health studies • with J Bright, S Lee, B Ganesh, DK Humphreys • OpenStreetMap • impact of availability on alcohol-related harm • https://www.sciencedirect.com/scie nce/article/pii/S1353829217305804
  • 41. Quality indicator Values Transformation Num. of map features per square kilometre Log transformed, scaled Average num. of edits per map feature Log transformed, scaled Num. of users who made at least one edit Log transformed, scaled Num. of features edited by more than one user Log transformed, scaled Num. of days with at least one edit Log transformed, scaled Days since the last edit in the area Inversed, scaled Proportion of buildings with full address -1 if 0%, 0 if no buildings • Intrinsic quality indicator based on Barron et al. (2014) • Calculated as a linear combination of:
  • 42. Replicating CRESH study The overall conclusions are essentially identical • significant positive correlation between density of alcohol outlets and incidence of alcohol related harms • an effect size which is comparable (though smaller) The correlation between OSM-based estimates and CRESH is stronger where the quality index is higher
  • 43. Analysis of global gazetteers • with E Acheson and RS Purves • GeoNames and Getty TGN • patterns of coverage • https://www.sciencedirect.com/scie nce/article/pii/S0198971516302496
  • 44. Comparing GeoNames and TGN Two distinct levels of coverages • in “high coverage” countries, TGN same order of magnitude as counts in GeoNames • in “low coverage” countries, GeoNames has over 60 times as much content as TGN
  • 45. GeoNames and TGN in Great Britain Gazetteer Features Pop place GeoNames 54,701 16,475 Getty TGN 24,003 16,816 Ordnance Survey 50k 248,626 Ordnance Survey OpenNames 41,490
  • 47. Urban Information Geographies • with A Ballatore • with N Tate • quality and representativeness of VGI
  • 48.
  • 49. Findings • Overall confirmed (some) hypothesis – representation similar biases as participation – Higher qualifications strongest factor – Wealth (house prices) strong factor in both, more so for Wikipedia – Twitter strongly influenced by perc. of ppl. aged 30-44 (positively) and households with de-pendent children (negatively) • However – models account only for about 44–55% of variability – need for more explanatory factors, e.g., tourism-related activities • …or can these difference be used as indicators? – ethnic composition is not a factor in the UK • Twitter and Wikipedia similar but distinct geographies – only representative of themselves
  • 50. Comparing Twitter and Wikipedia Number of tweets (951 obs.) ~ Number of articles Adj. R2 = 0.490
  • 51. De Sabbata and Tate (forthcoming) A geodemographic approach
  • 52. Conclusions • More interaction between research on CGI core and domain applications – beyond technology, towards social domains – beyond the usual suspects (OSM, etc) • Access to data: sampling and scraping • Systematic work on limitations is needed – Urban Information Geographies • Impact of GDPR on access and use of data
  • 53. Thank you for your attention. Any questions? Stefano De Sabbata School of Geography, Geology and the Env., University of Leicester Andrea Ballatore Department of Geography, Birkbeck, University of London