Guy Lansley
Department of Geography, UCL
g.lansley@ucl.ac.uk
@GuyLansley
The Demographics User Group
Annual Away Day
1st December 2016
New Analytical Methods for
Geocomputation
Geocomputation
Big Data and
Software
New software has had to
adapt to the growing size
and complexity of data
Kira Kowalska
Big Data shifts
• The concept of Big Data is changing and
becoming more challenging
• Emphasis on place rather than space
• Challenges to representing the dynamics of
real world phenomena
Open Source Software
Coding Languages
R and RStudio
• Command line interface.
• Object oriented.
– You create things with names
using the “<-” symbol.
• Ten <- 5*2
• Two <- Ten/5
• Write a script of functions.
• The standard installation has
relatively few functions but more
have been made available via
open source downloadable
packages
[Screenshot labels: R Scripts, Workspace, Console, Multi-tab (includes plots)]
• Can also be run through
RStudio, which provides a
more user-friendly GUI
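A minimal sketch of the assignment example on this slide, runnable in the R console or RStudio. The package named in the comments (`sp`) is only an illustration of how extra functions are added and is not taken from the slide.

```r
# Create named objects with the assignment operator "<-"
Ten <- 5 * 2
Two <- Ten / 5

# Typing an object's name (or calling print) shows its value
print(Ten)
print(Two)

# The standard installation attaches only a handful of packages
print(search())

# Further functions come from downloadable packages, for example:
# install.packages("sp")   # download once (needs a network connection)
# library(sp)              # attach in each session
```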
Why should we conduct analysis in code?
• Accessibility
• Unrestrictive
• Automation and Consistency
• Skills development
Accessibility
Coding techniques for
spatial analysis are now
more accessible than ever
before
Using R as a GIS
• Free online training resources coming soon to the
CDRC website
• www.cdrc.ac.uk/training-capacity-building/online-courses
Slides on slideshare
Fundamentals
• Data scientists still need to understand basic
fundamentals
• e.g. circular statistics
– Commonly overlooked
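A short sketch of why circular statistics matter, using hypothetical hours of the day clustered around midnight. The arithmetic mean treats hour 23 and hour 0 as far apart, whereas the circular mean averages directions on a 24-hour clock.

```r
# Hypothetical event times (hour of day) clustered around midnight
hours <- c(22, 23, 0, 1)

# Arithmetic mean ignores that hour 23 sits next to hour 0
linear_mean <- mean(hours)

# Circular mean: convert hours to angles, average the unit vectors,
# then convert the mean direction back to an hour on the 24-hour clock
angles <- hours / 24 * 2 * pi
mean_angle <- atan2(mean(sin(angles)), mean(cos(angles)))
circular_mean <- (mean_angle / (2 * pi) * 24) %% 24

print(linear_mean)    # 11.5, i.e. late morning, which is misleading
print(circular_mean)  # 23.5, i.e. half past eleven at night
```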
Automation and
repetition
Coding can make insight
generating more efficient and
less time-consuming
2011 Open Atlas Project
• A manual map might
typically take 5 minutes
to create - thus:
– 5 minutes X 134,567
maps = 672,835 minutes
– Or 467.2 days (no
breaks!)
www.alex-singleton.com
• Produced by Prof. Alex Singleton (CDRC, University of
Liverpool)
• R was used to automate the production of 134,567 maps, compiled into a
collection of PDF atlases
• This included downloading and formatting the data from the
ONS websites
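The arithmetic on this slide can be checked in a few lines, followed by a toy illustration of automation. The loop is only a sketch under stated assumptions: the variable names and random data are placeholders, and it is not the Open Atlas code.

```r
# Time cost of producing every map by hand
n_maps <- 134567
minutes_per_map <- 5
total_minutes <- n_maps * minutes_per_map
total_days <- total_minutes / (60 * 24)
print(total_minutes)         # 672835
print(round(total_days, 1))  # 467.2

# Automation sketch: one page per variable in a single PDF.
# "var_a" etc. and the random values are hypothetical placeholders.
set.seed(1)
variables <- c("var_a", "var_b", "var_c")
out_file <- file.path(tempdir(), "atlas_sketch.pdf")
pdf(out_file)
for (v in variables) {
  hist(rnorm(100), main = v, xlab = "value")
}
invisible(dev.off())
```

Adding a fourth variable costs one extra entry in `variables` rather than another five minutes of manual work.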
2011 Open Atlas Project
• Code available here:
rpubs.com/alexsingleton/openatlas
• E.g. Step 1: Download the data
E.g. archive =
http://www.nomisweb.co.uk/output/census/2011/ks101ew_2011_oa.zip
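A hedged sketch of what a download step could look like in base R, using the URL quoted on the slide. The helper name `fetch_census_zip` is hypothetical, the call is left commented out because it needs a network connection, and the URL may no longer resolve. The project's actual code is at the rpubs link above.

```r
# URL quoted on the slide
archive <- "http://www.nomisweb.co.uk/output/census/2011/ks101ew_2011_oa.zip"

# Hypothetical helper: download a zip archive and extract it
fetch_census_zip <- function(url, dest_dir = tempdir()) {
  zip_path <- file.path(dest_dir, basename(url))
  download.file(url, destfile = zip_path, mode = "wb")
  # unzip() returns the paths of the extracted files
  unzip(zip_path, exdir = dest_dir)
}

# Not run here (needs network access):
# files  <- fetch_census_zip(archive)
# csvs   <- files[grepl("\\.csv$", files, ignore.case = TRUE)]
# census <- read.csv(csvs[1])
```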
Algorithms
Alyson Lloyd
• Use a pipeline of
methods and
decisions to analyse
data
• e.g. data cleaning
Cleaning the registered
locations of customers
based on their store
visits
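One way such a pipeline might be sketched in base R is shown below. Everything here is an assumption for illustration: the visit log, the `nearest_store` field and the rule "flag a customer whose most-visited store is not the one nearest their registered address" are hypothetical and are not the method used in the research.

```r
# Hypothetical visit log: one row per store visit
visits <- data.frame(
  customer = c("a", "a", "a", "a", "b", "b", "b", "c"),
  store    = c("S1", "S1", "S2", "S3", "S2", "S2", "S1", "S3"),
  stringsAsFactors = FALSE
)

# Step 1: count visits per customer and store
counts <- as.data.frame(table(customer = visits$customer, store = visits$store),
                        stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]

# Step 2: rank each customer's stores by visit frequency
counts$rank <- ave(-counts$Freq, counts$customer,
                   FUN = function(v) rank(v, ties.method = "first"))
counts$tier <- c("primary", "secondary", "tertiary")[pmin(counts$rank, 3)]

# Step 3: compare the primary store with the store nearest the
# registered address (hypothetical lookup table)
primary <- counts[counts$rank == 1, c("customer", "store")]
registered <- data.frame(
  customer      = c("a", "b", "c"),
  nearest_store = c("S1", "S1", "S3"),
  stringsAsFactors = FALSE
)
check <- merge(registered, primary, by = "customer")
check$suspect <- check$nearest_store != check$store
print(check)
```

Each step is a separate decision, so a threshold or rule can be changed and the whole pipeline rerun consistently.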
Bespoke techniques
New techniques can take
advantage of advancements
in computer science
Simulating the dynamic world
New Computing Methods
• Neural Networks
– Self Organising Maps
• Machine learning
Seth Spielman
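To make the Self Organising Map idea concrete, here is a minimal base R sketch on random, hypothetical data. It is a teaching illustration of competitive learning on a small grid, not the implementation used in the work shown; the grid size, decay schedules and iteration count are arbitrary assumptions.

```r
set.seed(1)
# Hypothetical input: 100 observations of 2 standardised variables
x <- scale(matrix(rnorm(200), ncol = 2))

# A 3 x 3 grid of nodes, each holding a weight vector in the input space
grid <- expand.grid(gx = 1:3, gy = 1:3)
w <- x[sample(nrow(x), nrow(grid)), , drop = FALSE]

n_iter <- 1000
for (t in seq_len(n_iter)) {
  alpha  <- 0.5 * (1 - t / n_iter)        # learning rate decays to 0
  radius <- 0.5 + 1.5 * (1 - t / n_iter)  # neighbourhood shrinks over time
  xi <- x[sample(nrow(x), 1), ]
  # Competitive step: the best matching unit (BMU) is the closest node
  bmu <- which.min(rowSums(sweep(w, 2, xi)^2))
  # Cooperative step: nodes near the BMU on the grid also move towards xi
  grid_dist <- sqrt((grid$gx - grid$gx[bmu])^2 + (grid$gy - grid$gy[bmu])^2)
  h <- exp(-grid_dist^2 / (2 * radius^2))
  w <- w - alpha * h * sweep(w, 2, xi)
}

# Assign each observation to its BMU: a 2-D summary of the input space
node <- apply(x, 1, function(row) which.min(rowSums(sweep(w, 2, row)^2)))
print(table(node))
```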
Text Mining
New techniques for
analysing unstructured data
Textual Data
Unstructured data is difficult
to quantify
Text source: https://en.wikipedia.org/wiki/Tag_cloud
Word Frequencies
Text source: Wikipedia
Word Frequencies
But it is still difficult to compare and
contrast several documents
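A small base R sketch of word counting on two made-up sentences. The second half arranges the counts into a document-term matrix, which is the usual starting point for comparing documents; the example texts and the `tokenise` helper are hypothetical.

```r
# Two hypothetical documents
docs <- c(
  doc1 = "Big data and new software for big data analysis",
  doc2 = "Text mining is a new technique for unstructured text"
)

# Lower-case the text and split on anything that is not a letter
tokenise <- function(text) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  words[nchar(words) > 0]
}

# Word frequencies per document, most frequent first
freq <- lapply(docs, function(d) sort(table(tokenise(d)), decreasing = TRUE))
print(head(freq$doc1, 3))

# Document-term matrix: one row per document, one column per word
vocab <- sort(unique(unlist(lapply(docs, tokenise))))
dtm <- t(sapply(docs, function(d) table(factor(tokenise(d), levels = vocab))))
print(dtm[, c("big", "new", "text")])
```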
Topic Modelling
Blei et al. (2003) Latent Dirichlet Allocation (LDA):
In this example, I have applied an
LDA to 1.3 million geotagged
Tweets from Inner London
transmitted in 2013
20 Twitter Groups
1 Photography and Sights
2 Optimism, Kindness and Positivity
3 Leisure and Attractions
4 TV and Film
5 Humour and Informal Conversations
6 Transport and Travel
7 Politics, Beliefs and Current Affairs
8 Sport and Games
9 Anticipation and Socialising
10 Business, Information and Networking
11 Pessimism and Negativity
12 Music and Musicians
13 Routine Activities
14 Food and Drink
15 Body, Appearances and Clothes
16 Social Media and Apps
17 Slang and Profanities
18 Place and Check-Ins
19 Wishes and Gratitude
20 Foreign and Other
Identifying Patterns
Time Distribution
[Figure: hour of day (0 to 23) for each of the 20 Twitter groups listed above and for all Tweets; scale from -1.5 to 1.5]
Identifying Patterns
Identifying patterns
This map shows the density of Tweets from
the Education subtopic, relative to the
density of all Tweets in London.
Map labels: UCL; University of Westminster; Imperial College London; London South Bank; Kings College; Queen Mary; London Metropolitan University; University of Greenwich; City; Goldsmiths; SOAS; Birkbeck; LSE; UAL; University of Roehampton; University of East London
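The slide does not say how the relative density was calculated. One plausible reading, sketched here with hypothetical counts per grid cell, is the ratio of a cell's share of Education Tweets to its share of all Tweets, so that values above 1 mark cells where the topic is over-represented.

```r
# Hypothetical Tweet counts per grid cell
cells <- data.frame(
  cell             = c("A", "B", "C", "D"),
  all_tweets       = c(1000, 500, 2000, 500),
  education_tweets = c(50, 100, 40, 10)
)

# Share of each kind of Tweet that falls in each cell
edu_share <- cells$education_tweets / sum(cells$education_tweets)
all_share <- cells$all_tweets / sum(cells$all_tweets)

# Ratio > 1: more Education Tweets than the overall activity would suggest
cells$relative_density <- edu_share / all_share
print(cells)
```

Dividing by the density of all Tweets stops the map from simply highlighting the busiest parts of the city.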
Identifying Patterns
Data to Information
All data which is not random
is useful to someone for
some purpose
Names
Indicators of gender
Forenames – Age (Males)
5 clusters of forenames based on their
age distributions
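The clustering method is not named on the slide. As an illustration only, the sketch below groups hypothetical forenames by the shape of their age distribution using k-means from base R, with three clusters rather than five because the toy data set is so small.

```r
# Hypothetical share of bearers of each forename in three age bands
age_profiles <- rbind(
  Name1 = c(0.60, 0.30, 0.10),
  Name2 = c(0.55, 0.35, 0.10),
  Name3 = c(0.10, 0.30, 0.60),
  Name4 = c(0.05, 0.35, 0.60),
  Name5 = c(0.20, 0.60, 0.20),
  Name6 = c(0.25, 0.55, 0.20)
)
colnames(age_profiles) <- c("under_30", "30_to_59", "60_plus")

# k-means is an assumption here; several random starts guard against
# a poor initial choice of centres
set.seed(42)
fit <- kmeans(age_profiles, centers = 3, nstart = 20)

# Forenames grouped by cluster
print(split(rownames(age_profiles), fit$cluster))
```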
Most Big Data are by-products of activities
Big Data as Exhaust
Retail Data
• Using data to infer
wider mobility
patterns
Alyson Lloyd
Data from stores near to London’s main stations
Twitter Catchments
Lloyd, A. and Cheshire, J. (2016). Mining Consumer Insights
from Geo-Located Social Media Datasets
Consumer Registers
2013 2014 Matches
Comparing registers to identify household change
Consumer Registers
Modelling migration?
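A toy sketch of comparing two annual registers by name, using base R `merge`. The records and the matching rule are hypothetical; real registers would need far more careful handling of duplicate names, as the speaker notes point out.

```r
# Hypothetical registers for two consecutive years
reg_2013 <- data.frame(
  name    = c("Person A", "Person B", "Person C", "Person D"),
  address = c("1 High St", "2 Low Rd", "3 Mill Ln", "4 Park Av"),
  stringsAsFactors = FALSE
)
reg_2014 <- data.frame(
  name    = c("Person A", "Person B", "Person C", "Person E"),
  address = c("1 High St", "9 New Rd", "3 Mill Ln", "4 Park Av"),
  stringsAsFactors = FALSE
)

# Full join on name, keeping people who appear in only one year
both <- merge(reg_2013, reg_2014, by = "name", all = TRUE,
              suffixes = c("_2013", "_2014"))

# Classify each person by comparing the two addresses
both$status <- ifelse(is.na(both$address_2014), "left register",
               ifelse(is.na(both$address_2013), "joined register",
               ifelse(both$address_2013 == both$address_2014,
                      "match", "moved")))
print(both)
```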
Interactivity
Interactive outputs make the
sharing of information
easier and more accessible
A Basic Shiny Map
ui.R server.R
Population density (2011 Census)
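The code on this slide is not recoverable from the transcript. As a stand-in, here is a minimal single-file sketch of the ui/server split that Shiny uses, assuming the `shiny` package is installed. The density values are random placeholders drawn on a plain grid rather than a real map, and `build_ui` is a hypothetical helper name.

```r
# Hypothetical population density values for a 5 x 5 grid of areas
set.seed(1)
density_grid <- matrix(round(runif(25, 10, 200)), nrow = 5)

# ui.R part: page layout with one input control and one plot
build_ui <- function() {
  shiny::fluidPage(
    shiny::titlePanel("Population density (hypothetical data)"),
    shiny::sliderInput("min_density", "Minimum density to show",
                       min = 0, max = 200, value = 0),
    shiny::plotOutput("map")
  )
}

# server.R part: redraw the grid whenever the slider changes
server <- function(input, output) {
  output$map <- shiny::renderPlot({
    shown <- density_grid
    shown[shown < input$min_density] <- NA
    image(shown, zlim = c(0, 200), col = heat.colors(12), axes = FALSE)
  })
}

# Launch only in an interactive session with shiny available
if (interactive() && requireNamespace("shiny", quietly = TRUE)) {
  shiny::runApp(shiny::shinyApp(ui = build_ui(), server = server))
}
```

In a two-file app the contents of `build_ui` would go in ui.R and `server` in server.R.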
On CDRC Maps
• Geodemographics
– OAC, COWZ, IUC
• Retail
– Value, Sector, Change
• Metrics
– IMD, IMD Components
– Population Density
– Population Change
• Top Metric Maps
– Dwelling Ages
– Country of Birth
– Occupation
– Mode of Commute
CDRC Maps
Oliver O’Brien
The Demographic Toolkit
• Analytical web mapping system (Web GIS)
– Self-hosted raster and vector map tiles
– Open source packages (OpenStreetMap, Mapnik &
Leaflet)
• Create and analyse spatial and temporal profiles
– Standard and bespoke functional regions
– MAUP (Modifiable areal unit problem) in public policy
• Aims to be available in mid-2017
Tian Lan
The Demographic Toolkit
Tian Lan
Summary
• We have to become more comfortable with coding in
order to unlock the full potential of machines
• We are exploring new techniques to unlock new insights
from Big Data
• We are also harnessing data in novel ways to gain
insights about the population and their dynamics
• However, converting big data into wisdom is still
challenging and new techniques still need to be made
more accessible
Guy Lansley
Department of Geography, UCL
g.lansley@ucl.ac.uk
@GuyLansley
Acknowledgements
Tian Lan
Wen Li
Alyson Lloyd
Oliver O’Brien
Seth Spielman
www.cdrc.ac.uk


Editor's notes

  1. Old years = IBM. WE ARE NOW DATA RICH. Place = social media, networks. Time = interactivity
  2. Issues of there being no insurance embedded
  3. Python and R
  4. Open source, New methods new understanding
  5. Need for training
  6. Look at trip distributions per small area to store locations. Categorise into primary, secondary, tertiary destinations. Look at frequency customers perform irregular journeys.
  7. FOCUS ON UNSUPERVISED - SCIENCE Artificial networks – SOM – creates 2d rep of input space (competitive learning)
  8. sentiment
  9. Count words
  10. Data matching
  11. Leisure TV & Film Transport Leisure Food and drink
  12. Transactions Registrations Social media posts
  13. Registers, 55m people, no link matching
  14. Create file of movers. File of leavers. Look for identical combinations of names. 3m move, 800,000 singular combinations. What to do about duplicates? Distance? OAC?
  15. World is dynamic – So outputs should be interactive
  16. Wisdom - interactivity