UCL DEPARTMENT OF GEOGRAPHY




  Open Geodemographics: Open Tools
  and the 2011 OAC
  Chris Gale* (mapblog.in, @geogale)
  Muhammad Adnan (gis-tech.co.uk, @gisandtech)
  Paul Longley (paul-longley.com)

  * Conference attendance kindly supported by RGS-IBG funded QMRG bursary




  UCL Department of Geography, Gower Street, London, WC1E 6BT




  Outline
  •   What is Geodemographics?
  •   Need for Open Geodemographics
  •   GeodemCreator
  •   The 2011 Output Area Classification
  •   Summary




  Geodemographics
  • The analysis of people by where they live
  • Areas can be described by the characteristics and
    attitudes of those people who live in them
  • Based on the concept that people with similar
    characteristics are more likely to live in the same
    locality, and that such area types recur in
    different locations across a geographical space
  • Commercial (MOSAIC, ACORN) and free (OAC)
    classifications available




  Commercial Geodemographic
  Classifications
  • Created as ‘black box’ systems
    (Longley and Singleton, 2009)
  • Closed methods are used with little documentation
  • Little information is given regarding the data
    inputs, normalisation and weighting procedures, and
    clustering methods employed




  Need for Open, Transparent, and Flexible
  Classifications
  • Growing number of data sources due to ‘open data’
    initiatives
        – ONS NeSS data exchange, London Datastore, Crime data API
  • Need for open methods
        – Open estimation, normalisation, and clustering
          procedures
  • Open public consultation




  Need for Open, Transparent, and Flexible
  Classifications
  • A number of statistical packages could be used for
    building geodemographic classifications
        – R, SPSS, Microsoft Excel
  • No unified software utility exists that could be used for
    building open, transparent, and flexible classifications
  • ‘GeodemCreator’ is a unified software utility for
    building geodemographic classifications




  GeodemCreator
  • A cross-platform Java software utility for building
    geodemographic classifications
  • Requires Java and R to be installed on the user’s machine
  • Geodemographic classifications can be created for
    any geographical level, using any data set
  • Users can combine census data with their own data
    sources




  GeodemCreator
  • Operates in ‘Basic’ and ‘Advanced’ modes
        – Basic Mode is for inexperienced and new users
        – Advanced mode is for experienced users
  • Clusters the data using the k-means clustering
    algorithm
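
A minimal sketch of k-means clustering (Lloyd's algorithm), included to show the assign-and-update loop the slide refers to. It is not GeodemCreator's code (the slides say the tool relies on R); the function names `nearest` and `kmeans`, the toy `rows`, and the default parameters are assumptions made for illustration.

```python
import random


def nearest(row, centroids):
    """Index of the centroid closest to row (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((x - m) ** 2 for x, m in zip(row, centroids[c])),
    )


def kmeans(rows, k, iterations=20, seed=0):
    """Cluster equal-length numeric tuples into k groups."""
    rng = random.Random(seed)           # fixed seed so the run is repeatable
    centroids = rng.sample(rows, k)     # start from k randomly chosen rows
    for _ in range(iterations):
        # Assignment step: attach every row to its nearest centroid.
        labels = [nearest(row, centroids) for row in rows]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            updated.append(
                tuple(sum(col) / len(members) for col in zip(*members))
                if members else centroids[c]  # keep an empty cluster's centroid
            )
        if updated == centroids:        # converged: nothing moved
            break
        centroids = updated
    labels = [nearest(row, centroids) for row in rows]
    return labels, centroids


# Hypothetical areas described by two already-standardised variables.
rows = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(rows, k=2)
```

The cluster numbers themselves are arbitrary: which group is called 0 and which is called 1 depends on the random starting rows.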




  GeodemCreator Case Study
  • A Socio-economic and Ethnic classification of Greater
    London
  • Created using 41 OAC variables and 12 ethnicity
    variables (derived from the ethnicity data source
    http://worldnames.publicprofiler.org)
  • GeodemCreator was used for building the final
    classification




  GeodemCreator Case Study Data Sources
  • Variables V1 to V41 from the 2001 OAC
  • Variables V42 to V53: ethnicity variables
        V42: ‘European’ ethnic group
        V43: ‘East Asian & Pacific’ ethnic group
        V44: ‘Muslim’ ethnic group
        V45: ‘Greek’ ethnic group
        V46: ‘English’ ethnic group
        V47: ‘Nordic’ ethnic group
        V48: ‘African’ ethnic group
        V49: ‘Japanese’ ethnic group
        V50: ‘Hispanic’ ethnic group
        V51: ‘Celtic’ ethnic group
        V52: ‘Jewish’ ethnic group
        V53: ‘South Asian’ ethnic group
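
Combining the census variables with the ethnicity variables amounts to a join on a shared area code. The sketch below illustrates that step only; the function name `combine_sources`, the area codes, and the values are hypothetical and do not come from the case study data.

```python
def combine_sources(census, extra):
    """Join two {area_code: {variable: value}} tables on shared area codes."""
    combined = {}
    # Keep only areas present in both sources, in a stable order.
    for code in sorted(census.keys() & extra.keys()):
        row = dict(census[code])   # copy so the inputs are not modified
        row.update(extra[code])    # append the extra variables
        combined[code] = row
    return combined


# Hypothetical values for illustration only.
census = {"OA1": {"V1": 0.12, "V2": 0.40}, "OA2": {"V1": 0.30, "V2": 0.25}}
ethnicity = {"OA1": {"V42": 0.05}, "OA2": {"V42": 0.18}, "OA3": {"V42": 0.02}}
merged = combine_sources(census, ethnicity)
```

Areas missing from either source (here `OA3`) are dropped rather than filled in, which is one possible design choice; a real workflow might instead flag them for review.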




  GeodemCreator Case Study Results
  • A Socio-economic and Ethnic classification of Greater
    London:




  GeodemCreator Case Study Results
  • GeodemCreator also produces radial charts for each
    cluster solution




        – English and European ethnic groups living in suburban areas
        – Well off and educated Asian families




  GeodemCreator Case Study Results




        – English, European, and Celtic fringe city commuters
        – Poor Asian families




  GeodemCreator Case Study Results




        – Childless European city dwellers
        – Native blue collar communities




  GeodemCreator Case Study Results




        – English and European ethnic groups living in council properties




  The 2001 Output Area Classification (OAC)
  • Groups the UK population into:
        – 7 Supergroups
        – 21 Groups
        – 52 Subgroups
  • Only data source used is the 2001 Census
        – 41 Variables
  • A variety of organisations use it, including local government
    and commercial companies




  The 2011 Output Area Classification
  • Building on the success of the 2001 OAC
  • The 2001 OAC’s real achievement was showing that
    open-source geodemographic classifications were
    possible
  • Can utilise developments in computing over the past 6
    years, since the 2001 OAC’s publication, to make
    improvements
  • Can be produced using open-source software (if
    required) with a fully open and transparent
    methodology




  The 2011 Output Area Classification
  • Not just a repeat of the 2001 Output Area
    Classification
  • Methodology that may not rely solely on Census
    data
  • Enhanced outputs to cater for different potential users
  • Designed to allow easy creation of bespoke variants
        – Variables and/or Geography
        – Automated variable selection depending on user criteria
             • e.g. variables used for a national classification are not necessarily
               suitable for a regional classification
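
The slides do not say how the automated variable selection works. Purely as an illustration of what a user criterion could look like, the sketch below keeps a variable only if it is not highly correlated with one already kept. The criterion, the `threshold` value, and the names `pearson` and `select_variables` are all assumptions, not the 2011 OAC method.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:      # a constant variable has no defined correlation
        return 0.0
    return cov / (sx * sy)


def select_variables(table, threshold=0.9):
    """Keep variables, in order, that are not near-duplicates of kept ones."""
    kept = []
    for name in table:
        if all(abs(pearson(table[name], table[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical variables measured over four areas.
table = {
    "v1": [1, 2, 3, 4],
    "v2": [2, 4, 6, 8.5],   # almost a multiple of v1, so redundant
    "v3": [4, 1, 3, 2],
}
selected = select_variables(table)
```

Because correlations depend on which areas are included, the same rule can keep different variables for a national and a regional data set, which is the behaviour the bullet above describes.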




  2011 OAC Variables
  • Code is used to auto-select the best variables for the
    desired purpose
  • Allows for a fully transparent and repeatable
    methodology
        – Variable selection was the only “black box” element of the
          2001 OAC
  • Allows for wider-scale bespoke geodemographics
        – A user with no geodemographics experience can produce
          their own classification by selecting the
          variables, standardisation method, and number of clusters.
        – Removes any technical barriers that could prevent wider
          adoption of bespoke geodemographic classifications.
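
To make the "standardisation method" choice concrete, here is a sketch of two common options, z-scores and range standardisation. The slides do not say which methods the tool offers, so both functions and the sample values are assumptions for illustration.

```python
from statistics import mean, pstdev


def z_score(values):
    """Centre on the mean and scale by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:                      # constant variable: nothing to scale
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def range_standardise(values):
    """Rescale so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:                        # constant variable: nothing to scale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical percentages for one variable across three areas.
values = [2, 4, 6]
scaled = range_standardise(values)
zs = z_score(values)
```

Either method puts variables measured on different scales onto a comparable footing before clustering, so that no single variable dominates the distance calculation.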




  Bespoke Geodemographic Classifications
  • Categorised into 3 main types:
        – Using the same data already provided in the classification.
        – Changing the number of variables used to create a
          classification.
        – Uploading other data that was not originally included in a
          pre-existing classification, or creating a new classification
          from scratch.
  • In the case of OAC this could resolve a problem when
    used at a regional level
        – London is an example of one such region that OAC does not
          classify very well.




  The Hull City Council Classification
  • Bespoke free area classification of Hull
  • 45 Census Variables used
  • 10 Groups in 3 hierarchies




  2011 OAC and Open Data
  • Would it be better to use potentially “newer” Open
    Data (when compared with the 2011 Census)?
  • How much of a problem is the lack of data currently
    available at OA level?
  • Using Open Data raises a lot of questions:
        –   What sources of Open Data should be used?
        –   What should the coverage of the Open Data be?
        –   Does the integrity of the Open Data matter?
        –   How often should the Open Data sources be updated?
  • Beyond 2011




  On-The-Fly Clustering
  • To meet the changing and varying needs of users a
    dynamic classification environment needs to be
    created
  • The ability to create bespoke classifications is a requirement
        – both for different geographies (e.g. London or UK) and the
          range and number of variables utilised (e.g. Census and/or
          non-Census) with an additional weighting capacity
  • Will require clustering to happen in real-time
  • Research into users’ specific needs has been undertaken
        – 2011 OAC User Engagement (run in partnership with the
          ONS)
        – Results to be published by ONS by late April




  On-The-Fly Clustering Objectives
  • Find optimum real-time clustering solution
        – Use the mean Within-Cluster Sum of Squares (WCSS) value to
          determine the optimum K-Means cluster solution.
        – Determine how many runs of the clustering algorithm are needed
          for a good solution without making the tool too slow to use.
  • Create repeatability
        – Overcome the inherent random seeding of K-Means, which results
          in an OA remaining in the same cluster group but being given
          a random cluster assignment (e.g. a number from 1 to 7) on
          every run.
  • Incorporate different data sources
        – Both Census and non-Census data
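
One possible way to address the repeatability objective is to renumber clusters in a canonical order after every run, so that the same grouping always receives the same cluster numbers. The sketch below sorts clusters by centroid; this ordering rule and the name `canonical_labels` are assumptions, not the approach taken for the 2011 OAC.

```python
def canonical_labels(labels, centroids):
    """Renumber clusters 1..k by sorted centroid position."""
    # Rank the original cluster ids by their centroid coordinates.
    order = sorted(range(len(centroids)), key=lambda c: centroids[c])
    new_id = {old: new + 1 for new, old in enumerate(order)}
    return [new_id[lab] for lab in labels]


# Two hypothetical runs that found the same grouping of four OAs
# but numbered the clusters the other way round.
run_a = canonical_labels([0, 0, 1, 1], [(0.0, 0.5), (10.0, 10.5)])
run_b = canonical_labels([1, 1, 0, 0], [(10.0, 10.5), (0.0, 0.5)])
```

After relabelling, both runs report the same assignment for every OA, so outputs from different runs can be compared directly.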




  What the Within-Cluster Sum of Squares
  Value means
  • The lower the mean value, the more homogeneous (i.e.
    better) the final cluster groupings are
        – Clustering using the lowest WCSS value can therefore be
          considered to create the optimum cluster groupings.
  • Using anything other than the optimum cluster solution can
    have differing results depending on the dataset and
    level of geography
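
The WCSS value can be sketched as the sum of squared distances from each area to the centroid of its cluster. The slides refer to a "mean" WCSS without defining it; dividing the total by the number of clusters, as below, is an assumption, as are the function names and toy data.

```python
def wcss(rows, labels):
    """Total within-cluster sum of squared distances to cluster centroids."""
    total = 0.0
    for c in set(labels):
        members = [r for r, lab in zip(rows, labels) if lab == c]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        total += sum(
            (x - m) ** 2 for r in members for x, m in zip(r, centroid)
        )
    return total


def mean_wcss(rows, labels):
    """Assumed definition: total WCSS divided by the number of clusters."""
    return wcss(rows, labels) / len(set(labels))


# Four hypothetical areas and two candidate groupings of them.
rows = [(0, 0), (0, 2), (10, 10), (10, 12)]
tight = [0, 0, 1, 1]    # neighbours grouped together
loose = [0, 1, 0, 1]    # distant areas grouped together
best = min([tight, loose], key=lambda labels: mean_wcss(rows, labels))
```

Running K-Means several times with different seeds and keeping the run with the lowest value follows the same pattern as the `min` call above.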




  Summary
  • The 2001 OAC was an important first step for open
    source geodemographics
  • The 2011 OAC can build on the successes of the 2001
    OAC
  • Tools like GeodemCreator can be used to create
    bespoke geodemographic classifications easily and
    without any “expert” knowledge
  • The 2011 OAC is still in the planning phase but should
    be released in some form by late 2012/early 2013




  Any Questions?

Contenu connexe

En vedette (11)

GMO
GMOGMO
GMO
 
Cowabunga surf shop
Cowabunga surf shopCowabunga surf shop
Cowabunga surf shop
 
La respiracion
La respiracionLa respiracion
La respiracion
 
Bi p2 pra akhir tahun
Bi p2 pra akhir tahunBi p2 pra akhir tahun
Bi p2 pra akhir tahun
 
A case for using social media with learning ppt
A case for using social media with learning pptA case for using social media with learning ppt
A case for using social media with learning ppt
 
Prezentacja grzegorz cybula
Prezentacja grzegorz cybulaPrezentacja grzegorz cybula
Prezentacja grzegorz cybula
 
Untitleddocument
UntitleddocumentUntitleddocument
Untitleddocument
 
01agency credentials
01agency credentials01agency credentials
01agency credentials
 
Empathy map
Empathy mapEmpathy map
Empathy map
 
Hello
HelloHello
Hello
 
Perbaharui buku-pinjaman-secara-dalam-talian
Perbaharui buku-pinjaman-secara-dalam-talianPerbaharui buku-pinjaman-secara-dalam-talian
Perbaharui buku-pinjaman-secara-dalam-talian
 

Similaire à Open Geodemographics: Open Tools and the 2011 OAC

Geodemographic Output Area Classifications for London, 2001-2011
Geodemographic Output Area Classifications for London, 2001-2011Geodemographic Output Area Classifications for London, 2001-2011
Geodemographic Output Area Classifications for London, 2001-2011Chris
 
Geodemographics: Open tools and mehtods
Geodemographics: Open tools and mehtodsGeodemographics: Open tools and mehtods
Geodemographics: Open tools and mehtodsDr Muhammad Adnan
 
CensusGIV - Geographic Information Visualisation of Census Data
CensusGIV - Geographic Information Visualisation of Census DataCensusGIV - Geographic Information Visualisation of Census Data
CensusGIV - Geographic Information Visualisation of Census DataCASA, UCL
 
USING E-INFRASTRUCTURES FOR BIODIVERSITY CONSERVATION - Module 2
USING E-INFRASTRUCTURES FOR BIODIVERSITY CONSERVATION - Module 2USING E-INFRASTRUCTURES FOR BIODIVERSITY CONSERVATION - Module 2
USING E-INFRASTRUCTURES FOR BIODIVERSITY CONSERVATION - Module 2Gianpaolo Coro
 
COBWEB technology platform and future development needs
COBWEB technology platform and future development needsCOBWEB technology platform and future development needs
COBWEB technology platform and future development needsEDINA, University of Edinburgh
 
COBWEB technology platform and future development needs, ISPRA 2016
COBWEB technology platform and future development needs, ISPRA 2016COBWEB technology platform and future development needs, ISPRA 2016
COBWEB technology platform and future development needs, ISPRA 2016COBWEB Project
 
Item 5: Introduction to the Global Spectral Calibration Library
Item 5: Introduction to the Global Spectral Calibration LibraryItem 5: Introduction to the Global Spectral Calibration Library
Item 5: Introduction to the Global Spectral Calibration LibrarySoils FAO-GSP
 
COBWEB Summit at the OGC TC Dublin, 2016
COBWEB Summit at the OGC TC Dublin, 2016COBWEB Summit at the OGC TC Dublin, 2016
COBWEB Summit at the OGC TC Dublin, 2016COBWEB Project
 
Unlocking the geospatial potential of survey data
Unlocking the geospatial potential of survey dataUnlocking the geospatial potential of survey data
Unlocking the geospatial potential of survey datatomensom
 
TEAM 3: Improving Open Land Use Map by using Satellite Data
TEAM 3: Improving Open Land Use Map by using Satellite DataTEAM 3: Improving Open Land Use Map by using Satellite Data
TEAM 3: Improving Open Land Use Map by using Satellite Dataplan4all
 
Item 6: Discussion on the Global Spectral Calibration Library
Item 6: Discussion on the Global Spectral Calibration LibraryItem 6: Discussion on the Global Spectral Calibration Library
Item 6: Discussion on the Global Spectral Calibration LibrarySoils FAO-GSP
 
Interaktívne webové mapy ako nástroj pre analýzu heterogénnych dát pre krízov...
Interaktívne webové mapy ako nástroj pre analýzu heterogénnych dát pre krízov...Interaktívne webové mapy ako nástroj pre analýzu heterogénnych dát pre krízov...
Interaktívne webové mapy ako nástroj pre analýzu heterogénnych dát pre krízov...Geokomunita
 
Land Suitability Analysis.pdf
Land Suitability Analysis.pdfLand Suitability Analysis.pdf
Land Suitability Analysis.pdfMarkMwari
 
KTH-Texxi Project 2010
KTH-Texxi Project 2010KTH-Texxi Project 2010
KTH-Texxi Project 2010Texxi Global
 
Volunteer Crowd Computing and Federated Cloud developments
Volunteer Crowd Computing and Federated Cloud developmentsVolunteer Crowd Computing and Federated Cloud developments
Volunteer Crowd Computing and Federated Cloud developmentsDavid Wallom
 
Prospection, Prediction and Management of Archaeological Sites in Alluvial En...
Prospection, Prediction and Management of Archaeological Sites in Alluvial En...Prospection, Prediction and Management of Archaeological Sites in Alluvial En...
Prospection, Prediction and Management of Archaeological Sites in Alluvial En...Keith Challis
 
Citizen Observatories: A Standards Based Architecture - Dr Ingo Simonis, OGCE...
Citizen Observatories:A Standards Based Architecture - Dr Ingo Simonis, OGCE...Citizen Observatories:A Standards Based Architecture - Dr Ingo Simonis, OGCE...
Citizen Observatories: A Standards Based Architecture - Dr Ingo Simonis, OGCE...COBWEB Project
 

Similaire à Open Geodemographics: Open Tools and the 2011 OAC (20)

Geodemographic Output Area Classifications for London, 2001-2011
Geodemographic Output Area Classifications for London, 2001-2011Geodemographic Output Area Classifications for London, 2001-2011
Geodemographic Output Area Classifications for London, 2001-2011
 
Geodemographics: Open tools and mehtods
Geodemographics: Open tools and mehtodsGeodemographics: Open tools and mehtods
Geodemographics: Open tools and mehtods
 
CensusGIV - Geographic Information Visualisation of Census Data
CensusGIV - Geographic Information Visualisation of Census DataCensusGIV - Geographic Information Visualisation of Census Data
CensusGIV - Geographic Information Visualisation of Census Data
 
USING E-INFRASTRUCTURES FOR BIODIVERSITY CONSERVATION - Module 2
USING E-INFRASTRUCTURES FOR BIODIVERSITY CONSERVATION - Module 2USING E-INFRASTRUCTURES FOR BIODIVERSITY CONSERVATION - Module 2
USING E-INFRASTRUCTURES FOR BIODIVERSITY CONSERVATION - Module 2
 
COBWEB technology platform and future development needs
COBWEB technology platform and future development needsCOBWEB technology platform and future development needs
COBWEB technology platform and future development needs
 
COBWEB technology platform and future development needs, ISPRA 2016
COBWEB technology platform and future development needs, ISPRA 2016COBWEB technology platform and future development needs, ISPRA 2016
COBWEB technology platform and future development needs, ISPRA 2016
 
Item 5: Introduction to the Global Spectral Calibration Library
Item 5: Introduction to the Global Spectral Calibration LibraryItem 5: Introduction to the Global Spectral Calibration Library
Item 5: Introduction to the Global Spectral Calibration Library
 
EPOS GNSS Data and Products TCS - What we do...
EPOS GNSS Data and Products TCS - What we do...EPOS GNSS Data and Products TCS - What we do...
EPOS GNSS Data and Products TCS - What we do...
 
COBWEB Summit at the OGC TC Dublin, 2016
COBWEB Summit at the OGC TC Dublin, 2016COBWEB Summit at the OGC TC Dublin, 2016
COBWEB Summit at the OGC TC Dublin, 2016
 
Unlocking the geospatial potential of survey data
Unlocking the geospatial potential of survey dataUnlocking the geospatial potential of survey data
Unlocking the geospatial potential of survey data
 
TEAM 3: Improving Open Land Use Map by using Satellite Data
TEAM 3: Improving Open Land Use Map by using Satellite DataTEAM 3: Improving Open Land Use Map by using Satellite Data
TEAM 3: Improving Open Land Use Map by using Satellite Data
 
Item 6: Discussion on the Global Spectral Calibration Library
Item 6: Discussion on the Global Spectral Calibration LibraryItem 6: Discussion on the Global Spectral Calibration Library
Item 6: Discussion on the Global Spectral Calibration Library
 
Introduction to the COBWEB Project, January 2013
Introduction to the COBWEB Project, January 2013Introduction to the COBWEB Project, January 2013
Introduction to the COBWEB Project, January 2013
 
Interaktívne webové mapy ako nástroj pre analýzu heterogénnych dát pre krízov...
Interaktívne webové mapy ako nástroj pre analýzu heterogénnych dát pre krízov...Interaktívne webové mapy ako nástroj pre analýzu heterogénnych dát pre krízov...
Interaktívne webové mapy ako nástroj pre analýzu heterogénnych dát pre krízov...
 
ONS local presents clustering
ONS local presents clusteringONS local presents clustering
ONS local presents clustering
 
Land Suitability Analysis.pdf
Land Suitability Analysis.pdfLand Suitability Analysis.pdf
Land Suitability Analysis.pdf
 
KTH-Texxi Project 2010
KTH-Texxi Project 2010KTH-Texxi Project 2010
KTH-Texxi Project 2010
 
Volunteer Crowd Computing and Federated Cloud developments
Volunteer Crowd Computing and Federated Cloud developmentsVolunteer Crowd Computing and Federated Cloud developments
Volunteer Crowd Computing and Federated Cloud developments
 
Prospection, Prediction and Management of Archaeological Sites in Alluvial En...
Prospection, Prediction and Management of Archaeological Sites in Alluvial En...Prospection, Prediction and Management of Archaeological Sites in Alluvial En...
Prospection, Prediction and Management of Archaeological Sites in Alluvial En...
 
Citizen Observatories: A Standards Based Architecture - Dr Ingo Simonis, OGCE...
Citizen Observatories:A Standards Based Architecture - Dr Ingo Simonis, OGCE...Citizen Observatories:A Standards Based Architecture - Dr Ingo Simonis, OGCE...
Citizen Observatories: A Standards Based Architecture - Dr Ingo Simonis, OGCE...
 

Dernier

Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Bhuvaneswari Subramani
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 

Dernier (20)

Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 

Open Geodemographics: Open Tools and the 2011 OAC

  • 1. UCL DEPARTMENT OF GEOGRAPHY Open Geodemographics: Open Tools and the 2011 OAC Chris Gale* Muhammad Adnan Paul Longley mapblog.in gis-tech.co.uk paul-longley.com @geogale @gisandtech * Conference attendance kindly supported by RGS-IBG funded QMRG bursary UCL Department of Geography, Gower Street, London, WC1E 6BT
  • 2. Outline • What is Geodemographics? • Need for Open Geodemographics • GeodemCreator • The 2011 Output Area Classification • Summary
  • 3. Geodemographics • The analysis of people by where they live • Areas can be described by the characteristics and attitudes of the people who live in them • Based on the concept that people with similar characteristics are more likely to live within the same locality, and that such area types will be distributed in different locations across a geographical space • Commercial (MOSAIC, ACORN) and free (OAC) classifications are available
  • 4. Commercial Geodemographic Classifications • Created as ‘black box’ systems (Longley and Singleton, 2009) • Closed methods are used with little documentation • Little information is given regarding the data inputs, normalisation and weighting procedures, and clustering methods employed
  • 5. Need for Open, Transparent, and Flexible Classifications • Increased number of data sources due to ‘open data’ initiatives – ONS NeSS data exchange, London data store, Crime data API • Need for open methods – Open methods for estimation, normalisation, and clustering • Open public consultation
  • 6. Need for Open, Transparent, and Flexible Classifications • A number of statistical packages could be used for building geodemographic classifications – R, SPSS, Microsoft Excel • No unified software utility has existed for building open, transparent, and flexible classifications • ‘GeodemCreator’ fills this gap: a unified software utility for building geodemographic classifications
  • 7. GeodemCreator • A cross-platform Java software utility for building geodemographic classifications • Requires Java and R to be installed on the user’s machine • Geodemographic classifications can be created for any geographical level and from any data set • Users can combine census data with their own data sources
  • 8. GeodemCreator • Operates in ‘Basic’ and ‘Advanced’ modes – Basic mode is for new and inexperienced users – Advanced mode is for experienced users • Clusters the data using the k-means clustering algorithm
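Slide 8 names k-means as the clustering algorithm. The slides do not show how GeodemCreator implements it, so the following is only a minimal pure-Python sketch of the standard k-means loop (assign each area to its nearest centroid, then recompute centroids). The function names, the `seed` parameter, and the toy `areas` data are all hypothetical.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, seed=0, max_iter=100):
    """Basic k-means: returns (labels, centroids).

    The seed fixes the random choice of starting centroids, so the same
    seed always gives the same result for the same input.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical data: six areas described by two standardised variables.
areas = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (0.9, 1.0), (1.0, 0.8), (0.8, 0.9)]
labels, centroids = kmeans(areas, k=2, seed=1)
```

The first three areas end up in one cluster and the last three in the other; which cluster is numbered 0 and which 1 depends on the starting centroids.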
  • 9. UCL DEPARTMENT OF GEOGRAPHY
  • 10. GeodemCreator Case Study • A Socio-economic and Ethnic classification of Greater London • Created using 41 OAC variables and 12 ethnicity variables (derived from the ethnicity data source http://worldnames.publicprofiler.org) • GeodemCreator was used for building the final classification
  • 11. GeodemCreator Case Study Data Sources • Variables V1 to V41: from the 2001 OAC • Variables V42 to V53: ethnicity – V42: ‘European’ ethnic group; V43: ‘East Asian & Pacific’ ethnic group; V44: ‘Muslim’ ethnic group; V45: ‘Greek’ ethnic group; V46: ‘English’ ethnic group; V47: ‘Nordic’ ethnic group; V48: ‘African’ ethnic group; V49: ‘Japanese’ ethnic group; V50: ‘Hispanic’ ethnic group; V51: ‘Celtic’ ethnic group; V52: ‘Jewish’ ethnic group; V53: ‘South Asian’ ethnic group
  • 12. GeodemCreator Case Study Results • A Socio-economic and Ethnic classification of Greater London:
  • 13. GeodemCreator Case Study Results • GeodemCreator also produces radial charts for each cluster solution English and European ethnic groups Well off and educated Asian families living in suburban areas
  • 14. GeodemCreator Case Study Results English, European, and Celtic fringe Poor Asian Families city commuters
  • 15. GeodemCreator Case Study Results Childless European city dwellers Native blue collar communities
  • 16. GeodemCreator Case Study Results English and European ethnic groups living in council properties
  • 17. The 2001 Output Area Classification (OAC) • Groups the UK population into: – 7 Supergroups – 21 Groups – 52 Subgroups • The only data source used is the 2001 Census – 41 Variables • Used by a variety of organisations, including local government and commercial companies
  • 18. The 2011 Output Area Classification • Building on the success of the 2001 OAC • The 2001 OAC’s real achievement was showing that open-source geodemographic classifications were possible • Can utilise developments in computing over the past 6 years, since the 2001 OAC’s publication, to make improvements • Can be produced using open-source software (if required) with a fully open and transparent methodology
  • 19. The 2011 Output Area Classification • Not just a repeat of the 2001 Output Area Classification • A methodology that may not rely solely on Census data • Enhanced outputs to cater for different potential users • Designed to allow easy creation of bespoke variants – Variables and/or Geography – Automated variable selection depending on user criteria • e.g. variables used for a national classification are not necessarily suitable for a regional classification
  • 20. 2011 OAC Variables • Code is used to auto-select the best variables for the desired purpose • Allows for a fully transparent and repeatable methodology – Variable selection was the only “black box” element of the 2001 OAC • Allows for wider scale bespoke geodemographics – A user with no geodemographics experience can produce their own classification by selecting the variables, standardisation method, and number of clusters. – Removes any technical barriers that could prevent wider adoption of bespoke geodemographic classifications.
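Slide 20 lists the standardisation method as one of the choices a user makes, without saying which methods are on offer. As an illustration only, here are two commonly used options, z-scores and range standardisation, sketched in plain Python. The function names and sample values are hypothetical and are not taken from the 2011 OAC code.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Rescale a variable to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("a constant variable cannot be standardised")
    return [(v - mu) / sigma for v in values]

def range_standardise(values):
    """Rescale a variable so its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant variable cannot be standardised")
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical variable: percentage of some attribute in three areas.
scaled = range_standardise([10, 20, 30])
```

Standardising matters for distance-based clustering because, without it, variables measured on larger scales dominate the distance calculation.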
  • 21. Bespoke Geodemographic Classifications • Categorised into 3 main types: – Using the same data already provided in the classification. – Changing the number of variables used to create a classification. – Uploading other data that was not originally included in a pre-existing classification, or creating a new classification from scratch. • In the case of OAC this could resolve a problem when used at a regional level – London is an example of one such region that OAC does not classify very well.
  • 22. UCL DEPARTMENT OF GEOGRAPHY
  • 23. The Hull City Council Classification • Bespoke free area classification of Hull • 45 Census Variables used • 10 Groups in 3 hierarchies
  • 24. 2011 OAC and Open Data • Would it be better to use potentially “newer” Open Data (when compared with the 2011 Census)? • How much of a problem is the lack of data currently available at OA level? • Using Open Data raises a lot of questions: – What sources of Open Data should be used? – What should the coverage of the Open Data be? – Does the integrity of the Open Data matter? – How often should the Open Data sources be updated? • Beyond 2011
  • 25. On-The-Fly Clustering • To meet the changing and varying needs of users, a dynamic classification environment needs to be created • The ability to create bespoke classifications is a requirement – both for different geographies (e.g. London or UK) and for the range and number of variables utilised (e.g. Census and/or non-Census), with an additional weighting capacity • Will require clustering to happen in real time • User-specific research has been undertaken – 2011 OAC User Engagement (run in partnership with the ONS) – Results to be published by ONS by late April
  • 26. On-The-Fly Clustering Objectives • Find the optimum real-time clustering solution – Use the mean Within-Cluster Sum of Squares (WCSS) value to determine the optimum K-Means cluster solution. – Determine how many iterations of the clustering algorithm are needed to create a good clustering solution without resulting in poor functionality. • Create repeatability – Overcome the inherent random seeding of K-Means, which results in an OA remaining in the same cluster group but being given a random cluster assignment (e.g. a number from 1 to 7) for every iteration. • Incorporate different data sources – Both Census and non-Census data
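The repeatability problem on slide 26 is that two K-Means runs can produce the same groupings but number them differently. The slides do not say how this was solved. One possible approach, sketched here purely as an assumption, is to renumber clusters in a fixed order derived from their centroids, so that identical groupings always receive identical labels. The function name `canonical_labels` and the toy inputs are hypothetical.

```python
def canonical_labels(labels, centroids):
    """Renumber clusters 1..k in ascending order of their centroid coordinates.

    labels:    cluster index (0-based) assigned to each area by one run
    centroids: centroid vector for each cluster index in that run
    """
    # Sort cluster indices by centroid, compared coordinate by coordinate.
    order = sorted(range(len(centroids)), key=lambda c: tuple(centroids[c]))
    # Map each original index to its rank in that ordering (starting at 1).
    remap = {old: new for new, old in enumerate(order, start=1)}
    return [remap[lab] for lab in labels]

# Two hypothetical runs that found the same groups but numbered them differently.
run_a = canonical_labels([0, 0, 1, 1], [(0.0, 1.0), (10.0, 1.0)])
run_b = canonical_labels([1, 1, 0, 0], [(10.0, 1.0), (0.0, 1.0)])
```

After renumbering, `run_a` and `run_b` are identical. Fixing the random seed is a simpler alternative when the input data and software version never change, but it does not help when comparing runs that started from different seeds.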
  • 27. What the Within-Cluster Sum of Squares Value Means • The lower the mean value, the more homogeneous (i.e. better) the final cluster groupings are – Clustering using the lowest WCSS value can therefore be considered to create the optimum cluster groupings. • Using anything other than the optimum cluster solution can give differing results depending on the dataset and level of geography
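To make the WCSS measure on slide 27 concrete, the sketch below computes it for a given assignment of areas to clusters: the sum, over all areas, of the squared distance to the centroid of their own cluster. The slides refer to a mean WCSS value without stating how the mean is taken, so this sketch returns only the total for a single solution. The function name and the toy data are hypothetical.

```python
def wcss(points, labels):
    """Total within-cluster sum of squares for one clustering solution."""
    # Group the points by their cluster label.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for members in clusters.values():
        # Centroid = coordinate-wise mean of the cluster's members.
        centroid = [sum(col) / len(members) for col in zip(*members)]
        # Add each member's squared distance to that centroid.
        total += sum(
            sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in members
        )
    return total

# Hypothetical data: four areas forming two obvious pairs.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
tight = wcss(points, [0, 0, 1, 1])   # pairs grouped correctly
loose = wcss(points, [0, 1, 0, 1])   # pairs split across clusters
```

Here `tight` is far smaller than `loose`, matching the slide's point that a lower value indicates more homogeneous groups. Comparisons of this kind are meaningful for a fixed number of clusters, since WCSS generally falls as the number of clusters increases.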
  • 28. UCL DEPARTMENT OF GEOGRAPHY
  • 29. UCL DEPARTMENT OF GEOGRAPHY
  • 30. UCL DEPARTMENT OF GEOGRAPHY
  • 31. UCL DEPARTMENT OF GEOGRAPHY
  • 32. Summary • The 2001 OAC was an important first step for open source geodemographics • The 2011 OAC can build on the successes of the 2001 OAC • Tools like GeodemCreator can be used to create bespoke geodemographic classifications easily and without any “expert” knowledge • The 2011 OAC is still in the planning phase but should be released in some form by late 2012/early 2013
  • 33. Any Questions?

Editor's notes

  1. Point 2 – i.e. birds of a feather flock together
  2. GeodemCreator was created by Muhammad Adnan for his PhD; it allows the creation of bespoke geodemographic classifications. It does not use the same methodology as the 2001 OAC.
  3. 2001 OAC: 94 variables were initially considered, reduced to 41. Variables were removed due to high correlation (and hence high redundancy) and other factors. It is difficult to report on all the reasons why variables were selected. Variable selection was the only “black box” element of the 2001 OAC. Similar variables were also used in the Ward Level Classification to allow comparability with the 2001 OAC. Variables were not weighted.
  4. Beyond 2011: The 2011 Census may have been the last traditional Census, so alternatives to using a decennial Census dataset need to be considered. How important a role will Open Data play in the current and future development of geodemographic classifications?
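Note 3 says variables were removed from the 2001 OAC partly because of high correlation, but gives no procedure. The sketch below shows one simple way such a filter could work, as an assumption rather than a reconstruction of the actual selection: keep a variable only if its absolute Pearson correlation with every variable already kept is below a threshold. The function names, the 0.9 threshold, and the sample columns are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists.

    Assumes neither list is constant (a constant list gives a zero divisor).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_correlated(columns, threshold=0.9):
    """Greedy filter: walk the variables in order and keep each one only if
    it is not highly correlated with any variable kept so far."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical variables measured over five areas.
columns = {
    "v1": [1, 2, 3, 4, 5],
    "v2": [2, 4, 6, 8, 10.5],   # almost a multiple of v1, so redundant
    "v3": [5, 1, 4, 2, 3],      # only weakly related to v1
}
kept = drop_correlated(columns)
```

The result depends on the order in which variables are visited, which is one reason an automated, documented rule is easier to reproduce than case-by-case judgement.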