@neal_lathia
computer laboratory: university of cambridge
online   offline
urban
data mining
web
urbanmining.wordpress.com
online
user data + algorithms → relevance ☺
public transport


user data + algorithms → relevance
“smart” cards
1 facilitate payment
2 collect user data
“smart” cards
time-stamped locations,
modality, payments,
user categories


anonymised with
persistent user ids
“smart” cards datasets
100% - 1 month
~5.1 million people
~78.8 million trips

5% - 2 x 83 days
~300k people
~7.7 million trips
[Figure: two panels. "Purchase Geography": PAYG vs Travel Cards, x-axis 1 to 9, y-axis 0 to 45. "Mobility Flow": arrivals by Zone 1 to Zone 6.]
using transport data for...

    1 predicting disruption relevance
    2 personalised travel time
    3 fare purchase recommendation
can we use transport data for...

    1 predicting disruption relevance
      i.e., rank station importance correctly?
      (where you will go in the future)
percentile ranking

0.0 (best)
…
0.05 (“those who touch in here also touch in at...”)
...
0.06 (factor in user's history)
...
0.25 (rank stations by popularity)
...
0.5 (random)
…
1.0 (inverse)
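
The slides do not spell out how the percentile ranking is computed. A minimal sketch, assuming the usual definition (the mean relative position of the stations a user later visits within the ranked list, so 0.0 is best and 1.0 is fully inverted), with two of the rankers named above. The function names and the toy trips are hypothetical, not taken from the talk:

```python
from collections import Counter


def percentile_rank(ranking, visited):
    """Mean percentile position of the visited stations within a ranking.

    0.0 means every visited station sits at the top of the list,
    1.0 means every visited station sits at the bottom.
    """
    if len(ranking) < 2 or not visited:
        raise ValueError("need at least two ranked stations and one visit")
    position = {station: i for i, station in enumerate(ranking)}
    last = len(ranking) - 1
    return sum(position[s] / last for s in visited) / len(visited)


def rank_by_popularity(trips, stations):
    """Order stations by how often anyone touched in there."""
    counts = Counter(station for _, station in trips)
    return sorted(stations, key=lambda s: (-counts[s], s))


def rank_by_history(trips, stations, user):
    """Order stations by the user's own visits, then by overall popularity."""
    overall = Counter(station for _, station in trips)
    own = Counter(station for u, station in trips if u == user)
    return sorted(stations, key=lambda s: (-own[s], -overall[s], s))


# Toy data (made up): (user, station) touch-ins observed in the past.
stations = ["A", "B", "C", "D", "E"]
past = [("u1", "A"), ("u1", "A"), ("u1", "D"),
        ("u2", "B"), ("u2", "B"), ("u3", "B"), ("u3", "C")]
future_u1 = ["A", "D"]  # stations u1 goes on to visit

popular = percentile_rank(rank_by_popularity(past, stations), future_u1)
personal = percentile_rank(rank_by_history(past, stations, "u1"), future_u1)
print(popular, personal)
```

On this toy data the history-based ranking scores lower (better) than the popularity ranking, which is the same direction as the 0.25 to 0.06 improvement reported above.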
accurate ranking without

    1 explicitly asking
    2 network topology, rail schedule
using transport data for...

    1 predicting disruption relevance
    2 personalised travel time
can we use transport data for...

    2 predict your travel time
      i.e., time between touch in/out?
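
A minimal sketch of how travel times might be derived from raw card events, assuming each record carries a persistent user id, a timestamp, a station and a touch type. The record layout and the function name are hypothetical; the actual dataset schema is not shown in the slides:

```python
from datetime import datetime


def trip_durations(events):
    """Pair each touch-in with the same user's next touch-out.

    events: iterable of (user_id, timestamp, station, kind),
    where kind is "in" or "out". Returns a list of
    (user_id, origin, destination, minutes).
    """
    open_trip = {}
    trips = []
    for user, ts, station, kind in sorted(events, key=lambda e: e[1]):
        if kind == "in":
            # A later touch-in replaces an earlier unmatched one.
            open_trip[user] = (ts, station)
        elif kind == "out" and user in open_trip:
            start, origin = open_trip.pop(user)
            minutes = (ts - start).total_seconds() / 60
            trips.append((user, origin, station, minutes))
        # A touch-out with no matching touch-in is ignored.
    return trips


# Toy events (made up).
events = [
    ("u1", datetime(2010, 3, 1, 8, 0), "A", "in"),
    ("u2", datetime(2010, 3, 1, 8, 5), "C", "out"),
    ("u1", datetime(2010, 3, 1, 8, 25), "B", "out"),
]
trips = trip_durations(events)
print(trips)
```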
mean absolute error (minutes)

0.0 (best)
…
3.13 (“your trips in the past...”)
3.17 (“people who are as familiar as you...”)
3.28 (“people who travel at this time...”)
3.30 (mean time)
...
9.82 (timetabled)
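
A minimal sketch of the metric and two of the predictors compared above: the mean time for an origin-destination pair, and a personalised variant that prefers the user's own past trips on that pair. This is an illustration under those assumptions, not the exact models behind the reported numbers; all names and the toy trips are hypothetical:

```python
from collections import defaultdict


def mean_absolute_error(predicted, actual):
    """Average absolute difference, in the same unit as the inputs."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def fit_means(trips):
    """trips: iterable of (user, origin, destination, minutes)."""
    by_pair = defaultdict(list)
    by_user_pair = defaultdict(list)
    for user, origin, dest, minutes in trips:
        by_pair[(origin, dest)].append(minutes)
        by_user_pair[(user, origin, dest)].append(minutes)
    pair_mean = {k: sum(v) / len(v) for k, v in by_pair.items()}
    user_mean = {k: sum(v) / len(v) for k, v in by_user_pair.items()}
    return pair_mean, user_mean


def predict_personal(user, origin, dest, pair_mean, user_mean):
    """Use the user's own mean if available, else the pair mean.

    Raises KeyError for an origin-destination pair never seen in training.
    """
    return user_mean.get((user, origin, dest), pair_mean[(origin, dest)])


# Toy data (made up): travel times in minutes.
train = [("u1", "A", "B", 20), ("u1", "A", "B", 22),
         ("u2", "A", "B", 30), ("u2", "A", "B", 32)]
test = [("u1", "A", "B", 21), ("u2", "A", "B", 30)]

pair_mean, user_mean = fit_means(train)
actual = [minutes for _, _, _, minutes in test]
mae_mean = mean_absolute_error(
    [pair_mean[(o, d)] for _, o, d, _ in test], actual)
mae_personal = mean_absolute_error(
    [predict_personal(u, o, d, pair_mean, user_mean) for u, o, d, _ in test],
    actual)
print(mae_mean, mae_personal)
```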
accurate predictions without

    1 explicitly asking
    2 network topology, rail schedule
    3 ongoing disruptions, delays
using transport data for...

    1 predicting disruption relevance
    2 personalised travel time
    3 fare purchase recommendation
[Figure: "Purchase Behaviour": % purchases (0 to 30) by day of week, Mon to Sun, Travel Cards vs PAYG.]
[Figure: two panels. "Purchase Geography": PAYG vs Travel Cards, x-axis 1 to 9, y-axis 0 to 45. "Mobility Flow": arrivals by Zone 1 to Zone 6.]
(a) high regularity in purchases & movements
(b) small increments, short terms
(c) purchase on refused entry?
are people making the right choice?
£200 million
     overspend
(a) failure to predict your movements
(b) failing to match mobility with fares
can we use transport data for...

    3 predict the fares you should buy
      i.e., what will be cheapest?
classification accuracy

0% (worst)
…
77% everyone on pay as you go
80% naïve bayes
…
97% (“people like you should have bought...”)
98% decision trees
100% (oracle)
money saved

£0.0 (worst)
…
£326,447.95 everyone on pay as you go
£393,585.81 naïve bayes
…
£465,822.17 (“people like you...”)
£473,918.38 decision trees
£479,583.91 (oracle)
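
A minimal sketch of how the two scores above might be computed for a set of fare recommendations. It assumes that accuracy is measured against the cheapest fare in hindsight (the oracle) and that money saved is measured against what each traveller actually bought; the slides state neither definition, and the prices, fare names and functions below are hypothetical:

```python
PAYG_FARE = 2.0      # hypothetical price per trip
TRAVEL_CARD = 30.0   # hypothetical flat price for the period
OPTIONS = ("payg", "travel_card")


def cost(option, n_trips):
    """Total spend for one traveller under a given fare option."""
    if option == "payg":
        return PAYG_FARE * n_trips
    if option == "travel_card":
        return TRAVEL_CARD
    raise ValueError(f"unknown option: {option}")


def cheapest(n_trips):
    """Oracle choice: the cheapest option given the trips actually made."""
    return min(OPTIONS, key=lambda option: cost(option, n_trips))


def accuracy(recommended, trips_per_user):
    """Fraction of travellers whose recommendation matches the oracle."""
    hits = sum(r == cheapest(n) for r, n in zip(recommended, trips_per_user))
    return hits / len(trips_per_user)


def money_saved(recommended, bought, trips_per_user):
    """Actual spend minus spend under the recommended fares."""
    return sum(cost(b, n) - cost(r, n)
               for r, b, n in zip(recommended, bought, trips_per_user))


# Toy data (made up): trips per traveller and what each one bought.
trips_per_user = [5, 10, 20, 40]
bought = ["travel_card", "payg", "payg", "travel_card"]

all_payg = ["payg"] * len(trips_per_user)
oracle = [cheapest(n) for n in trips_per_user]

print(accuracy(all_payg, trips_per_user),
      money_saved(all_payg, bought, trips_per_user))
print(accuracy(oracle, trips_per_user),
      money_saved(oracle, bought, trips_per_user))
```

A classifier such as naïve Bayes or a decision tree would slot in as another source of the `recommended` list, scored with the same two functions.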
“smart” cards
1 facilitate payment
2 collect user data

3 enable powerful,
  personalised
  information systems
using transport data for...

    1 behaviours ~ policy & incentives
    2 community well-being
References
N. Lathia, J. Froehlich, L. Capra. Mining Public Transport Usage for Personalised Intelligent
Transport Systems. In IEEE International Conference on Data Mining. December 2010, Sydney,
Australia.

N. Lathia, C. Smith, J. Froehlich, L. Capra. Individuals Among Commuters: Building
Personalised Transport Information Systems from Fare Collection Systems. Under submission.

N. Lathia, L. Capra. Mining Mobility Data to Minimise Travellers' Spending on Public Transport.
In ACM International Conference on Knowledge Discovery and Data Mining. August 2011. San
Diego, USA.

N. Lathia, L. Capra. How Smart is Your Smart Card? Measuring Travel Behaviours,
Perceptions, and Incentives. In ACM International Conference on Ubiquitous Computing.
September 2011. Beijing, China.

N. Lathia, D. Quercia, J. Crowcroft. The Hidden Image of the City: Sensing Community Well-
Being from Urban Mobility. To Appear, 10th International Conference on Pervasive Computing.
June 2012. Newcastle, UK.

Turning Oyster Cards into Information
