SlideShare une entreprise Scribd logo
1  sur  31
recommendations for
urban & transport contexts
      neal lathia (@neal_lathia)
      ucl media futures seminar
             may 25, 2011
research: personalisation
 to aide mobility in cities

              i.e.,
   getting from a to b (habit)
     finding z (discovery)
sensing mobility:
 5%-sample, 2 x 83-days
time-stamped location (entry, exit), modality
      payments (top-ups, travel cards)
        card-types (e.g., student)
what tools can we design to help travellers?
previously:
(getting from a to b) N. Lathia, J. Froehlich, L. Capra. Mining Public Transport Usage for
Personalised Intelligent Transport Systems. In IEEE ICDM 2010, Sydney, Australia.

(discovering z) D. Quercia, N. Lathia, F. Calabrese, G. Di Lorenzo, J. Crowcroft. Recommending
Social Events from Mobile Phone Location Data. In IEEE ICDM 2010, Sydney, Australia.
there is more to urban mobility than just moving.
context: purpose/intent, events, disruptions, cost, social connections.

for example,
mobility vs. cost
who are you?                how?




         when?



where         cash or travel card?
to?                                  how long for?
what
route?
N. Lathia, L. Capra. Mining Mobility Data to Minimise Travellers'
Spending on Public Transport. In ACM KDD 2011, San Diego, USA.

questions
(1) what is the relation between how we travel & how we spend?
(2) do travellers make the correct decisions? (no)
(3) can we help them with recommendations? (yes)
(%)          pay as you go purchases
                                         49.8         < 5 GBP
                                         24.2         5 – 10 GBP
                                         15.5         10 – 20 GBP
`
                                         (%)          travel card purchases
                                         70.8         7-day travel card
                                         15.8         1-month travel card
                                         11.6         7-day bus/tram pass

                               Purchase Behaviour
                  30
                                                              Travel
                  25                                          Cards
                                                              PAYG
                  20
    % Purchases




                  15


                  10


                   5


                   0
                       Mon   Tue   Wed    Thu   Fri     Sat      Sun
Purchase Geography                                Mobility Flow
45
                                                                                   Zone 1
40
                                          PAYG                                     Zone 2
                                          Travel Cards                             Zone 3
35
                                                                                   Zone 4
30                                                                                 Zone 5
                                                                                   Zone 6
25
                                                         arrive
20

15

10

 5                                                        depart
 0
     1   2   3       4    5    6      7       8     9
the data shows that:
(a) there is a high regularity in travel & purchase behaviour
(b) travellers buy in small increments and short-terms
(c) most purchases happen upon refused entry
(2) do travellers make the correct decisions?
compare actual purchases to the optimal (per traveller)

how:
(a) clean data
(b) build & search on a tree ~ sequence of choices
data cleaning overview




                          83-days   83-days


the “arrow” of time
data cleaning overview


                                    origin = destination




                             83-days                       83-days


the “arrow” of time


                      no purchase observed                 no purchase observed




                                                               -20% of users
how: build a tree with each user's mobility data
where a node is a purchase (expire, cost)
that is expanded when it has expired
(reduced) example:

               PAYG,             7-day             30-day
               £aa.aa            £bb.bb            £cc.cc
how: build a tree with each user's mobility data
where a node is a purchase (expire, cost)
that is expanded when it has expired
(reduced) example:

                PAYG,                   7-day                  30-day
                £aa.aa                  £bb.bb                 £cc.cc




      PAYG,              30-day
      £aa.aa   7-day     £cc.cc
               £bb.bb

                                     we reduce the space-complexity of
                                  searching on this tree by implementing
                                     expansion rules, pruning heuristics
the cheapest sequence of fares can then be
compared to what the user actually spent


               PAYG,
               £aa.aa




     PAYG,
     £aa.aa




     7-day
     £bb.bb




              30-day
              £cc.cc
the cheapest sequence of fares can then be
compared to what the user actually spent


               PAYG,
               £aa.aa


                        in each 83-day dataset, the 5% sample of users
                                 where overspending by ~ £2.5 million
     PAYG,
     £aa.aa                  An estimate of how much everybody (100%)
                        is overspending during an entire year (365 days)
                                                   is thus £200 million

     7-day
     £bb.bb




              30-day
              £cc.cc
overspending comes from
(a) failing to predict one's own mobility needs
                ...but we have observed that mobility is predictable

(b) failing to match mobility with fares (in a complex fare system)
                       ...which is an easy problem for a computer

can we help travellers?
recommender systems
aim to match users to items that will be of interest to them
recommender systems
aim to match users mobility profiles to items fares that will be of
interest the cheapest for them
two prediction problems
(in the paper) first, predict how a person is going to travel
(focus) then, predict what the cheapest fare will be
key fact:
the factors that influence cost on public transport are not
determined by your actual movements (a to b) but by generic
features that are city-dependent (e.g., Zone 1 - 2)
three steps
    1. for a given set of travel histories, compute the cheapest
fare (by tree expansion)
    2. reduce each travel history into a set of generic features,
describing the mobility (next slide)
    3. train classifiers to predict the cheapest fare given the set
of features
we have a set of {d, f, b, r, pt, ot, N} = F

                   where

              d = number of trips
           f = average trips per day
 b / r = proportion of trips on the bus / rail
pt / ot = proportion of peak & off-peak trips
             N = zone O-D matrix
           F = cheapest fare (label)
two baselines, three algorithms:
0. baseline – everyone on pay as you go
1. naïve bayes – estimating probabilities
2. k-nearest neighbours – looking at similar profiles
3. decision trees (C4.5) – recursively partitions data to infer rules
4. oracle – perfect knowledge
Accuracy (%)                     Savings (GBP)
              Dataset 1         Dataset 2      Dataset 1          Dataset 2
Baseline           74.99             76.91       326,447.95         306,145.85
Naïve Bayes        77.46             80.71       393,585.81         369,232.24
k-NN (5)           96.74             97.09       465,822.17         426,375.85
C4.5               98.01             98.29       473,918.38         434,082.81
Oracle             100                   100     479,583.91         438,923.30
N. Lathia, L. Capra. Mining Mobility Data to Minimise Travellers'
Spending on Public Transport. In ACM KDD 2011, San Diego, USA.

questions
(1) what is the relation between how we travel & how we spend?
(2) do travellers make the correct decisions? (no)
(3) can we help them with recommendations? (yes)
recommendations for
urban & transport contexts
      neal lathia (@neal_lathia)
      ucl media futures seminar
             may 25, 2011

Contenu connexe

Similaire à Mobility Mining for Fare Recommendation

UCL Bite-Sized Lunch Lecture
UCL Bite-Sized Lunch LectureUCL Bite-Sized Lunch Lecture
UCL Bite-Sized Lunch LectureNeal Lathia
 
Volunteered Geographic Information and OpenStreetMap
Volunteered Geographic Information and OpenStreetMapVolunteered Geographic Information and OpenStreetMap
Volunteered Geographic Information and OpenStreetMapchippy
 
Webinar: Using smart card and GPS data for policy and planning: the case of T...
Webinar: Using smart card and GPS data for policy and planning: the case of T...Webinar: Using smart card and GPS data for policy and planning: the case of T...
Webinar: Using smart card and GPS data for policy and planning: the case of T...BRTCoE
 
Rides Request Demand Forecast- OLA Bike
Rides Request Demand Forecast- OLA BikeRides Request Demand Forecast- OLA Bike
Rides Request Demand Forecast- OLA BikeIRJET Journal
 
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...Ravi Kiran B.
 
IRJET - Cardless ATM
IRJET -  	  Cardless ATMIRJET -  	  Cardless ATM
IRJET - Cardless ATMIRJET Journal
 
CREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLING
CREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLINGCREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLING
CREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLINGIRJET Journal
 
CNN MODEL FOR TRAFFIC SIGN RECOGNITION
CNN MODEL FOR TRAFFIC SIGN RECOGNITIONCNN MODEL FOR TRAFFIC SIGN RECOGNITION
CNN MODEL FOR TRAFFIC SIGN RECOGNITIONIRJET Journal
 
Safeguarding Abila: Spatio-Temporal Activity Modeling
Safeguarding Abila: Spatio-Temporal Activity ModelingSafeguarding Abila: Spatio-Temporal Activity Modeling
Safeguarding Abila: Spatio-Temporal Activity ModelingParang Saraf
 
Turning Oyster Cards into Information
Turning Oyster Cards into InformationTurning Oyster Cards into Information
Turning Oyster Cards into InformationNeal Lathia
 
Stefan Michalak - Portfolio - January 2016
Stefan Michalak - Portfolio - January 2016Stefan Michalak - Portfolio - January 2016
Stefan Michalak - Portfolio - January 2016stefan michalak
 
A car sharing auction with temporal-spatial OD connection conditions
A car sharing auction with temporal-spatial OD connection conditionsA car sharing auction with temporal-spatial OD connection conditions
A car sharing auction with temporal-spatial OD connection conditionsharapon
 
Le rôle de l’intelligence géospatiale dans la reprise économique
Le rôle de l’intelligence géospatiale dans la reprise économiqueLe rôle de l’intelligence géospatiale dans la reprise économique
Le rôle de l’intelligence géospatiale dans la reprise économiqueCARTO
 
Cities2.0 Ict2008 Daniel Kaplan
Cities2.0 Ict2008 Daniel KaplanCities2.0 Ict2008 Daniel Kaplan
Cities2.0 Ict2008 Daniel KaplanACIDD
 
CARLI Usage Stats Keynote 20130325
CARLI Usage Stats Keynote 20130325CARLI Usage Stats Keynote 20130325
CARLI Usage Stats Keynote 20130325Jason Price, PhD
 
ZhangTorkkolaLiSchreinerZhangGardnerZhao(04279048)
ZhangTorkkolaLiSchreinerZhangGardnerZhao(04279048)ZhangTorkkolaLiSchreinerZhangGardnerZhao(04279048)
ZhangTorkkolaLiSchreinerZhangGardnerZhao(04279048)Harry Zhang
 
Shortest Path Search with pgRouting
Shortest Path Search with pgRoutingShortest Path Search with pgRouting
Shortest Path Search with pgRoutingFOSS4G 2011
 
Shortest Path search for real road networks with pgRouting
Shortest Path search for real road networks with pgRoutingShortest Path search for real road networks with pgRouting
Shortest Path search for real road networks with pgRoutingDaniel Kastl
 
Using geobrowsers for thematic mapping
Using geobrowsers for thematic mappingUsing geobrowsers for thematic mapping
Using geobrowsers for thematic mappingBjorn Sandvik
 

Similaire à Mobility Mining for Fare Recommendation (20)

UCL Bite-Sized Lunch Lecture
UCL Bite-Sized Lunch LectureUCL Bite-Sized Lunch Lecture
UCL Bite-Sized Lunch Lecture
 
Volunteered Geographic Information and OpenStreetMap
Volunteered Geographic Information and OpenStreetMapVolunteered Geographic Information and OpenStreetMap
Volunteered Geographic Information and OpenStreetMap
 
Webinar: Using smart card and GPS data for policy and planning: the case of T...
Webinar: Using smart card and GPS data for policy and planning: the case of T...Webinar: Using smart card and GPS data for policy and planning: the case of T...
Webinar: Using smart card and GPS data for policy and planning: the case of T...
 
Rides Request Demand Forecast- OLA Bike
Rides Request Demand Forecast- OLA BikeRides Request Demand Forecast- OLA Bike
Rides Request Demand Forecast- OLA Bike
 
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
 
IRJET - Cardless ATM
IRJET -  	  Cardless ATMIRJET -  	  Cardless ATM
IRJET - Cardless ATM
 
CREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLING
CREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLINGCREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLING
CREDIT CARD FRAUD DETECTION USING PREDICTIVE MODELLING
 
CNN MODEL FOR TRAFFIC SIGN RECOGNITION
CNN MODEL FOR TRAFFIC SIGN RECOGNITIONCNN MODEL FOR TRAFFIC SIGN RECOGNITION
CNN MODEL FOR TRAFFIC SIGN RECOGNITION
 
Safeguarding Abila: Spatio-Temporal Activity Modeling
Safeguarding Abila: Spatio-Temporal Activity ModelingSafeguarding Abila: Spatio-Temporal Activity Modeling
Safeguarding Abila: Spatio-Temporal Activity Modeling
 
Turning Oyster Cards into Information
Turning Oyster Cards into InformationTurning Oyster Cards into Information
Turning Oyster Cards into Information
 
Stefan Michalak - Portfolio - January 2016
Stefan Michalak - Portfolio - January 2016Stefan Michalak - Portfolio - January 2016
Stefan Michalak - Portfolio - January 2016
 
A car sharing auction with temporal-spatial OD connection conditions
A car sharing auction with temporal-spatial OD connection conditionsA car sharing auction with temporal-spatial OD connection conditions
A car sharing auction with temporal-spatial OD connection conditions
 
Le rôle de l’intelligence géospatiale dans la reprise économique
Le rôle de l’intelligence géospatiale dans la reprise économiqueLe rôle de l’intelligence géospatiale dans la reprise économique
Le rôle de l’intelligence géospatiale dans la reprise économique
 
Cities2.0 Ict2008 Daniel Kaplan
Cities2.0 Ict2008 Daniel KaplanCities2.0 Ict2008 Daniel Kaplan
Cities2.0 Ict2008 Daniel Kaplan
 
CARLI Usage Stats Keynote 20130325
CARLI Usage Stats Keynote 20130325CARLI Usage Stats Keynote 20130325
CARLI Usage Stats Keynote 20130325
 
SSRN-id2718694
SSRN-id2718694SSRN-id2718694
SSRN-id2718694
 
ZhangTorkkolaLiSchreinerZhangGardnerZhao(04279048)
ZhangTorkkolaLiSchreinerZhangGardnerZhao(04279048)ZhangTorkkolaLiSchreinerZhangGardnerZhao(04279048)
ZhangTorkkolaLiSchreinerZhangGardnerZhao(04279048)
 
Shortest Path Search with pgRouting
Shortest Path Search with pgRoutingShortest Path Search with pgRouting
Shortest Path Search with pgRouting
 
Shortest Path search for real road networks with pgRouting
Shortest Path search for real road networks with pgRoutingShortest Path search for real road networks with pgRouting
Shortest Path search for real road networks with pgRouting
 
Using geobrowsers for thematic mapping
Using geobrowsers for thematic mappingUsing geobrowsers for thematic mapping
Using geobrowsers for thematic mapping
 

Plus de Neal Lathia

Everything around the NLP (London.AI Feb 2021)
Everything around the NLP (London.AI Feb 2021)Everything around the NLP (London.AI Feb 2021)
Everything around the NLP (London.AI Feb 2021)Neal Lathia
 
Using machine learning for customer service (Data Talks Club)
Using machine learning for customer service (Data Talks Club)Using machine learning for customer service (Data Talks Club)
Using machine learning for customer service (Data Talks Club)Neal Lathia
 
Using language models to supercharge Monzo’s customer support
 Using language models to supercharge Monzo’s customer support Using language models to supercharge Monzo’s customer support
Using language models to supercharge Monzo’s customer supportNeal Lathia
 
Making Better Decisions Faster
Making Better Decisions FasterMaking Better Decisions Faster
Making Better Decisions FasterNeal Lathia
 
Machine Learning, Faster
Machine Learning, FasterMachine Learning, Faster
Machine Learning, FasterNeal Lathia
 
AI & Personalised Experiences
AI & Personalised ExperiencesAI & Personalised Experiences
AI & Personalised ExperiencesNeal Lathia
 
Opportunities & Challenges in Personalised Travel
Opportunities & Challenges in Personalised TravelOpportunities & Challenges in Personalised Travel
Opportunities & Challenges in Personalised TravelNeal Lathia
 
Bootstrapping a Destination Recommendation Engine
Bootstrapping a Destination Recommendation EngineBootstrapping a Destination Recommendation Engine
Bootstrapping a Destination Recommendation EngineNeal Lathia
 
Machine Learning for Product Managers
Machine Learning for Product ManagersMachine Learning for Product Managers
Machine Learning for Product ManagersNeal Lathia
 
Mining Smartphone Data (with Python)
Mining Smartphone Data (with Python)Mining Smartphone Data (with Python)
Mining Smartphone Data (with Python)Neal Lathia
 
Data Science in Digital Health
Data Science in Digital HealthData Science in Digital Health
Data Science in Digital HealthNeal Lathia
 
Analysing Daily Behaviours with Large-Scale Smartphone Data
Analysing Daily Behaviours with Large-Scale Smartphone DataAnalysing Daily Behaviours with Large-Scale Smartphone Data
Analysing Daily Behaviours with Large-Scale Smartphone DataNeal Lathia
 
Cambridge Quantified Self Meetup
Cambridge Quantified Self MeetupCambridge Quantified Self Meetup
Cambridge Quantified Self MeetupNeal Lathia
 
Data Science in #mHealth
Data Science in #mHealthData Science in #mHealth
Data Science in #mHealthNeal Lathia
 
Tube Star: Crowd-Sourced Experiences on Public Transport
Tube Star: Crowd-Sourced Experiences on Public Transport Tube Star: Crowd-Sourced Experiences on Public Transport
Tube Star: Crowd-Sourced Experiences on Public Transport Neal Lathia
 
Emotion Sense: From Design to Deployment
Emotion Sense: From Design to DeploymentEmotion Sense: From Design to Deployment
Emotion Sense: From Design to DeploymentNeal Lathia
 
Opportunities and Challenges of Using Smartphones for Health Monitoring and I...
Opportunities and Challenges of Using Smartphones for Health Monitoring and I...Opportunities and Challenges of Using Smartphones for Health Monitoring and I...
Opportunities and Challenges of Using Smartphones for Health Monitoring and I...Neal Lathia
 
The Ubhave Framework
The Ubhave FrameworkThe Ubhave Framework
The Ubhave FrameworkNeal Lathia
 
Contextual Dissonance: Design Bias in Sensor-Based Experience Sampling Methods
Contextual Dissonance: Design Bias in Sensor-Based Experience Sampling MethodsContextual Dissonance: Design Bias in Sensor-Based Experience Sampling Methods
Contextual Dissonance: Design Bias in Sensor-Based Experience Sampling MethodsNeal Lathia
 
The Ubhave Project (Part 1/2)
The Ubhave Project (Part 1/2)The Ubhave Project (Part 1/2)
The Ubhave Project (Part 1/2)Neal Lathia
 

Plus de Neal Lathia (20)

Everything around the NLP (London.AI Feb 2021)
Everything around the NLP (London.AI Feb 2021)Everything around the NLP (London.AI Feb 2021)
Everything around the NLP (London.AI Feb 2021)
 
Using machine learning for customer service (Data Talks Club)
Using machine learning for customer service (Data Talks Club)Using machine learning for customer service (Data Talks Club)
Using machine learning for customer service (Data Talks Club)
 
Using language models to supercharge Monzo’s customer support
 Using language models to supercharge Monzo’s customer support Using language models to supercharge Monzo’s customer support
Using language models to supercharge Monzo’s customer support
 
Making Better Decisions Faster
Making Better Decisions FasterMaking Better Decisions Faster
Making Better Decisions Faster
 
Machine Learning, Faster
Machine Learning, FasterMachine Learning, Faster
Machine Learning, Faster
 
AI & Personalised Experiences
AI & Personalised ExperiencesAI & Personalised Experiences
AI & Personalised Experiences
 
Opportunities & Challenges in Personalised Travel
Opportunities & Challenges in Personalised TravelOpportunities & Challenges in Personalised Travel
Opportunities & Challenges in Personalised Travel
 
Bootstrapping a Destination Recommendation Engine
Bootstrapping a Destination Recommendation EngineBootstrapping a Destination Recommendation Engine
Bootstrapping a Destination Recommendation Engine
 
Machine Learning for Product Managers
Machine Learning for Product ManagersMachine Learning for Product Managers
Machine Learning for Product Managers
 
Mining Smartphone Data (with Python)
Mining Smartphone Data (with Python)Mining Smartphone Data (with Python)
Mining Smartphone Data (with Python)
 
Data Science in Digital Health
Data Science in Digital HealthData Science in Digital Health
Data Science in Digital Health
 
Analysing Daily Behaviours with Large-Scale Smartphone Data
Analysing Daily Behaviours with Large-Scale Smartphone DataAnalysing Daily Behaviours with Large-Scale Smartphone Data
Analysing Daily Behaviours with Large-Scale Smartphone Data
 
Cambridge Quantified Self Meetup
Cambridge Quantified Self MeetupCambridge Quantified Self Meetup
Cambridge Quantified Self Meetup
 
Data Science in #mHealth
Data Science in #mHealthData Science in #mHealth
Data Science in #mHealth
 
Tube Star: Crowd-Sourced Experiences on Public Transport
Tube Star: Crowd-Sourced Experiences on Public Transport Tube Star: Crowd-Sourced Experiences on Public Transport
Tube Star: Crowd-Sourced Experiences on Public Transport
 
Emotion Sense: From Design to Deployment
Emotion Sense: From Design to DeploymentEmotion Sense: From Design to Deployment
Emotion Sense: From Design to Deployment
 
Opportunities and Challenges of Using Smartphones for Health Monitoring and I...
Opportunities and Challenges of Using Smartphones for Health Monitoring and I...Opportunities and Challenges of Using Smartphones for Health Monitoring and I...
Opportunities and Challenges of Using Smartphones for Health Monitoring and I...
 
The Ubhave Framework
The Ubhave FrameworkThe Ubhave Framework
The Ubhave Framework
 
Contextual Dissonance: Design Bias in Sensor-Based Experience Sampling Methods
Contextual Dissonance: Design Bias in Sensor-Based Experience Sampling MethodsContextual Dissonance: Design Bias in Sensor-Based Experience Sampling Methods
Contextual Dissonance: Design Bias in Sensor-Based Experience Sampling Methods
 
The Ubhave Project (Part 1/2)
The Ubhave Project (Part 1/2)The Ubhave Project (Part 1/2)
The Ubhave Project (Part 1/2)
 

Dernier

Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 

Dernier (20)

Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 

Mobility Mining for Fare Recommendation

  • 1. recommendations for urban & transport contexts neal lathia (@neal_lathia) ucl media futures seminar may 25, 2011
  • 2. research: personalisation to aide mobility in cities i.e., getting from a to b (habit) finding z (discovery)
  • 3.
  • 4.
  • 5. sensing mobility: 5%-sample, 2 x 83-days time-stamped location (entry, exit), modality payments (top-ups, travel cards) card-types (e.g., student)
  • 6. what tools can we design to help travellers? previously: (getting from a to b) N. Lathia, J. Froehlich, L. Capra. Mining Public Transport Usage for Personalised Intelligent Transport Systems. In IEEE ICDM 2010, Sydney, Australia. (discovering z) D. Quercia, N. Lathia, F. Calabrese, G. Di Lorenzo, J. Crowcroft. Recommending Social Events from Mobile Phone Location Data. In IEEE ICDM 2010, Sydney, Australia.
  • 7. there is more to urban mobility than just moving. context: purpose/intent, events, disruptions, cost, social connections. for example,
  • 9. who are you? how? when? where cash or travel card? to? how long for? what route?
  • 10. N. Lathia, L. Capra. Mining Mobility Data to Minimise Travellers' Spending on Public Transport. In ACM KDD 2011, San Diego, USA. questions (1) what is the relation between how we travel & how we spend? (2) do travellers make the correct decisions? (no) (3) can we help them with recommendations? (yes)
  • 11. (%) pay as you go purchases 49.8 < 5 GBP 24.2 5 – 10 GBP 15.5 10 – 20 GBP ` (%) travel card purchases 70.8 7-day travel card 15.8 1-month travel card 11.6 7-day bus/tram pass Purchase Behaviour 30 Travel 25 Cards PAYG 20 % Purchases 15 10 5 0 Mon Tue Wed Thu Fri Sat Sun
  • 12. Purchase Geography Mobility Flow 45 Zone 1 40 PAYG Zone 2 Travel Cards Zone 3 35 Zone 4 30 Zone 5 Zone 6 25 arrive 20 15 10 5 depart 0 1 2 3 4 5 6 7 8 9
  • 13. the data shows that: (a) there is a high regularity in travel & purchase behaviour (b) travellers buy in small increments and short-terms (c) most purchases happen upon refused entry
  • 14. (2) do travellers make the correct decisions? compare actual purchases to the optimal (per traveller) how: (a) clean data (b) build & search on a tree ~ sequence of choices
  • 15. data cleaning overview 83-days 83-days the “arrow” of time
  • 16. data cleaning overview origin = destination 83-days 83-days the “arrow” of time no purchase observed no purchase observed -20% of users
  • 17. how: build a tree with each user's mobility data where a node is a purchase (expire, cost) that is expanded when it has expired (reduced) example: PAYG, 7-day 30-day £aa.aa £bb.bb £cc.cc
  • 18. how: build a tree with each user's mobility data where a node is a purchase (expire, cost) that is expanded when it has expired (reduced) example: PAYG, 7-day 30-day £aa.aa £bb.bb £cc.cc PAYG, 30-day £aa.aa 7-day £cc.cc £bb.bb we reduce the space-complexity of searching on this tree by implementing expansion rules, pruning heuristics
  • 19. the cheapest sequence of fares can then be compared to what the user actually spent PAYG, £aa.aa PAYG, £aa.aa 7-day £bb.bb 30-day £cc.cc
  • 20. the cheapest sequence of fares can then be compared to what the user actually spent PAYG, £aa.aa in each 83-day dataset, the 5% sample of users where overspending by ~ £2.5 million PAYG, £aa.aa An estimate of how much everybody (100%) is overspending during an entire year (365 days) is thus £200 million 7-day £bb.bb 30-day £cc.cc
  • 21. overspending comes from (a) failing to predict one's own mobility needs ...but we have observed that mobility is predictable (b) failing to match mobility with fares (in a complex fare system) ...which is an easy problem for a computer can we help travellers?
  • 22. recommender systems aim to match users to items that will be of interest to them
  • 23. recommender systems aim to match users mobility profiles to items fares that will be of interest the cheapest for them
  • 24. two prediction problems (in the paper) first, predict how a person is going to travel (focus) then, predict what the cheapest fare will be
  • 25. key fact: the factors that influence cost on public transport are not determined by your actual movements (a to b) but by generic features that are city-dependent (e.g., Zone 1 - 2)
  • 26. three steps 1. for a given set of travel histories, compute the cheapest fare (by tree expansion) 2. reduce each travel history into a set of generic features, describing the mobility (next slide) 3. train classifiers to predict the cheapest fare given the set of features
  • 27. we have a set of {d, f, b, r, pt, ot, N} = F where d = number of trips f = average trips per day b / r = proportion of trips on the bus / rail pt / ot = proportion of peak & off-peak trips N = zone O-D matrix F = cheapest fare (label)
  • 28. two baselines, three algorithms: 0. baseline – everyone on pay as you go 1. naïve bayes – estimating probabilities 2. k-nearest neighbours – looking at similar profiles 3. decision trees (C4.5) – recursively partitions data to infer rules 4. oracle – perfect knowledge
  • 29. Accuracy (%) Savings (GBP) Dataset 1 Dataset 2 Dataset 1 Dataset 2 Baseline 74.99 76.91 326,447.95 306,145.85 Naïve Bayes 77.46 80.71 393,585.81 369,232.24 k-NN (5) 96.74 97.09 465,822.17 426,375.85 C4.5 98.01 98.29 473,918.38 434,082.81 Oracle 100 100 479,583.91 438,923.30
  • 30. N. Lathia, L. Capra. Mining Mobility Data to Minimise Travellers' Spending on Public Transport. In ACM KDD 2011, San Diego, USA. questions (1) what is the relation between how we travel & how we spend? (2) do travellers make the correct decisions? (no) (3) can we help them with recommendations? (yes)
  • 31. recommendations for urban & transport contexts neal lathia (@neal_lathia) ucl media futures seminar may 25, 2011