4. 1. Parking Data
What?
in-street sensors, mobile cameras, cell phones,
payments, violations, policies, maps, surveys,
traffic flow, special events, satellites (pollution), …
5. 1. Parking Data
Why?
Use cases
1. set parking policy: prices, demarcation, …
2. guide enforcement officers to offenders
3. guide drivers to the best vacancies
Value
1. less time wasted finding a space
2. less pollution, better health
3. access to local businesses
4. fair and transparent pricing
Challenge
maximise the utility (value minus cost) of the data
6. 1. Parking Data
Who?
Many cities have deployed smart parking systems since 2010
San Francisco, London, Moscow, …
Our contributions
Since 2012, we have deployed 6 new technologies in 3 major cities
Los Angeles. Basic pricing method, time-of-week subdivision, real-time pricing as
rate, optimal learning while selling, surveys, effectiveness evaluation, non-payment
evaluation (and not deployed: real-time parking guidance, real-time pricing as integral)
Washington DC. Spatiotemporal sampling: sensor allocation and reconstruction,
spatial queueing for demarcation decisions
Berkeley. Fusion of temporal sampling with payment data
Awards for this work included ITS Innovation, IPI Innovation and MIT Top-50
8. Zoeter et al., New Algorithms for Parking Demand Management. Proc. 20th ACM KDD, 2014
Glasnapp et al., Understanding Dynamic Pricing for Parking in Los Angeles. Intl. Conf. HCI in Business, 2014
2. Pricing
Context. Demand-Based Pricing
9. 2. Pricing
Problem
Learn on-street parking prices to make the city happier
⇒ maximize the rate at which people get value from the system (not revenue)
Challenges
1. Model or forecast for value when driver behavior varies in 5 dimensions
frequency, location, arrival, duration, legality-fraction
2. Ensure simplicity so drivers remember and city officials can explain
in the face of huge variations in demand in space and time
3. How big should price increments be?
too large ⇒ prices might oscillate from month-to-month
too small ⇒ system may have no useful effect
10. 2. Pricing
Model for Value
Goal. Choose appropriate reward function
If more people are parked, then
• more people get value
• but the distance to a space increases
For a geometric distribution of vacancies,
distance to a space = 𝑓 / (1 − 𝑓)
where the occupancy fraction is 𝑓
But this is singular as 𝑓 → 1!
[Figure: mean spaces to first vacancy vs. occupancy fraction, geometric distribution]
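The 𝑓/(1 − 𝑓) formula can be checked with a short Monte Carlo sketch (the setup, with each space independently occupied with probability 𝑓, is an illustrative assumption):

```python
import random

def mean_spaces_to_vacancy(f, n_trials=100_000, seed=0):
    """Monte Carlo estimate of the mean number of occupied spaces
    passed before reaching the first vacancy, when each space is
    independently occupied with probability f."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trials):
        k = 0
        while rng.random() < f:  # space occupied: keep driving
            k += 1
        total += k
    return total / n_trials

for f in (0.5, 0.8, 0.9):
    est = mean_spaces_to_vacancy(f)
    print(f"f = {f}: simulated {est:.2f}, formula f/(1-f) = {f/(1-f):.2f}")
```

The simulated means match 𝑓/(1 − 𝑓) closely, and the rapid growth near 𝑓 = 1 shows the singularity directly.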
11. 2. Pricing
Distance to a Space
The singularity is unrealistic as occupancy fractions vary spatially
[Figure: mean spaces to first vacancy vs. occupancy fraction, geometric distribution vs. real data; inset: spatial autocorrelation of occupancy fraction (LA 2012 data)]
12. 2. Pricing
Simple Valuation Model
valuation rate (per space, per unit time)
= constant per person parked − k × distance travelled per arrival
[Figure: valuation rate vs. occupancy fraction]
Gradient Ascent
Move up the gradient of the total valuation w.r.t. price 𝒑, so
new-price − old-price ∼ (𝝏/𝝏𝒑) ∬𝓡 valuation-rate(𝒑, 𝒕, 𝒙) 𝒅𝒙 𝒅𝒕
⇒ simple for machine-learning scientists, but NOT for citizens and officials…
13. 2. Pricing
Simple Valuation Model
valuation rate (per space, per unit time)
= constant per person parked − k × distance travelled per arrival
Gradient ascent on the total valuation is simple for machine-learning scientists, but NOT for citizens and officials…
Towards a simpler rule
Maximizing the black curve (the valuation rate) is nearly the same as maximizing the red approximation, whose gradients are −1, 0 or +1
[Figure: valuation rate vs. occupancy fraction, with its piecewise-linear approximation]
14. 2. Pricing
Voting Rule
Voting Rule
If 𝑯 − 𝑳 > 0.3 then increase the price
If 𝑳 − 𝑯 > 0.3 then decrease the price
Prices are on a ladder $0.50, $1, $1.50, $2, $3, …, $7 per hour
Definition
The high vote, 𝑯, is the fraction of time that the system is over 90% occupied
The low vote, 𝑳, is the fraction of time that the system is under 70% occupied
15. 2. Pricing (A Tale of Two Cities)
Comparison
Average Occupancy Rule (used in San Francisco)
If average occupancy fraction > 𝟎. 𝟖 then increase the price by $0.25
If average occupancy fraction < 𝟎. 𝟔 then decrease the price by $0.25
                             Scenario A    Scenario B
average occupancy fraction   0.67          0.67
average occupancy rule       price same    price same
voting rule                  price down    price up
mean distance to vacancy     2 spaces      11 spaces
[Figure: occupancy fraction over time for Scenarios A and B]
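A minimal sketch of both rules, applied to two hypothetical occupancy traces with the same mean of 0.67 in the spirit of the Scenario A/B comparison (the traces themselves are illustrative, not the LA data):

```python
def votes(occupancy, hi=0.9, lo=0.7):
    """H = fraction of time over 90% occupied; L = fraction under 70%."""
    H = sum(f > hi for f in occupancy) / len(occupancy)
    L = sum(f < lo for f in occupancy) / len(occupancy)
    return H, L

def voting_rule(occupancy, margin=0.3):
    H, L = votes(occupancy)
    if H - L > margin:
        return "price up"
    if L - H > margin:
        return "price down"
    return "price same"

def average_occupancy_rule(occupancy):
    mean = sum(occupancy) / len(occupancy)
    if mean > 0.8:
        return "price up"
    if mean < 0.6:
        return "price down"
    return "price same"

# Scenario A: steady, moderate occupancy all day.
scenario_a = [0.67] * 100
# Scenario B: nearly full 70% of the day, almost empty the rest.
scenario_b = [0.92] * 70 + [0.087] * 30

for name, occ in [("A", scenario_a), ("B", scenario_b)]:
    print(f"Scenario {name}: average rule -> {average_occupancy_rule(occ)}, "
          f"voting rule -> {voting_rule(occ)}")
```

The average-occupancy rule cannot tell the two scenarios apart (both means are 0.67), while the voting rule lowers the price in A and raises it in B.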
16. 2. Pricing - How did prices ($/HR) change?
Area is proportional to the sum over spaces of the
number of hours at the given price
Before (1st June 2012) → After (1st January 2013)
[Figure: prices before range over $0.50 to $5 per hour; prices after range over $1 to $4 per hour]
50% of prices decreased
yet revenue increased by 12%
17. 2. Pricing
Does it work?
Price increase from $4/hour to $5/hour
[Figure: occupancy time series for 701 South Olive Street, one row = one weekday (Mon-Fri, not Sat-Sun); red = full (no vacancy), yellow and green = more availability, blue = underused]
… but is this
• the impact of the price change
• a change in sensor signal-processing
• a lucky coincidence
• something else?
19. Handicapped placards
• 80% of parking in highly-congested areas now goes for free to handicapped placard users
• Many such drivers may be using placards illegally
• Changing the law takes time, but is ongoing
Minimum price
• The minimum acceptable price ($0.5 per hour) was already reached in under-used areas
• But occupancy there has continued to increase substantially
• It’s hard to distinguish between economic improvement and long-term price-change latency
Political acceptance
Unlike SFPark, LA ExpressPark receives positive press coverage and continues to expand today
Clinchant et al., Using Analytics to Understand On-Street Parking. Proc. 22nd World Congress on ITS, 2015
2. Pricing
Does it work? Results for 2013-16
21. 3. Sampling
Motivation
Problem
LA’s sensors are too expensive. Can we combine sensing methods to ensure high-quality data while saving 90% of the costs?
Solutions
1. Spatial sampling: don’t observe all stalls
2. Temporal sampling: don’t observe all the time
Deployed in Washington DC and Berkeley
Dance, Lean Smart Parking. The Parking Professional, vol. 30, no. 6, 2014
22. 3. Sampling
Problem P1
Informally
Given
• a normally-distributed discrete-time time-series
• noisy measurements that come at a cost
Question
When should you measure so as to minimise the cost
of prediction errors plus measurement costs?
[Figure: black = true time-series (unobserved), red = forecast standard deviation, blue = costly measurements]
Formally
Time-series: X_{t+1} = A X_t + N(0, Q)
Actions: a_t = 1 for a good measurement, cost c; a_t = 0 for a poor measurement, no cost
Measurements: Y_t = B X_t + N(0, R(a_t))
History: H_t = (a_1, a_2, …, a_{t−1}, Y_1, Y_2, …, Y_{t−1})
Policy: a_t = π(H_t)
Forecasts: X̂_t = E[X_t | H_t], given by the Kalman filter
Objective: min_π E Σ_{t=1}^∞ γ^t ( ‖X_t − X̂_t‖² + c·a_t )
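A hedged sketch of a scalar instance of P1 under a threshold policy on the predictive variance (all parameter values, and the choice of threshold policy, are illustrative assumptions):

```python
import random

# Scalar instance of P1: x_{t+1} = a*x_t + N(0, q), measurement
# y_t = x_t + N(0, r), where r depends on whether we pay for a
# good measurement. A threshold policy on the predictive variance
# decides when to pay.

def simulate(threshold, a=0.95, q=1.0, r_good=0.5, r_poor=25.0,
             cost=2.0, gamma=0.98, steps=2000, seed=1):
    rng = random.Random(seed)
    x, xhat, v = 0.0, 0.0, q       # true state, filter mean, filter variance
    total, disc = 0.0, 1.0
    for _ in range(steps):
        # Predict (time update); xhat is now E[X_t | H_t].
        x = a * x + rng.gauss(0.0, q ** 0.5)
        xhat, v = a * xhat, a * a * v + q
        # Accrue discounted prediction error plus measurement cost.
        pay = v >= threshold
        total += disc * ((x - xhat) ** 2 + (cost if pay else 0.0))
        disc *= gamma
        # Measure (good or poor) and do the Kalman update.
        r = r_good if pay else r_poor
        y = x + rng.gauss(0.0, r ** 0.5)
        k = v / (v + r)
        xhat, v = xhat + k * (y - xhat), (1.0 - k) * v
    return total

for z in (1.0, 3.0, 9.0):
    print(f"threshold {z}: discounted cost = {simulate(z):.1f}")
```

Sweeping the threshold trades measurement spend against prediction error, which is exactly the objective above.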
23. 3. Sampling
Problem P1: Examples
Parking
time-series occupancy of a block face
measurements from mobile cameras and payment data
Military
time-series position of a submarine
measurements by sonar
Telecommunications
time-series position of a handset
measurements with 5G antenna
24. 3. Sampling
Problem P1: Related Work
P1 addresses the basic machine learning trade-off between
• the cost of data acquisition
• the cost of errors due to a lack of data
in a particularly simple way
If we solve P1, then we also solve
• “the LQG control problem with costly measurements”(Meier et al, 1967)
The continuous-time version of this problem was solved only recently (Le Ny et al, 2011)
Niño-Mora and Villar (2009) conjectured that an optimal policy for P1 is a threshold policy
• i.e. measure if and only if the posterior variance exceeds a threshold.
Meier et al., Optimal control of measurement subsystems. IEEE TAC, 1967
Le Ny et al., Scheduling continuous-time Kalman filters. IEEE TAC, 2011
Niño-Mora and Villar, Multi-target tracking via restless bandit marginal productivity indices. IEEE CDC, 2009
25. 3. Sampling
Attention Mechanism Problem
Street 1
Street 2
Street 3
observation times
Given
• 𝑛 time-series to track, as in P1
• with 𝑚 sensors, where 𝑚 < 𝑛
Question
Which time-series should you measure at each time, so
as to minimise the total prediction error?
Discussion
• This problem has state space ℝ^n and (n choose m) actions!
• Nevertheless, Whittle (1988) proposed a
computationally-efficient policy for this problem for
large 𝑚, 𝑛
• But to compute that policy, we must first solve P1
Example
• 4 cameras observing 800 streets in
Washington DC
• This was our original motivation for
this work
Whittle, Restless bandits: activity allocation in a changing world. J. App. Prob., 1988
26. 3. Sampling
Attention Mechanism Problem
Claim. Assuming P1 is solved, Whittle’s policy does much better than other heuristics
Example. 10 time-series, 1 sensor, weights on predictive variance 𝑤1 = 40, 𝑤2:10 = 1
[Figure: weighted prediction error (colour) for each time-series over time 𝑡, under each policy]
Myopic policy
Observes the time-series
with the largest weighted
predictive variance
(often used in radar tracking)
Round-robin policy
Observes time-series 1, 2,
…, 10, 1, 2, …, 10, …
Whittle’s policy
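The two baseline heuristics can be sketched on a hypothetical scalar instance with weights w₁ = 40, w₂:₁₀ = 1 (the dynamics and noise parameters are assumptions, not the talk's setup). Since Kalman variance updates do not depend on the observed values, the comparison is deterministic:

```python
# n scalar time-series tracked by one sensor; we compare the average
# weighted predictive variance under two sensor-scheduling heuristics.

def run(policy, n=10, a=1.0, q=1.0, r=0.2, steps=2000):
    w = [40.0] + [1.0] * (n - 1)     # weights on predictive variance
    v = [q] * n                      # predictive variances
    total = 0.0
    for t in range(steps):
        i = policy(t, w, v)          # which series to observe
        for j in range(n):
            if j == i:               # Kalman update, then predict
                v[j] = a * a * (v[j] * r / (v[j] + r)) + q
            else:                    # predict only
                v[j] = a * a * v[j] + q
        total += sum(wj * vj for wj, vj in zip(w, v))
    return total / steps

# Myopic: observe the series with the largest weighted predictive variance.
myopic = lambda t, w, v: max(range(len(v)), key=lambda j: w[j] * v[j])
# Round-robin: observe series 0, 1, ..., n-1, 0, 1, ...
round_robin = lambda t, w, v: t % len(v)

print(f"myopic      : {run(myopic):.1f}")
print(f"round-robin : {run(round_robin):.1f}")
```

Whittle's index policy itself requires the solution to P1 (the next slides), so only the two heuristics are sketched here.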
27. 3. Sampling
Solution to P1
Dance and Silander, When are Kalman-Filter Restless Bandits Indexable? NIPS, 2015
Dance and Silander, Optimal Policies for Observing Time Series, in review, JMLR, 2017 (see arXiv)
Theorem (Dance and Silander, 2017)
1. A threshold policy is optimal for Problem P1
2. This result holds for many cost functions
minimum predictive variance,
minimum predictive entropy,
maximum predictive precision, …
3. There is a simple polynomial-time algorithm for approximating the threshold
28. 3. Sampling
Key to the proof
Examine the behaviour of the system under a
threshold policy
• The state 𝑥𝑡 is given by the predictive variance
• Its dynamics are given by the Kalman filter
variance updates, which are nonlinear
𝜙0 - no measurement (below threshold)
𝜙1 - measurement made (above threshold)
• So, for threshold 𝑧 we have
x_{t+1} = f(x_t; z) := φ0(x_t) if x_t < z, φ1(x_t) if x_t ≥ z
[Figure: the map f, with action 0 taken below the threshold and action 1 above]
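Iterating this map makes the threshold behaviour concrete. A minimal sketch, where φ0 and φ1 are the Kalman variance updates without and with a measurement (the parameter values a, q, r are illustrative assumptions):

```python
def phi0(v, a=1.2, q=1.0):
    """No measurement: the variance grows under the prediction step."""
    return a * a * v + q

def phi1(v, a=1.2, q=1.0, r=0.5):
    """Measurement made: Kalman update, then predict."""
    return a * a * (v * r / (v + r)) + q

def action_word(z, n=40):
    """Iterate the map-with-gaps from x_1 = z, recording the actions."""
    v, word = z, []
    for _ in range(n):
        if v < z:
            word.append(0)
            v = phi0(v)
        else:
            word.append(1)
            v = phi1(v)
    return "".join(map(str, word))

for z in (1.5, 2.0, 5.0):
    print(f"z = {z}: {action_word(z)}")
```

Different thresholds settle into different periodic action sequences (all 1s, alternating 10, 100, …), foreshadowing the mechanical words of the next slides.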
29. 3. Sampling
Insights: Maps-with-Gaps
The behaviour of iterated function systems
𝑥𝑡+1 = 𝑓(𝑥𝑡)
has been extensively studied when 𝑓 is smooth
But our 𝑓(⋅; 𝑧) is discontinuous
Such maps-with-gaps are also important as models of:
• switching in electrical circuits
• neural spiking behaviour
• gene regulatory networks, …
30. 3. Sampling
Insights: Words
How does the action sequence generated by our
map change
• as we vary the threshold 𝑧
• from initial state 𝑥1 = 𝑧?
The action sequence is an infinite word on the
alphabet {0,1}
Question
What types of word does our map generate?
[Figure: actions generated as the threshold 𝑧 varies; black = action 0, white = action 1, with time 𝑡 on the horizontal axis and threshold 𝑧 on the vertical axis]
31. 3. Sampling
Answer: Mechanical Words
Definition
A mechanical word is an infinite binary string whose 𝑛th letter is
𝑤_𝑛 = ⌊𝑎(𝑛 + 1)⌋ − ⌊𝑎𝑛⌋ for some 𝑎 in [0,1]
Examples
𝑎 = 1/2 ⇒ the word 01 01 01 01…
𝑎 = 3/8 ⇒ the word 00100101 00100101 00100101…
Mechanical words correspond to the slopes of digital straight lines
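A minimal sketch of the definition, using the floor-based (lower mechanical word) convention 𝑤_𝑛 = ⌊𝑎(𝑛 + 1)⌋ − ⌊𝑎𝑛⌋ with 𝑛 starting at 0:

```python
from math import floor
from fractions import Fraction

def mechanical_word(a, n=24):
    """First n letters of the lower mechanical word of slope a."""
    a = Fraction(a)
    return "".join(str(floor(a * (k + 1)) - floor(a * k)) for k in range(n))

print(mechanical_word(Fraction(1, 2)))   # alternating 01s
print(mechanical_word(Fraction(3, 8)))   # 00100101 repeated
```

For rational slope p/q the word is periodic with period q and exactly p ones per period, which is why the density of 1s equals the slope of the corresponding digital straight line.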
Relation to the literature
• Kozyakin (2003) found general conditions under which nonlinear maps-with-gaps generate
mechanical words
• However, the relationship between the choice of threshold and the word generated was only
discovered for linear maps-with-gaps by Rajpathak et al (2012)
• Our work extends this threshold-to-word relationship to nonlinear maps
[Figure: a digital straight line, with the letters 0 and 1 labelling its horizontal and diagonal steps]
Kozyakin, Sturmian sequences generated by order-preserving circle maps. Inst. Information Trans., RAS, 2003
Rajpathak et al, Analysis of stable periodic orbits in the one-dimensional linear discontinuous map. Chaos, 2012
32. 3. Sampling
Open Questions
• What happens if there are more than 2 types of measurements?
• What can be said in the multivariate Gaussian case?
• What about non-Gaussian time-series?
34. Outlook
1. Widespread adoption of demand-management technologies makes sense
2. Counting cars with computer vision seems most cost-effective
⇒ room for improvement in accuracy of on-street car counts
3. Forecasting non-demarcated parking remains challenging
4. Parking policy and guidance for autonomous vehicles
⇒ enable more effective mechanisms:
routing policies, reservations, options, lotteries, …
⇒ while making life simpler for citizens