1. Machine Learning and Optimization For Traffic and Emergency Resource Management. Milos Hauskrecht, Department of Computer Science, University of Pittsburgh. Students: Branislav Kveton, Tomas Singliar. UPitt collaborators: Louise Comfort, JS Lin. External: Eli Upfal (Brown), Carlos Guestrin (CMU).
The road network of a major city like Pittsburgh is an incredibly complex thing, both in structure and behavior. Things happen there at random * and at many places at the same time. So if we want to model it, we must deal with a complex spatial structure, involved interactions between traffic flows, and the high dimensionality of the data. And what happens is, of course, strongly confounded by variables we do not see, such as weather and time. With this in mind, we undertake the study of Pittsburgh's roads.
To successfully model the system's behavior, we need to get these three things right. We want to be able to capture the general nature of how traffic interacts spatially. The sensor coverage is not all that great, especially since sensors break; moreover, the sensors and cameras are not placed on the roads very densely. We can have a computer tell us the most likely picture of the traffic where we do not see it. Experts have intuitions from simulations, but nobody really knows what traffic does. For all these types of problems, the probabilistic modeling paradigm provides a well-understood and precise formulation, and often an algorithm.
Let us discuss for a while how the data is collected and what kind of structure it has. The Pennsylvania Department of Transportation, PennDOT for short, operates about 200 sensors in Pittsburgh. They look like this *: a pole at the side of the road that measures speed and the number of cars every 5 minutes. In this work we focus on car counts, but our models generalize directly to other quantities such as speed. In the picture, the red dots are the positions of the sensors.
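To make the data layout concrete, here is a minimal sketch of how such 5-minute readings might be organized; the column names, sensor IDs, date, and values are hypothetical illustrations, not PennDOT's actual format.

```python
import pandas as pd

# Hypothetical layout of 5-minute sensor readings: one row per
# (sensor, interval), with the vehicle count and average speed.
readings = pd.DataFrame({
    "sensor_id": ["S001", "S001", "S002"],
    "timestamp": pd.to_datetime([
        "2004-10-05 08:00", "2004-10-05 08:05", "2004-10-05 08:00",
    ]),
    "count": [112, 98, 57],        # vehicles per 5-minute interval
    "speed_mph": [54.2, 48.9, 61.0],
})

# Pivot to the matrix form the models below operate on:
# one column per sensor, one row per time interval.
X = readings.pivot(index="timestamp", columns="sensor_id", values="count")
print(X)
```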
Here we have a photo that we all recognize; it shows central Pittsburgh. I am going to draw the sensors as the bluish circles, and in particular I am going to talk about the highlighted sensor in the middle. The lines connecting the sensors roughly correspond to the major connectivity of the road network. Now, let us choose a direction of traffic, say southbound, and look at two sensors upstream and downstream from our highlighted sensor. Certainly, if the traffic backs up at the downstream sensor *, it may back up all the way to the Point State Park sensor. On the other hand, if there are a lot of cars coming from the upstream sensor, the road certainly will not be empty here *. But you will agree with me that once you know the status at these two places, and at any other places that feed traffic to our sensor, you know all there is to know about the surrounding situation. Right? It doesn't matter if traffic is jammed here *; if it flows freely here *, the traffic jam need not interest us as far as the highlighted sensor is concerned. In other words, the MARKOV PROPERTY holds, which implies the Markov random field is the correct model. These effects can become circular, and that circularity is at the heart of the trouble with Markov random field computational complexity. So basically, if you are going to hope for any simplification, you have to break the cycles. That will be the leitmotif for the rest of the talk, and I'll show you three different models that do it.
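To see the Markov property concretely, here is a minimal sketch of a Gaussian Markov random field on a hypothetical four-sensor chain; the precision-matrix values are made up for illustration and are not fit to any data.

```python
import numpy as np

# Hypothetical chain of four sensors: upstream -> A -> B -> downstream.
# In a Gaussian MRF, a zero entry in the precision (inverse covariance)
# matrix means the two sensors are conditionally independent given the
# rest -- exactly the Markov property described above.
Q = np.array([
    [ 2.0, -0.8,  0.0,  0.0],   # sensor 0 interacts only with sensor 1
    [-0.8,  2.0, -0.8,  0.0],
    [ 0.0, -0.8,  2.0, -0.8],   # sensor 2: our highlighted sensor
    [ 0.0,  0.0, -0.8,  2.0],
])

Sigma = np.linalg.inv(Q)
print(Sigma[0, 3])   # nonzero: marginally, even distant sensors correlate
print(Q[2, 0])       # zero: given its neighbors, sensor 2 ignores sensor 0
```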
Now, people do all sorts of analyses when it comes to traffic. For example, physicists look at phase transitions, which is really cool, since you certainly could say that traffic froze and it would be a good metaphor. But when it comes to modeling probabilities, typically the simplifying assumption is that everything is independent. [click] I told you why this is not the case on the previous slide, and I'll show you later how strong an assumption it is. So it would be good to retain as many dependencies as possible while maintaining that acyclicity property. What is a maximal acyclic graph? A tree, of course.
So here I drew one of the possible trees. We still lose some dependencies, though, and we would like to account for them. So we have another tree to account for them, whose edges are drawn in orange in this picture. Now, nobody had developed a machine learning algorithm for this, because we are working with continuous variables. But Marina Meila has developed something similar for discrete variables. So we sat down and redid her derivations for the continuous case, as sketched below. And I'm going to show you what kind of tree structures we learn.
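The tree-learning step can be made concrete under a Gaussian assumption (a natural reading of "continuous" here, though that is my assumption, not a claim about the authors' exact derivation): for jointly Gaussian variables the pairwise mutual information is I(i,j) = -(1/2) log(1 - rho_ij^2), so the maximum-likelihood tree is a maximum spanning tree over mutual-information weights, just as in Chow-Liu. Below is a minimal sketch of that single-tree step; the mixture model reweights and relearns such trees inside EM, which is omitted, and `gaussian_chow_liu` is an illustrative name rather than the authors' code.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def gaussian_chow_liu(X):
    """Learn the best-fitting tree over the columns of X (rows = time steps).

    For Gaussian variables, pairwise mutual information is
    I(i, j) = -0.5 * log(1 - rho_ij**2), so the maximum-likelihood tree
    is a maximum spanning tree of the mutual-information graph.
    """
    rho = np.corrcoef(X, rowvar=False)
    mi = -0.5 * np.log1p(-(rho ** 2) + 1e-12)  # small offset for stability
    np.fill_diagonal(mi, 0.0)
    # scipy only offers a *minimum* spanning tree, so negate the weights.
    mst = minimum_spanning_tree(-mi)
    return list(zip(*mst.nonzero()))  # (i, j) sensor pairs forming the tree

# Usage on synthetic stand-in data (in practice X would be sensor counts):
X = np.random.randn(500, 5) @ np.triu(np.ones((5, 5)))
print(gaussian_chow_liu(X))
```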
Now I will show you an alternative representation of the dependencies. We don't like the links, so we remove them and introduce abstract hidden variables. They indirectly represent the links we had there. The good thing about them is that they don't have to be just that; they may represent external variables such as the weather, which would have links to all sensors, or the presence of a Steelers game, which would have links around here *. The mathematical name for the concrete model we are using is the Mixture of Factor Analyzers (so it is again a mixture model).
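To make the model concrete, here is a minimal generative sketch of a mixture of factor analyzers; the dimensions, parameter values, and regime interpretations are made up for illustration, and the EM fitting procedure is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

n_sensors, n_factors, n_components = 200, 3, 2   # illustrative sizes

# One factor analyzer per mixture component: x = W z + mu + noise.
weights = np.array([0.7, 0.3])                              # P(component)
W   = rng.normal(size=(n_components, n_sensors, n_factors)) # factor loadings
mu  = rng.normal(size=(n_components, n_sensors))            # component means
psi = 0.1 * np.ones(n_sensors)                              # diagonal noise

def sample_traffic_snapshot():
    """Draw one city-wide vector of sensor readings from the MFA model."""
    c = rng.choice(n_components, p=weights)  # hidden regime (e.g. game day)
    z = rng.normal(size=n_factors)           # low-dimensional hidden causes
    return W[c] @ z + mu[c] + rng.normal(scale=np.sqrt(psi))

print(sample_traffic_snapshot().shape)   # (200,): one reading per sensor
```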