9. Middle trip and trip edges. [Figure: the trip divided into trip edges and middle trip; 85,190 trips ranging between 10 and 90 min.]
10. Fitting a model. The Poisson model doesn’t work (Mean < Variance). Fitting a negative-binomial model with Maximum Likelihood.
Good afternoon. Today I’m going to talk about the occurrence of undesired driving events. My name is Oren Musicant, and working with me on this project are Professor Edna Schechtman and Dr. Hillel Bar-Gera.
Undesired driving events are events where the driver exceeds a certain threshold of speed or acceleration, for example sudden braking and lane changing, excessive acceleration, sharp turning, and so on.
To identify the occurrence of driving events we used an in-vehicle data recorder named the Green-Box. The Green-Box, manufactured by GreenRoad Technologies, is able to identify driving events. It consists of sensors for acceleration and speed and a processing unit. The information is transmitted in real time to a server that analyzes it and generates a driver profile. A report generator provides the driver with feedback via text messaging, e-mail or web-based reporting. Real-time in-vehicle feedback is also available.
So, let me briefly walk you through the type of information that this technology can generate. In this web report we can see the days of the month along the X axis and the number of trips taken that day along the Y axis, with each square representing a specific trip. The trips are color coded by the trip safety level. A green trip has only a few events, and a red trip has many events. The accumulated data is then used to generate risk indices for the specific driver.
We know that the Green-Box information is correlated with the risk of crash involvement, and we also know that the information is very detailed and objective. But we don’t know how often undesired driving events occur, when, and by whom. We also don’t know what kind of statistical tools can be applied to analyze the occurrence of such events.
To answer these questions, we analyzed information on more than 100 thousand trips accumulated over 6 months by more than 100 drivers. For each trip we know the start and end time, the number of events and the driver gender.
First we evaluated whether the event frequency is approximately the same for every trip duration. We expected the answer to be yes, but in reality the event frequency is higher for short trips, and as the duration becomes longer the trend levels off. Now the question is: why? Why are we seeing this behavior in event frequency?
The answer is in this figure. The figure shows the event frequency at each minute from the beginning of the trip, for trips with a duration of 10 minutes. This graph shows the same information for trips with a duration of 15 minutes. We can see that the frequency is higher at the beginning and at the end of the trip. This phenomenon was repeated for other trip durations as well. So we have a constant event count in the trip edges, and when we divide this constant by the duration we get this shape.
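The arithmetic behind that shape can be sketched in a few lines. This is only an illustration: the edge event count, the middle-of-trip rate and the 5 edge minutes are made-up values, not numbers taken from the data, and the function name is hypothetical.

```python
def events_per_minute(duration_min, edge_events=1.5, middle_rate=0.1, edge_minutes=5):
    """Average events per minute for one trip.

    Assumptions (hypothetical values): the trip edges contribute a fixed
    number of events regardless of trip length, and the middle of the trip
    contributes events at a constant rate per minute.
    """
    middle_minutes = max(duration_min - edge_minutes, 0)
    total_events = edge_events + middle_rate * middle_minutes
    return total_events / duration_min


# The fixed edge count is spread over more minutes as trips get longer,
# so the frequency falls quickly at first and then levels off.
for duration in (10, 15, 30, 60, 90):
    print(duration, round(events_per_minute(duration), 3))
```

With these assumed values the frequency drifts down toward the middle-of-trip rate as the duration grows, which is the leveling trend described above.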
We decided to split the trips into 2 segments: trip edges and middle of the trip. Here you can see the event frequency in the first and last 5 minutes in over 80 thousand trips. Based on this information we decided that the edges will include 2 minutes from the trip beginning and 3 minutes from the trip ending. Interestingly, the last minute has an unexpected event frequency. This is probably caused by a delay of a few seconds in reporting the trip ending.
Next we set out to fit a theoretical model for the event frequency. Since this is count data, we first considered the Poisson distribution. Yet for every given trip duration the mean count of events is lower than the variance. This led to the choice of the negative-binomial model. We used the Maximum Likelihood method to fit the negative-binomial model for the event count in the trip edges and in the middle of the trip.
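A minimal sketch of this kind of fit, using only the Python standard library, is shown here. It is not the code used in the study. The counts are invented, the function names are hypothetical, and the golden-section search assumes the likelihood has a single peak inside the search bracket.

```python
import math


def nb_log_likelihood(counts, r, mu):
    """Log-likelihood of counts under a negative binomial with size r and mean mu."""
    log_p = math.log(r / (r + mu))   # log of the "success" probability p
    log_q = math.log(mu / (r + mu))  # log of 1 - p
    return sum(
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * log_p + k * log_q
        for k in counts
    )


def fit_negative_binomial(counts, r_lo=1e-3, r_hi=1e3, iters=100):
    """Return (r, mu) maximizing the likelihood; r is searched on a log scale."""
    n = len(counts)
    mu = sum(counts) / n  # for a fixed r, the ML estimate of the mean is the sample mean
    variance = sum((k - mu) ** 2 for k in counts) / n
    if variance <= mu:
        raise ValueError("variance <= mean: no overdispersion to model")

    def profile(log_r):
        return nb_log_likelihood(counts, math.exp(log_r), mu)

    # Golden-section search for the maximum over log(r).
    g = (math.sqrt(5) - 1) / 2
    a, b = math.log(r_lo), math.log(r_hi)
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = profile(c), profile(d)
    for _ in range(iters):
        if fc > fd:
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = profile(c)
        else:
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = profile(d)
    return math.exp((a + b) / 2), mu


# Hypothetical event counts for trips of one duration (mean below variance).
counts = [0, 0, 0, 0, 1, 0, 2, 0, 5, 1, 0, 0, 3, 0, 7, 0, 1, 0, 0, 4]
r_hat, mu_hat = fit_negative_binomial(counts)
print("size r:", round(r_hat, 3), "mean mu:", round(mu_hat, 3))
```

The size parameter r controls the extra variance: the variance of the fitted distribution is mu + mu²/r, so a small r means strong overdispersion and a very large r brings the model back toward Poisson.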
So let’s see how good the model fit is in the middle of the trip. The upper left graph shows the probability, in percent, of 0 events for each trip duration. The blue line shows the observed probability and the black lines indicate the model acceptance region. The figure on the upper right shows the probability of one event. And below you can see the probability of 2 and 3 events. So you can judge for yourself where the negative-binomial model has a good fit and where the fit is not so good.
Now, for the trip edges we used the Chi-square test to evaluate the model fit. The figure shows the observed and expected frequency for each cell. There is a remarkable resemblance, yet the formal test rejected the assumption that the observed distribution is actually negative-binomial. However, with such a large sample the assumption can be rejected very easily.
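The effect of sample size on the Chi-square test can be illustrated with a small sketch. All proportions here are invented for illustration. The example also assumes, for simplicity, that the model proportions were fixed in advance, so the degrees of freedom are the number of cells minus one; with fitted parameters the degrees of freedom would be lower.

```python
def chi_square_statistic(observed, expected):
    """Pearson Chi-square statistic for matching lists of cell counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical cell proportions for 0, 1, 2 and 3+ events.
MODEL_PROPS = [0.60, 0.25, 0.10, 0.05]
OBSERVED_PROPS = [0.61, 0.245, 0.098, 0.047]

# 5% critical value of the Chi-square distribution with 3 degrees of freedom.
CRITICAL_5PCT_DF3 = 7.815


def statistic_for_sample(n_trips):
    """Statistic when the same small deviation is seen in n_trips trips."""
    observed = [n_trips * p for p in OBSERVED_PROPS]
    expected = [n_trips * p for p in MODEL_PROPS]
    return chi_square_statistic(observed, expected)


for n_trips in (1_000, 85_000):
    stat = statistic_for_sample(n_trips)
    print(n_trips, round(stat, 2), "reject" if stat > CRITICAL_5PCT_DF3 else "do not reject")
```

The statistic grows in proportion to the number of trips, so a deviation of about one percentage point passes the test with a thousand trips and is rejected with tens of thousands.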
Next we looked at the event frequency at each time of day. As you can see, night time has more events than day time. Partitioning the day into 2 segments seems suitable here, and we chose the partition you see because it maximized the likelihood function.
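One way to picture choosing a partition by maximum likelihood is the sketch below. It is a simplification and not the procedure used in the study: it swaps in a plain Poisson rate for each segment instead of the negative-binomial model, the hourly data are invented, and the function names are hypothetical.

```python
import math


def segment_term(event_count, exposure_minutes):
    """Part of the Poisson log-likelihood that depends on the partition.

    With the rate set to its ML estimate (events / minutes), terms that are
    the same for every partition are dropped.
    """
    if event_count == 0:
        return 0.0
    return event_count * math.log(event_count / exposure_minutes)


def best_night_window(events, minutes):
    """Search all contiguous (wrap-around) hour windows for the best 'night'.

    Returns (log_likelihood_term, start_hour, end_hour) with end_hour
    exclusive, keeping only windows whose rate exceeds the rest of the day.
    Returns None if no window has a higher rate than its complement.
    """
    all_hours = set(range(24))
    best = None
    for start in range(24):
        for length in range(1, 24):
            night = {(start + i) % 24 for i in range(length)}
            day = all_hours - night
            night_events = sum(events[h] for h in night)
            night_minutes = sum(minutes[h] for h in night)
            day_events = sum(events[h] for h in day)
            day_minutes = sum(minutes[h] for h in day)
            if night_events / night_minutes <= day_events / day_minutes:
                continue
            ll = segment_term(night_events, night_minutes) + segment_term(day_events, day_minutes)
            if best is None or ll > best[0]:
                best = (ll, start, (start + length) % 24)
    return best


# Hypothetical hourly totals: more events per driving minute from 20:00 to 05:00.
minutes = [1000] * 24
events = [30 if (h >= 20 or h < 5) else 10 for h in range(24)]
_, night_start, night_end = best_night_window(events, minutes)
print("night segment:", night_start, "to", night_end)
```

Searching every window is cheap here because there are only 24 possible start hours and 23 possible lengths.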
Differences in event frequency between the days of the week did not seem meaningful in comparison to the differences in the previous figure, so we didn’t consider day of the week an important variable to include in the model.
In this figure you can see the result of a negative-binomial regression. Driver gender and time of day are the explanatory variables for event frequency. In both trip segments, the interaction between gender and time of day is significant. When moving from day to night, males’ event frequency increased more prominently than females’.
To summarize: prior to this work we knew that event frequency is a surrogate for safety and that the Green-Box information is detailed and objective. Now we know that there are differences in event frequency between trip segments, between males’ and females’ trips, and between times of day. We also demonstrated how closely the distribution of event frequency resembles the negative-binomial distribution.
I would like to thank our sponsors from the RAN-NAOR foundation and the Paul Ivanier center, and I would also like to thank GreenRoad Technologies for enabling unlimited direct access to the Green-Box information.