Abstract
Identifying hazardous crash sites (or hotspots) is a crucial step in highway safety management. The Negative Binomial (NB) model is the most common model used in safety analyses and evaluations, including hotspot identification. The NB model, however, has limitations: it does not perform well when data are highly dispersed, include excess zero observations, or have a long tail. Recently, the Negative Binomial-Lindley (NB-L) model has been proposed as an alternative to the NB. The NB-L model overcomes several limitations of the NB, such as handling excess zero observations in highly dispersed data. However, it is not clear how the NB-L model performs in hotspot identification. In this paper, an innovative Monte Carlo simulation protocol was designed to generate a wide range of simulated data characterized by different means, dispersions, and percentages of zeros. Next, the NB-L model was written as a Full-Bayes hierarchical model and compared with the Full-Bayes NB model for hotspot identification across extensive simulation scenarios. Most previous studies focused on statistical fit and showed that the NB-L model fits the data better than the NB. In this research, however, we investigated the performance of the NB-L model in identifying hazardous sites. We showed that there is a trade-off between the NB-L and NB models when it comes to hotspot identification. Multiple performance metrics were used for the assessment. Among those, the results show that the NB-L model provides better specificity in identifying hotspots, while the NB model provides better sensitivity, especially for highly dispersed data. In other words, the NB model is better at capturing truly hazardous sites, whereas the NB-L model is better at avoiding the selection of non-hazardous sites as hazardous, which matters most when the budget is limited.
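To make the sensitivity and specificity trade-off concrete, the following is a minimal sketch of how the two metrics could be computed for a single simulated scenario. It assumes a simple top-k ranking rule for flagging hotspots; the function names (`flag_top_sites`, `sensitivity_specificity`) and the numbers are hypothetical illustrations, not the paper's protocol or results.

```python
import numpy as np

def flag_top_sites(scores, n_flag):
    """Flag the n_flag sites with the highest scores (assumed top-k rule)."""
    scores = np.asarray(scores, dtype=float)
    flagged = np.zeros(scores.shape, dtype=bool)
    if n_flag > 0:
        # argsort is ascending, so the last n_flag indices hold the largest scores
        flagged[np.argsort(scores, kind="stable")[-n_flag:]] = True
    return flagged

def sensitivity_specificity(true_hot, flagged):
    """Return (sensitivity, specificity) of flagged sites against true hotspots."""
    true_hot = np.asarray(true_hot, dtype=bool)
    flagged = np.asarray(flagged, dtype=bool)
    tp = np.sum(true_hot & flagged)    # hazardous sites that were flagged
    fn = np.sum(true_hot & ~flagged)   # hazardous sites that were missed
    tn = np.sum(~true_hot & ~flagged)  # safe sites correctly left alone
    fp = np.sum(~true_hot & flagged)   # safe sites wrongly flagged
    return tp / (tp + fn), tn / (tn + fp)

# Made-up illustration: known (simulated) true means and model estimates for six sites.
true_means = [0.5, 3.0, 1.0, 4.0, 0.2, 2.5]
estimates = [0.6, 2.0, 2.8, 3.5, 0.1, 2.4]

true_hot = flag_top_sites(true_means, n_flag=2)  # ground truth from the simulation
flagged = flag_top_sites(estimates, n_flag=2)    # sites a model would select
sens, spec = sensitivity_specificity(true_hot, flagged)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In a simulation study of this kind, the true hotspots are known by construction, so each model's flagged set can be scored directly; higher specificity means fewer non-hazardous sites consume a limited treatment budget.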