Numerous studies across various fields, including transportation and safety analysis, have established the existence of internal and external heterogeneity. Their findings underscore the complexity of crash data and the multifaceted nature of the risk factors involved in accidents. However, most studies consider the effects of unobserved heterogeneity from only one perspective, either within clusters (internal) or between clusters (external), and do not investigate the biases that both introduce simultaneously in crash frequency analysis. To fill this gap, this study proposes a hybrid approach that combines latent class cluster analysis with the random parameter negative binomial regression model (LCA-RPNB) to explore the association between risk factors and bicycle crash frequency. First, the bicycle crash data are categorized into three clusters using LCA, based on crash features such as gender, trip purpose, weather, and light conditions. Then, separate crash frequency models for each cluster, along with an overall model, are developed using RPNB, with regional factors of the crash locations as independent variables and the crash frequency of each cluster as the dependent variable. The hybrid approach enables a comprehensive, simultaneous examination of internal and external heterogeneity among bicycle crash frequency factors. Results suggest that the proposed hybrid approach exhibits superior fit and predictive performance compared with models that consider the effects of unobserved heterogeneity from only one perspective, as indicated by lower values of the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE). This approach can help policymakers and urban planners design more effective safety interventions by understanding the distinct needs of different bicyclist clusters and the specific factors that contribute to crash risk in each group.
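The four comparison measures named above can be illustrated with a minimal sketch. It uses the standard textbook definitions of AIC, BIC, MAE, and RMSE; the abstract does not state the exact formulas used in the study, so this is an assumption. The function names and all numbers in the usage example are hypothetical and do not come from the study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def mae(observed, predicted):
    """Mean Absolute Error between observed and predicted crash counts."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root Mean Squared Error between observed and predicted crash counts."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


if __name__ == "__main__":
    # Hypothetical values for illustration only, not results from the study.
    observed = [0, 2, 1, 4, 3]
    predicted = [0.5, 1.5, 1.0, 3.0, 3.5]
    print("AIC :", aic(log_likelihood=-120.0, n_params=6))
    print("BIC :", bic(log_likelihood=-120.0, n_params=6, n_obs=len(observed)))
    print("MAE :", mae(observed, predicted))
    print("RMSE:", rmse(observed, predicted))
```

AIC and BIC are computed from a fitted model's log-likelihood and penalize the number of parameters, while MAE and RMSE compare predicted with observed counts. A model that scores lower on all four is preferred on both fit and predictive accuracy.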