Abstract
Using traffic conflict-based models to estimate crash risk and evaluate the safety of road locations is a popular direction in road safety analysis. However, selecting an appropriate sample size remains a challenge in conflict-based crash risk modeling. Reliable conflict-based crash risk models typically require a large sample, which is often difficult to collect. Further, the bias-variance trade-off of model estimation is a constant concern when choosing a sample size. This study proposes an approach for identifying an adequate sample size for conflict-based crash risk estimation models. The appropriate sample size is determined by checking model convergence and goodness-of-fit. To inspect model convergence, both the trace plots and the variation tendencies of the Brooks-Gelman-Rubin statistics of the parameter simulation chains are examined. To assess goodness-of-fit, a quantitative approach for objective testing is developed, and a graphical method is also used. If the model has not converged or fits poorly, additional samples are required. The proposed method was applied to identify the adequate sample size for a Bayesian hierarchical extreme value theory (EVT) block maxima (BM) model using traffic conflict data from four signalized intersections in the city of Surrey, British Columbia. The modified time to collision (MTTC) indicator was used to delineate traffic conflicts. A series of stationary and non-stationary Bayesian hierarchical BM models were developed using the cycle-level maxima of negated MTTC. The adequate sample sizes of the stationary and non-stationary Bayesian hierarchical BM models were determined separately. Further, two methods of increasing the sample size (i.e., extending the observation period and combining data from different sites) were compared in terms of goodness-of-fit as well as crash estimate accuracy and precision.
The results show that for both stationary and non-stationary models, the sample size used is adequate for model convergence and goodness-of-fit. Moreover, adding covariates to the stationary Bayesian hierarchical BM model does not affect the required sample size. For non-stationary models, extending the observation period outperforms combining data from different sites in terms of goodness-of-fit as well as crash estimation accuracy and precision. This is likely related to unmeasured factors that could impair model estimation and inference when data from several sites are merged to increase the number of samples. Overall, the findings of this study can be applied to examine whether the available data are adequate and to determine how much additional data is required to produce reliable statistical inference.
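
As an illustration of the convergence check described above, the following is a minimal sketch of the Brooks-Gelman-Rubin statistic (potential scale reduction factor) for a single parameter. It is not the study's code: the function name `gelman_rubin`, the synthetic chains, and the 1.1 cutoff mentioned in the comments are assumptions chosen for illustration, the last being a common rule of thumb rather than a value taken from the study.

```python
import numpy as np


def gelman_rubin(chains):
    """Brooks-Gelman-Rubin statistic for one parameter.

    chains: array of shape (m, n), i.e. m simulation chains of n draws each.
    (Hypothetical helper written for this illustration.)
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    # W: mean of the within-chain sample variances
    W = chains.var(axis=1, ddof=1).mean()
    # B: between-chain variance, scaled by the chain length
    B = n * chain_means.var(ddof=1)
    # Pooled estimate of the posterior variance
    var_hat = (n - 1) / n * W + B / n
    return np.sqrt(var_hat / W)


# Synthetic chains (assumed data, not from the study)
rng = np.random.default_rng(0)
mixed = rng.normal(0.0, 1.0, size=(3, 2000))      # chains sampling the same distribution
stuck = mixed + np.array([[0.0], [2.0], [4.0]])   # chains stuck at different locations

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)

# Values close to 1 suggest convergence; values well above 1
# (1.1 is a common rule-of-thumb cutoff) suggest the chains disagree.
print(f"mixed chains: {rhat_mixed:.3f}")
print(f"stuck chains: {rhat_stuck:.3f}")
```

In a sample-size study of this kind, a statistic like this could be recomputed as more observations are added to see whether it settles near 1, alongside visual inspection of the trace plots.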