Abstract
This paper evaluates whether the associations between built environment variables and pedestrian crash frequency at the census block group level are linear or non-linear. A machine learning approach, the componentwise model-based gradient boosting algorithm, was used to estimate the nature and effects of sociodemographic, land use, road network, and traffic attributes on pedestrian crashes in Broward and Miami-Dade Counties, Florida. The algorithm can use different types of base-learners, including but not limited to decision trees (DT), generalized additive models (GAM), and Markov random fields (MRF). While gradient boosting with a DT base-learner has been widely used in safety studies, other base-learners and their performance in crash frequency prediction have yet to be explored. This study compared the performance of DT and GAM base-learners and included an MRF base-learner to account for spatial correlation among analysis units. Models fitted with the GAM base-learner performed better than those fitted with the DT base-learner, with several variables showing non-linear and several showing linear or approximately linear associations with pedestrian crash frequency. The results offer insights that can help urban planners and policy makers optimize pedestrian safety measures.
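To make the idea of componentwise boosting concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Poisson loss with a log link for the crash counts and uses one simple linear base-learner per covariate plus an intercept learner; the GAM, DT, and MRF base-learners named above are not reproduced. The function name `componentwise_boost` and the parameters `n_iter` and `nu` are hypothetical, chosen for illustration only. At each iteration every base-learner is fitted separately to the negative gradient of the loss, and only the best-fitting one is updated by a small step.

```python
import numpy as np

def componentwise_boost(X, y, n_iter=300, nu=0.1):
    """Componentwise gradient boosting sketch (assumed Poisson loss, log link).

    Assumes every column of X is non-constant and y has a positive mean.
    Returns (intercept, coefficients) on the original covariate scale.
    """
    n, p = X.shape
    x_mean = X.mean(axis=0)
    # Candidate base-learners: an intercept column plus one centered covariate each.
    Z = np.column_stack([np.ones(n), X - x_mean])
    col_ss = (Z ** 2).sum(axis=0)
    offset = np.log(y.mean())      # starting value of the additive predictor
    coef = np.zeros(p + 1)
    f = np.full(n, offset)
    for _ in range(n_iter):
        u = y - np.exp(f)          # negative gradient of the Poisson loss w.r.t. f
        betas = Z.T @ u / col_ss   # least-squares fit of each learner to u, separately
        rss = (u ** 2).sum() - betas ** 2 * col_ss
        j = int(np.argmin(rss))    # keep only the best-fitting component
        coef[j] += nu * betas[j]   # damped update with step length nu
        f += nu * betas[j] * Z[:, j]
    intercept = offset + coef[0] - x_mean @ coef[1:]
    return intercept, coef[1:]

# Synthetic illustration (made-up data, not from the study):
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(2000, 4))
y_demo = rng.poisson(np.exp(0.5 + 0.8 * X_demo[:, 0]))
intercept_demo, coef_demo = componentwise_boost(X_demo, y_demo)
```

Because only one component is updated per iteration, covariates that never improve the fit keep a coefficient of zero, which is how this family of algorithms performs variable selection. Swapping the linear learners for smooth or tree learners changes the shape of effect each covariate can take, not the selection loop.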