To investigate pedestrian route choice mechanisms from a perspective distinct from that of discrete choice models (DCMs), this study applies machine learning models and interprets them with SHapley Additive exPlanations (SHAP). The data come from several two-route pedestrian flow experiments recorded by an unmanned aerial vehicle (UAV). Our findings indicate that logistic regression (similar to a binary logit model) is computationally efficient but less accurate in prediction than other machine learning models. Among the 12 machine learning models assessed, eXtreme Gradient Boosting (XGB) and Light Gradient Boosting (LGB) strike the best balance between accuracy and computational efficiency, as measured by a newly proposed indicator named OP. Regarding feature contribution, bottlenecks exert the strongest influence on pedestrian route choice, followed by the time pedestrians take to return from the end of the route to the origin (a feature reflecting pedestrian characteristics and attitudes). The pedestrian density of the shorter route contributes less than bottlenecks and return time, but it exhibits a threshold effect: once the density of the shorter route exceeds a certain threshold, most pedestrians opt for the longer route.