Abstract
Predicting and understanding travellers' mode choices is crucial to developing urban transportation systems and formulating traffic demand management strategies. Machine learning (ML) methods have been widely used as promising alternatives to traditional discrete choice models owing to their high prediction accuracy. However, many ML methods, especially neural networks, are prone to overfitting and lack model interpretability. This study employs a neural network with feature selection to predict travel mode choices and Shapley additive explanations (SHAP) analysis to interpret the model. A dataset collected in Chengdu, China was used for experimentation. The results reveal that the neural network achieves commendable prediction performance, with a 12% improvement over the traditional multinomial logit model. Moreover, feature selection that combines the results of two embedded methods alleviates the neural network's tendency to overfit and makes the model more robust to redundant or unnecessary variables. Additionally, the SHAP analysis identifies travel expenditure, age, driving experience, number of cars owned, individual monthly income, and trip purpose as significant features in our dataset. Mode choice behaviour also differs significantly across demographic groups defined by age, car ownership, and income level.
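To make the idea behind the SHAP analysis concrete, the following is a minimal sketch of an exact Shapley value computation for a single prediction. It is not the study's implementation: the scoring function `toy_car_score`, its coefficients, the three input features, and the single `baseline` instance are all hypothetical and chosen only for illustration. Practical SHAP tooling typically approximates these values and averages over a background dataset rather than enumerating every feature subset against one baseline.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for instance x relative to a single baseline instance.

    Enumerates every feature subset, so it is only practical for a few features.
    """
    n = len(x)
    features = range(n)

    def value(subset):
        # Features in `subset` take their values from x; the rest from baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_car_score(features):
    # Hypothetical score for choosing "car"; coefficients are made up.
    cost, age, cars_owned = features
    return -0.5 * cost + 0.02 * age + 1.5 * cars_owned + 0.1 * cost * cars_owned


x = [10.0, 40.0, 2.0]        # hypothetical traveller: cost, age, cars owned
baseline = [5.0, 30.0, 0.0]  # hypothetical reference traveller
phi = shapley_values(toy_car_score, x, baseline)
```

The attributions in `phi` sum to the difference between the score for `x` and the score for `baseline`, which is the additivity property that lets per-feature contributions be read as a decomposition of one prediction.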