Abstract

Predicting travelers' mode choice has long been of keen interest to transportation planners. Traditionally, mode choice analysis is conducted with statistical models or simple machine learning (ML) paradigms. Although statistical approaches have a sound theoretical basis and good interpretability, they rest on several unrealistic assumptions about the distribution of the data, which may lead to biased model predictions. The ML methods widely used for this task, on the other hand, have poor interpretability and fail to capture behavioral aspects. To fill this gap, this study proposes a systematic ML framework for better understanding travelers' mode choice decisions. Five ML models (Logistic Regression, Random Forests, Decision Tree, Multilayer Perceptron, and Light Gradient Boosting Decision Tree (LightGBDT)) were developed to model travelers' mode choices using three years of Dutch National Travel Survey data. Empirical results across several performance evaluation metrics (overall accuracy, average precision, and precision-recall curves) showed that LightGBDT outperformed the other models under both under-sampling and over-sampling strategies. To address the black-box criticism of ML models and improve their interpretability, variable importance and SHAP dependency analyses were also conducted. The analysis showed that the predictors that significantly influence travelers' mode decisions include trip distance, travelers’ age and annual income, the number of cars and bicycles owned, and trip density. The results can be used for better understanding and more effective modeling of travelers’ mode choice preferences.
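As a rough illustration of two ingredients named above, the following sketch shows random under- and over-sampling of an imbalanced set of mode labels, and a one-class average precision score. It is not the study's implementation: the abstract does not say which resampling technique was used, so plain random resampling is an assumption, and the function names `random_resample` and `average_precision` and the mode labels are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_resample(rows, labels, strategy, seed=0):
    """Balance classes by random under- or over-sampling (assumed technique)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    sizes = [len(members) for members in by_class.values()]
    if strategy == "under":
        target = min(sizes)   # shrink every class to the rarest class size
    elif strategy == "over":
        target = max(sizes)   # grow every class to the most common class size
    else:
        raise ValueError("strategy must be 'under' or 'over'")

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if strategy == "under":
            chosen = rng.sample(members, target)  # without replacement
        else:
            extra = [rng.choice(members) for _ in range(target - len(members))]
            chosen = members + extra              # duplicates drawn with replacement
        out_rows.extend(chosen)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def average_precision(y_true, scores):
    """Mean of the precision values at each positive, ranked by descending score."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Hypothetical, made-up labels purely for illustration.
labels = ["car"] * 6 + ["bike"] * 3 + ["walk"] * 1
rows = list(range(len(labels)))

_, under_labels = random_resample(rows, labels, "under")
_, over_labels = random_resample(rows, labels, "over")
print(Counter(under_labels))
print(Counter(over_labels))
print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

Resampling of this kind would normally be applied to the training split only, so that evaluation metrics such as average precision are computed on data with the original class distribution.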
