The selection of the target variable is critical to the performance of classical car-following (CF) models. A vast body of literature addresses which target variable is optimal for classical CF models, but no study has evaluated the selection of optimal target variables for the black-box machine learning models that are increasingly used to model CF behavior. This paper evaluates different target variables, such as acceleration, velocity, and headway, for three black-box models: Gaussian processes (GP), kernel ridge regression, and long short-term memory (LSTM) networks. These models have different objective functions and work in different vector spaces; for example, GPs work in function space, whereas LSTMs work in parameter space. Furthermore, this paper evaluates the efficacy of black-box and classical CF models across datasets covering diverse traffic scenarios and human versus automated control. Finally, this paper evaluates the coupling of design choices across model type, target variable used during model calibration, and dataset characteristics. The experiments show that the optimal target variable recommendations for black-box models differ from those for classical CF models, depending on the objective function and the vector space. Statistical analysis indicates that model selection is strongly coupled with the choice of target variable, whereas the recommended target variables do not depend on the dataset used during model calibration. These results provide a path toward more robust CF models, potentially resulting in more accurate and generalizable capacity and/or traffic flow estimates for roadways, even as driving behaviors change with the introduction of new types of automation.