Abstract

Modeling behaviorally oriented activity-travel choices (ATC) is a principal part of travel demand analysis. Traditional econometric and rule-based methods require explicit model structures and complex domain knowledge. Several recent studies have used machine learning models, especially adversarial inverse reinforcement learning (IRL) models, to learn potential ATC patterns with fewer expert-designed settings, but they lack a clear representation of rational ATC behavior. In this study, we propose a data-driven IRL framework based on the maximum causal approach that minimizes f-divergences between the expert and agent state marginal distributions, which provides a more sample-efficient measurement. In addition, we specify a separate state-only reward function and derive an analytical gradient of the f-divergence objective with respect to the reward parameters to ensure good convergence. The method can recover a stationary reward function, which ensures that an agent trained from scratch approaches the expert behavior. We validate the proposed model on cellular signaling data from Chongqing, China, comparing it with baseline models (behavior cloning, policy-based, and reward-based models) on policy performance, reward recovery, and reward transfer tasks. The experimental results indicate that the proposed model outperforms existing methods and is less sensitive to the number of expert demonstrations. We also provide qualitative analyses of the fundamental ATC preferences over different features, given the reward function recovered from the observed mobility trajectories, and of the learning behavior under different choices of f-divergence.
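To make the central quantity concrete, the following is a minimal sketch of an f-divergence between expert and agent state marginal distributions, estimated from visit counts over a small discrete state space. It is an illustration only, not the model described above: the function names (`state_marginal`, `f_divergence`), the toy trajectories, the argument order D_f(expert || agent), and the three generator functions are all assumptions chosen for the example.

```python
import numpy as np

def state_marginal(trajectories, n_states):
    """Empirical state visitation distribution from lists of state indices."""
    counts = np.zeros(n_states)
    for traj in trajectories:
        for s in traj:
            counts[s] += 1
    return counts / counts.sum()

def f_divergence(p, q, f, eps=1e-12):
    """D_f(p || q) = sum_s q(s) * f(p(s) / q(s)), with f convex and f(1) = 0."""
    p = np.clip(p, eps, None)  # avoid division by zero and log(0)
    q = np.clip(q, eps, None)
    return float(np.sum(q * f(p / q)))

# Generator functions for three common choices of f (illustrative selection).
def kl(u):          # forward KL: KL(p || q)
    return u * np.log(u)

def reverse_kl(u):  # reverse KL: KL(q || p)
    return -np.log(u)

def js(u):          # Jensen-Shannon divergence
    return 0.5 * (u * np.log(u) - (u + 1.0) * np.log((u + 1.0) / 2.0))

# Hypothetical toy data: each trajectory is a sequence of state indices.
expert_trajs = [[0, 1, 2, 2], [0, 2, 2, 1]]
agent_trajs = [[0, 0, 1, 2], [0, 1, 1, 2]]

rho_expert = state_marginal(expert_trajs, n_states=3)
rho_agent = state_marginal(agent_trajs, n_states=3)

divergences = {
    name: f_divergence(rho_expert, rho_agent, f)
    for name, f in [("kl", kl), ("reverse_kl", reverse_kl), ("js", js)]
}
print(divergences)
```

In an IRL loop of this kind, the agent marginal would be re-estimated from rollouts of the current policy after each reward update, and the divergence would shrink as the agent's state visitation approaches the expert's.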
