Predicting the future states of surrounding traffic participants and planning a safe, smooth, and socially compliant trajectory accordingly are crucial for autonomous vehicles (AVs). Current autonomous driving systems have two major issues: the prediction module is often separated from the planning module, and the cost function for planning is hard to specify and tune. To tackle these issues, we propose a differentiable integrated prediction and planning (DIPP) framework that can also learn the cost function from data. Specifically, our framework uses a differentiable nonlinear optimizer as the motion planner, which takes as input the trajectories of surrounding agents predicted by a neural network and optimizes the trajectory for the AV. This makes all operations differentiable, so the cost function weights can be learned as well. The proposed framework is trained on a large-scale real-world driving dataset to imitate human driving trajectories in the entire driving scene and is validated in both open-loop and closed-loop manners. The open-loop testing results reveal that the proposed method outperforms the baseline methods across a variety of metrics and delivers planning-centric prediction results, allowing the planning module to output trajectories close to those of human drivers. In closed-loop testing, the proposed method outperforms various baseline methods, showing the ability to handle complex urban driving scenarios and robustness against distributional shift. Importantly, we find that joint training of the planning and prediction modules achieves better performance than planning with a separately trained prediction module in both open-loop and closed-loop tests. Moreover, the ablation study indicates that the learnable components in the framework are essential to ensure planning stability and performance. Code and supplementary videos are available at https://mczhi.github.io/DIPP/.
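To make the idea of learning cost weights through a differentiable planner concrete, the following is a minimal toy sketch and not the DIPP implementation. It assumes a scalar lateral offset, a two-term quadratic cost, and an unrolled gradient-descent planner in place of a full nonlinear optimizer, with hand-derived sensitivities standing in for automatic differentiation. The names `plan`, `train`, `lane_center`, and `gap_target` are hypothetical; `gap_target` stands in for a quantity derived from the predicted trajectory of another agent.

```python
def plan(w_lane, w_gap, lane_center, gap_target, steps=50, lr=0.1):
    """Toy planner: unrolled gradient descent on a quadratic cost.

    cost(x) = w_lane * (x - lane_center)**2 + w_gap * (x - gap_target)**2

    Returns the planned offset x and the sensitivities of x to
    (w_lane, w_gap, gap_target), propagated through every descent step.
    """
    x = 0.0
    dx_dwl = dx_dwg = dx_dgap = 0.0
    for _ in range(steps):
        grad = 2 * w_lane * (x - lane_center) + 2 * w_gap * (x - gap_target)
        curv = 2 * (w_lane + w_gap)  # d(grad)/dx
        # Total derivative of grad w.r.t. each parameter (chain rule through x).
        dg_dwl = 2 * (x - lane_center) + curv * dx_dwl
        dg_dwg = 2 * (x - gap_target) + curv * dx_dwg
        dg_dgap = -2 * w_gap + curv * dx_dgap
        x -= lr * grad
        dx_dwl -= lr * dg_dwl
        dx_dwg -= lr * dg_dwg
        dx_dgap -= lr * dg_dgap
    return x, (dx_dwl, dx_dwg, dx_dgap)


def train(x_human, lane_center, gap_target, epochs=200, outer_lr=1.0):
    """Fit the two cost weights so the plan imitates x_human."""
    w_lane, w_gap = 1.0, 1.0
    for _ in range(epochs):
        # The third sensitivity (w.r.t. gap_target) is unused here; in a joint
        # setup it would carry the planning loss back into the predictor.
        x, (dx_dwl, dx_dwg, _) = plan(w_lane, w_gap, lane_center, gap_target)
        dloss_dx = 2 * (x - x_human)  # loss = (x - x_human)**2
        # Keep weights positive so the inner problem stays convex.
        w_lane = max(1e-3, w_lane - outer_lr * dloss_dx * dx_dwl)
        w_gap = max(1e-3, w_gap - outer_lr * dloss_dx * dx_dwg)
    return w_lane, w_gap


w_lane, w_gap = train(x_human=0.3, lane_center=0.0, gap_target=1.0)
x_plan, _ = plan(w_lane, w_gap, lane_center=0.0, gap_target=1.0)
print(f"learned weights: lane={w_lane:.3f}, gap={w_gap:.3f}, plan={x_plan:.3f}")
```

Because the planner output is differentiable with respect to both the weights and the prediction-derived input, an imitation loss on the plan can update either one. This is the property that joint training of prediction and planning relies on.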