The prediction of road user behaviors in the context of autonomous driving has attracted considerable attention from the scientific community in recent years. Most works focus on predicting behaviors based on kinematic information alone, a simplification of reality since road users are humans, and as such they are highly influenced by their surrounding context. In addition, a large plethora of research works rely on powerful Deep Learning techniques, which exhibit high-performance metrics in prediction tasks but may lack the ability to fully understand and exploit the contextual semantic information contained in the road scene, not to mention their inability to provide explainable predictions that can be understood by humans. In this work, we propose an explainable road users’ behavior prediction system that integrates the reasoning abilities of Knowledge Graphs (KG) and the expressiveness capabilities of Large Language Models (LLM) by using Retrieval Augmented Generation (RAG) techniques. For that purpose, Knowledge Graph Embeddings (KGE) and Bayesian inference are combined to allow the deployment of a fully inductive reasoning system that enables the issuing of predictions that rely on legacy information contained in the graph, as well as on current evidence gathered in real-time by onboard sensors. Two use cases have been implemented following the proposed approach: 1) Prediction of pedestrians’ crossing actions; and 2) Prediction of lane change maneuvers. In both cases, the performance attained exceeds the current state-of-the-art in terms of anticipation and F1 score, showing a promising avenue for future research in this field.
Read full abstract