In the realm of Intelligent Transportation Systems (ITSs), traffic flow prediction is crucial for multiple applications. The primary challenge in traffic flow prediction lies in modeling the intricate spatial–temporal correlations inherent in transport data. In recent years, many studies have focused on developing various Spatial–Temporal Graph Neural Networks (STGNNs), and researchers have also begun to explore the application of transformers to capture spatial–temporal correlations in traffic data. However, GNN-based methods mainly model spatial correlations statically, which significantly limits their capacity to discover dynamic and long-range spatial patterns, while transformer-based methods have yet to extract a comprehensive representation of traffic data features. To explore dynamic spatial dependencies and comprehensively characterize traffic data, the Spatial–Temporal Fusion Embedding Transformer (STFEformer) is proposed for traffic flow prediction. Specifically, we propose a fusion embedding layer that captures and fuses both native information and spatial–temporal features, aiming to achieve a comprehensive representation of traffic data characteristics. We then introduce a spatial self-attention module designed to enhance the detection of dynamic and long-range spatial correlations by focusing on interactions between similar nodes. Extensive experiments conducted on three real-world datasets demonstrate that STFEformer significantly outperforms various baseline models, notably achieving up to a 5.6% reduction in Mean Absolute Error (MAE) on the PeMS08 dataset compared to the next-best model. Furthermore, ablation experiments and visualizations are employed to clarify and highlight the contributions of the model's components.
STFEformer represents a meaningful advancement in traffic flow prediction, potentially influencing future research and applications in ITSs by providing a more robust framework for managing and analyzing traffic data.
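The abstract does not give the exact formulation of the spatial self-attention module, but its core idea, letting every node attend to every other node so that dynamic, long-range dependencies emerge from the data rather than from a fixed adjacency matrix, can be illustrated with a generic scaled dot-product attention applied across the node dimension. The projection matrices and dimensions below are illustrative assumptions, not the paper's actual parameters.

```python
import numpy as np

def spatial_self_attention(x, wq, wk, wv):
    """Scaled dot-product attention across the node (sensor) dimension.

    x: (num_nodes, d_model) features for all sensors at one time step.
    Unlike a GNN layer restricted to a static adjacency matrix, the
    (num_nodes, num_nodes) attention map is recomputed from the current
    features, so node-to-node weights adapt dynamically and can span
    arbitrarily distant nodes.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])        # node-to-node affinities
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)       # each row sums to 1
    return attn @ v, attn

# Toy example: 8 sensors, 16-dimensional embeddings (hypothetical sizes).
rng = np.random.default_rng(0)
n_nodes, d_model = 8, 16
x = rng.standard_normal((n_nodes, d_model))
wq, wk, wv = (rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
              for _ in range(3))
out, attn = spatial_self_attention(x, wq, wk, wv)
```

Because the attention weights are a function of the node embeddings, nodes with similar traffic patterns attend strongly to each other regardless of their distance on the road graph, which is the behavior the abstract attributes to the module.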