As an essential part of intelligent transportation system (ITS), traffic forecasting has provided crucial role for traffic management and risk assessment. However, complex spatial–temporal dependencies, heterogeneity, dynamicity, and periodicity of traffic data influence the traffic forecasting performance. Consequently, we propose a novel effective gated spatial–temporal merged transformer (GSTMT) inspired by multimask and dual branch for accurate traffic forecasting in this paper. Specifically, we first conduct a concatenation of gated spatial static mask transformer (GSSMT) and gated spatial dynamic mask transformer (GSDMT) with residual network. The GSSMT and GSDMT evolve from the traditional transformer by making preferable modifications that include gated linear unit (GLU), multimask mechanism including static mask matrix (SMM) and dynamic mask matrix (DMM), and spatial attention (SA). Among them, GLU is to promote the performance of capturing spatial dependency, dynamicity, and heterogeneity due to advanced performance for controlling information flow through layers. Additionally, by developing multimask mechanism including two novel SMM and DMM, the proposed GSTMT can precisely model the static and dynamic spatial structure for effectively highlighting static dependency and dynamicity. And SA is injected for enhancing the ability of capturing spatial dependency of GSSMT and GSDMT. Secondly, we develop a dual‐branch gated temporal transformer (DBGTT) for capturing temporal dependency, heterogeneity, dynamicity, and periodicity via incorporating the GLU and mixed time series decomposition (MTD) into traditional transformer. Similarly, we also introduce the GLU for empowering DBGTT with capability of capturing temporal dependency, dynamicity, and heterogeneity. In addition, MTD, which brings dual‐branch mechanism, can enhance the DBGTT for capturing more detailed temporal information via exploiting global and periodic profile of traffic data. At last, some experiments, which are performed on several real‐world traffic datasets, demonstrate the better results over classic traffic forecasting methods.
Read full abstract