In recent years, deep learning methods have shown strong potential for reducing channel state information (CSI) feedback overhead and improving feedback accuracy, both of which are needed to realize the performance benefits of massive multiple-input multiple-output (MIMO) systems in frequency division duplex (FDD) mode. However, when CSI matrices are rearranged into sequences for input to a Transformer model, the original physical location relationships among their elements are lost. To address this problem, this paper proposes a spatio-temporal joint Transformer decoder (ST-T). We employ a spatial attention mechanism to compensate for this information loss and to focus more accurately on key spatial features, further exploiting the potential of single- and two-layer Transformers in reconstructing CSI matrices. Simulations with DCRNet and CLNet encoders show that the proposed decoder achieves higher performance at a lower computational load than other lightweight models.