Fatigued driving is a leading cause of traffic accidents, and accurately predicting driver fatigue can significantly reduce their occurrence. However, neural-network-based fatigue detection models often suffer from poor interpretability and insufficient input feature dimensionality. This article proposes a novel Spatial-Frequency-Temporal Network (SFT-Net) for detecting driver fatigue from electroencephalogram (EEG) data. Our approach integrates the spatial, frequency, and temporal information of EEG signals to improve recognition performance. We transform the differential entropy of five EEG frequency bands into a 4D feature tensor that preserves all three types of information. An attention module then recalibrates the spatial and frequency information of each time slice of the input 4D feature tensor. The output of this module is fed into a depthwise separable convolution (DSC) module, which extracts the attention-fused spatial and frequency features. Finally, a long short-term memory (LSTM) network captures the temporal dependencies of the sequence, and the final features are output through a linear layer. We validate the effectiveness of our model on the SEED-VIG dataset, and experimental results demonstrate that SFT-Net outperforms other popular models for EEG-based fatigue detection. An interpretability analysis further shows that the model offers a degree of interpretability. Our work addresses the challenge of detecting driver fatigue from EEG data and highlights the importance of integrating spatial, frequency, and temporal information.
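As a concrete illustration of the 4D feature construction described above, the sketch below computes the differential entropy (DE) of band-filtered EEG segments under the common Gaussian assumption and stacks the per-channel, per-band values into a (time, band, height, width) tensor. The 2D scalp grid mapping, channel count, and segment layout here are illustrative assumptions, not the exact preprocessing used with SEED-VIG.

```python
import math

def differential_entropy(segment):
    """DE of one band-filtered EEG segment, assuming the samples are
    approximately Gaussian: DE = 0.5 * ln(2 * pi * e * sigma^2)."""
    n = len(segment)
    mean = sum(segment) / n
    var = sum((x - mean) ** 2 for x in segment) / n
    return 0.5 * math.log(2 * math.pi * math.e * var)

# Hypothetical scalp layout: maps each channel index to a (row, col)
# cell of an H x W grid (a real layout would follow electrode positions).
GRID = {0: (0, 1), 1: (1, 0), 2: (1, 2), 3: (2, 1)}
H, W = 3, 3
BANDS = 5  # delta, theta, alpha, beta, gamma

def build_4d_tensor(band_segments):
    """band_segments[t][b][ch] is the sample list for time slice t,
    frequency band b, channel ch. Returns a nested list of shape
    (time, BANDS, H, W) with DE features placed on the scalp grid,
    preserving spatial, frequency, and temporal information."""
    tensor = []
    for slice_bands in band_segments:
        slice_maps = []
        for band in slice_bands:
            grid_map = [[0.0] * W for _ in range(H)]
            for ch, seg in enumerate(band):
                r, c = GRID[ch]
                grid_map[r][c] = differential_entropy(seg)
            slice_maps.append(grid_map)
        tensor.append(slice_maps)
    return tensor
```

Each time slice of the resulting tensor is what the attention and DSC modules would consume, while the slice ordering supplies the temporal axis for the LSTM.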