With the rapid advancement of network technology and the expanding scale of the Internet, network traffic data has grown exponentially, making network security issues increasingly prominent. Many researchers are developing new network traffic anomaly detection models, but these models struggle to capture the multi-scale temporal characteristics of network traffic data and to learn the correlation and relative importance of the feature dimensions. To address this, we propose MSTFN-AM, a multi-scale temporal feature network based on an attention mechanism. MSTFN-AM combines the original data with temporal information through a temporal position encoding module and extracts multi-scale temporal features through a temporal feature extraction module. A temporal self-attention module then identifies dependencies and correlations between different time points. Experiments on two public datasets show that MSTFN-AM outperforms all baseline models in prediction performance.
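To make the three stages named above more concrete, the following is a minimal NumPy sketch of how position encoding, multi-scale feature extraction, and self-attention over time steps could be chained. It is not the MSTFN-AM implementation: the abstract does not specify any of these details, so the sinusoidal encoding, the moving-average windows `(1, 3, 5)`, the single-head attention without learned projections, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal position encoding (assumed; the abstract does not name a scheme)."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])  # even feature indices
    pe[:, 1::2] = np.cos(angle[:, 1::2])  # odd feature indices
    return pe

def multi_scale_features(x, windows=(1, 3, 5)):
    """Moving averages at several (odd) window sizes, concatenated along features.

    A hypothetical stand-in for a multi-scale temporal feature extractor.
    """
    seq_len, _ = x.shape
    outs = []
    for w in windows:
        pad = w // 2
        padded = np.pad(x, ((pad, pad), (0, 0)), mode="edge")
        smoothed = np.stack([padded[t:t + w].mean(axis=0) for t in range(seq_len)])
        outs.append(smoothed)
    return np.concatenate(outs, axis=1)

def temporal_self_attention(x):
    """Single-head scaled dot-product attention across time steps.

    Learned query/key/value projections are omitted to keep the sketch short.
    """
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ x, weights

# Synthetic stand-in for a traffic window: 16 time steps, 8 features.
rng = np.random.default_rng(0)
traffic = rng.normal(size=(16, 8))

encoded = traffic + positional_encoding(16, 8)
features = multi_scale_features(encoded)
attended, weights = temporal_self_attention(features)
```

Each row of `weights` is a distribution over time steps, which is one way to read "dependencies and correlations between different time points": a large entry `weights[t, s]` means step `s` contributes strongly to the representation of step `t`.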