Abstract Aero-engine rolling bearings are essential to engine health: detecting their faults early can prevent disruptive failures and reduce major losses in flight operations. To improve the efficiency of fault detection, an improved network named CNN-BiLSTM-Cross-Attention (CBLCA) is proposed. The Bidirectional Long Short-Term Memory (BiLSTM) layer captures the temporal features of the input data. The cross-attention mechanism is integrated with the Convolutional Neural Network (CNN) layer and with the BiLSTM layer, which allows the CBLCA model to identify the more important feature information. The proposed model was validated on an open-source aero-engine rolling bearing dataset. To improve identification accuracy, a novel method that combines the Fast Fourier Transform (FFT) and Variational Mode Decomposition (VMD) is used for data preprocessing. Each original signal sample is transformed into a feature set containing richer information, which significantly increases the number of features in the entire dataset. Compared with the existing LSTM-based models LSTM, BiLSTM, CNN-BiLSTM, and CNN-LSTM, the classification accuracy increased by 55%, 54%, 5%, and 7%, respectively. The proposed vibration-signal processing method and the CBLCA model can improve the accuracy and reliability of fault diagnosis for aero-engine rolling bearings.
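The FFT half of the preprocessing step can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `fft_amplitude_spectrum`, the sampling rate, and the synthetic two-tone test signal are all assumptions made for illustration, and the VMD stage is not shown. The sketch only demonstrates how one vibration segment can be turned into a single-sided amplitude spectrum that could serve as frequency-domain features.

```python
import numpy as np


def fft_amplitude_spectrum(signal, fs):
    """Return (frequencies in Hz, single-sided amplitude spectrum).

    Hypothetical helper for illustration; `fs` is the sampling rate in Hz.
    """
    x = np.asarray(signal, dtype=float)
    n = x.size
    x = x - x.mean()                      # remove the DC offset
    amplitude = np.abs(np.fft.rfft(x)) / n
    # Double the bins that have a mirrored negative-frequency partner,
    # i.e. everything except DC (and the Nyquist bin when n is even).
    if n % 2 == 0:
        amplitude[1:-1] *= 2.0
    else:
        amplitude[1:] *= 2.0
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, amplitude


# Synthetic stand-in for a vibration segment (assumed values, not real data):
# a 50 Hz tone of amplitude 1.0 plus a 120 Hz tone of amplitude 0.5.
fs = 1000.0
t = np.arange(1000) / fs
segment = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

freqs, amp = fft_amplitude_spectrum(segment, fs)
peak_hz = freqs[np.argmax(amp)]           # dominant frequency component
```

In a setup like the one described, a spectrum of this kind would presumably be combined with the VMD modes of the same segment to form the enlarged feature set; how the two are combined is not specified here.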