Bearing fault diagnosis plays a crucial role in the safe operation of equipment, and considerable progress has been made in this field in recent years. For a fault diagnosis model, however, the representation of bearing fault features and the model's sensitivity to them strongly influence the diagnostic output, which makes the attention mechanism particularly important for feature selection. Global attention attends to the entire sequence, which is computationally expensive and not well suited to fault diagnosis tasks, whereas local attention ignores the relationships between non-adjacent parts of the sequence. To address these respective shortcomings, this paper proposes an adaptive sparse attention network that filters fault-sensitive information by soft thresholding. In addition, the effects of different signal representation domains on the diagnosis results are investigated in order to select the representation that performs best. Finally, the proposed network is applied to cross-working-condition diagnosis of bearings. The adaptive sparse attention mechanism focuses on the signal characteristics of different frequency bands for different fault types, and the proposed model achieves better overall performance in terms of cross-condition diagnosis accuracy and convergence speed.
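
As a rough illustration of how soft thresholding can sparsify attention, the following minimal sketch shrinks raw attention scores toward zero and renormalizes the survivors. It is not the network proposed here: the function names, the example scores, and the fixed threshold `tau` are all assumptions made for illustration, and an adaptive variant would presumably learn the threshold from the data rather than fix it by hand.

```python
import math


def soft_threshold(x, tau):
    """Standard soft-thresholding operator: sign(x) * max(|x| - tau, 0)."""
    return math.copysign(max(abs(x) - tau, 0.0), x)


def sparse_attention_weights(scores, tau):
    """Soft-threshold raw attention scores, then renormalize what survives.

    scores: hypothetical raw relevance scores, one per sequence position.
    tau:    threshold (fixed here; assumed to be learned in an adaptive model).
    """
    shrunk = [soft_threshold(s, tau) for s in scores]
    kept = [max(s, 0.0) for s in shrunk]  # keep only positive evidence
    total = sum(kept)
    if total == 0.0:
        # Every score was filtered out: fall back to uniform weights.
        return [1.0 / len(scores)] * len(scores)
    return [k / total for k in kept]


# Hypothetical scores for five positions; only two exceed the threshold.
scores = [0.9, 0.1, 0.05, 0.6, 0.2]
weights = sparse_attention_weights(scores, tau=0.25)
print(weights)
```

Positions whose scores fall below the threshold receive a weight of exactly zero, so the attention becomes sparse without being restricted to a local window; non-adjacent positions can still be attended to as long as their scores are large enough.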