Acoustic-based diagnosis (ABD) is a promising approach to fault detection in rotating machinery in real industrial settings because it relies on non-contact, air-coupled measurement. However, most ABD methods rely on a single functional module, either denoising or anti-noise diagnosis, and therefore lack sufficient robustness against the strong and highly non-stationary background noise encountered in practical industrial applications. To address this obstacle, this paper proposes a novel two-stage ABD system that performs denoising and anti-noise diagnosis step by step. In the first stage, the denoising sub-system adaptively tracks and suppresses real industrial background noise in the collected multi-channel signal using our recursive multi-head self-attention (RMHSA) mechanism, which takes the global and long-range information of the signal into account to recursively exclude the noise component at each position in the signal sequence. This overcomes the limitations of traditional methods, namely the difficulty of balancing denoising performance against computational complexity and the discontinuity they introduce into the diagnosis procedure. In the second stage, a novel anti-noise method automatically tracks the residual noise components that remain after the initial suppression and estimates, in a recursive manner, the noise interference probability within each time-frequency (T-F) unit of each signal sample using our spatial multi-head self-attention (SMHSA) mechanism. At the same time, within the anti-noise diagnosis sub-system, the anti-noise protocol interacts with the diagnosis model on the basis of the estimated probability to further improve the diagnosis performance of the ABD system under background noise interference. Experimental results under both real industrial noise conditions and simulated noise conditions with extremely low SNRs indicate that the proposed two-stage ABD system is effective for gear fault diagnosis under noisy conditions.
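To make the notion of a simulated noise condition at a given SNR concrete, the following minimal sketch scales a noise recording so that the mixture reaches a target SNR in decibels, using the standard definition SNR = 10 log10(P_signal / P_noise). It is an illustration only and is not taken from the paper: the function names `mix_at_snr` and `measured_snr_db`, the 200 Hz tone standing in for a machine sound, the 8 kHz sampling rate, and the stationary white Gaussian noise are all assumptions, and real industrial noise would be non-stationary.

```python
import math
import random


def mean_power(x):
    """Average power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(signal, noise, snr_db):
    """Return signal + gain * noise, with gain chosen to hit snr_db.

    Hypothetical helper for illustration; not the paper's procedure.
    """
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    p_signal = mean_power(signal)
    p_noise = mean_power(noise)
    if p_signal == 0.0 or p_noise == 0.0:
        raise ValueError("signal and noise must both have non-zero power")
    # SNR_dB = 10 * log10(P_signal / P_noise_scaled)
    target_noise_power = p_signal / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / p_noise)
    return [s + gain * n for s, n in zip(signal, noise)]


def measured_snr_db(signal, mixture):
    """SNR of a mixture, given the clean signal it was built from."""
    residual = [m - s for s, m in zip(signal, mixture)]
    return 10.0 * math.log10(mean_power(signal) / mean_power(residual))


# Assumed toy data: a 200 Hz tone sampled at 8 kHz plus white Gaussian noise.
rng = random.Random(0)
fs = 8000
clean = [math.sin(2.0 * math.pi * 200.0 * t / fs) for t in range(fs)]
noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]

# A negative SNR means the noise carries more power than the signal.
noisy = mix_at_snr(clean, noise, snr_db=-10.0)
```

Calling `measured_snr_db(clean, noisy)` recovers the requested value up to floating-point error, which is a convenient sanity check before feeding such mixtures to a denoising or diagnosis model.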