Abstract

Developing high-performance fault diagnosis models for specific tasks requires substantial expertise. Neural architecture search (NAS) offers a promising solution, but most NAS methods suffer from long search times and low efficiency, and few studies have applied them to fault diagnosis. This paper introduces a novel differentiable architecture search (DARTS) method for constructing efficient fault diagnosis models for rotating machinery, designed to search quickly and effectively for network models suited to a specific dataset. Specifically, a new search space is constructed that incorporates a variety of efficient, lightweight convolutional operations to reduce computational complexity. To stabilize the differentiable architecture search process and reduce fluctuations in model accuracy, a novel Multi-scale Pyramid Squeeze Attention (MPSA) module is proposed. This module helps the network learn richer multi-scale feature representations and adaptively recalibrates multi-dimensional channel attention weights. The proposed method was validated on two rotating machinery fault datasets, where it outperformed both manually designed networks and general-purpose architecture search methods in diagnostic performance.