Fault diagnosis plays an important role in improving the safety and reliability of complex equipment. Convolutional neural networks (CNNs) have been widely used for fault diagnosis because of their powerful feature extraction and learning capabilities. In practical industrial applications, the acquired signals are often corrupted by strong, highly non-stationary noise, so the temporal relationships within the signals deserve particular emphasis. However, most CNN-based fault diagnosis methods apply a pooling layer directly, which can easily corrupt these temporal relationships. More importantly, without an attention mechanism, it is difficult to extract deep informative features from noisy signals. To address these shortcomings, this paper proposes an intelligent fault diagnosis method based on an improved convolutional neural network (ICNN) model. Three innovations are developed. First, the receptive field is used as a guideline for designing the diagnosis network structure: the receptive field of the last layer is made close to the length of the original signal, which enables the network to fully learn each sample. Second, dilated convolution is adopted instead of standard convolution to capture larger-scale information while preserving the internal structure and temporal relationships of the signal during down-sampling. Third, an attention mechanism block named advanced convolution and channel calibration (ACCC) is presented to calibrate the feature channels, so that deep informative features receive larger weights while noise-related features are effectively suppressed. Finally, two experiments show that the ICNN-based fault diagnosis method can not only process signals with strong noise but also diagnose early and minor faults. Compared with other methods, it achieves the highest average accuracies of 94.78% and 90.26%, which are 6.53% and 7.70% higher than the CNN methods, respectively.
For bearing failures in complex machinery, this method can be used to better diagnose the type of fault; in voice calls, it can be used to better distinguish speech from noisy background sounds and thereby improve call quality.
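The receptive-field guideline and the role of dilated convolution described above can be illustrated with a minimal sketch. It is not the ICNN architecture itself: the layer configurations, the function names `receptive_field` and `dilated_conv1d`, and the example values are assumptions chosen for illustration only. The sketch uses the standard recurrence for one-dimensional convolution stacks, in which a layer with kernel size k and dilation d covers d(k - 1) + 1 input samples.

```python
def receptive_field(layers):
    """Receptive field (in input samples) of a stack of 1-D conv layers.

    `layers` is a list of (kernel_size, stride, dilation) tuples,
    ordered from the first layer to the last.
    """
    rf, jump = 1, 1  # jump = spacing of adjacent outputs, in input samples
    for kernel, stride, dilation in layers:
        effective_kernel = dilation * (kernel - 1) + 1
        rf += (effective_kernel - 1) * jump
        jump *= stride
    return rf


def dilated_conv1d(signal, weights, dilation):
    """Valid (unpadded) 1-D dilated cross-correlation on plain lists."""
    span = dilation * (len(weights) - 1)
    return [
        sum(w * signal[i + j * dilation] for j, w in enumerate(weights))
        for i in range(len(signal) - span)
    ]


# Hypothetical configurations: four layers, kernel size 3, stride 1.
standard_stack = [(3, 1, 1)] * 4
dilated_stack = [(3, 1, 1), (3, 1, 2), (3, 1, 4), (3, 1, 8)]

rf_standard = receptive_field(standard_stack)  # 9 samples
rf_dilated = receptive_field(dilated_stack)    # 31 samples

# A dilation of 2 makes a 3-tap kernel span 5 input samples.
response = dilated_conv1d([1, 2, 3, 4, 5], [1, 0, -1], dilation=2)
```

With the same number of layers and parameters, the dilated stack in this example covers 31 input samples instead of 9, and it does so without pooling or striding, so the sample order of the signal is left intact. In a design guided by the receptive field, the dilation rates would be chosen so that the value returned for the last layer approaches the length of the input signal.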