Rolling bearings are key transmission components in agricultural machinery, and the complex, harsh operating environment makes them highly prone to failure, so intelligent fault diagnosis of these components is critical. This paper proposes a new method based on SVD-EDS-GST and ResNet-Vision Transformer (ResViT) for the fault diagnosis of rolling bearings in agricultural machines. First, an experimental platform for rolling bearing failure in agricultural machinery is built, and one-dimensional vibration signals are acquired with acceleration sensors. Second, the signal is denoised using singular value decomposition (SVD) combined with the energy difference spectrum (EDS) to suppress the interference of complex noise and redundant components in the vibration signal. Third, the generalized S-transform (GST) is used to convert the vibration signals into images. Fourth, the ResViT model is proposed, in which a ResNet34 network replaces the image chunking mechanism of the original Vision Transformer (ViT) model for feature extraction. Finally, an improved ViT combines global and local information for fault classification. The experimental results show that the proposed method reaches an average accuracy of 99.08% in rolling bearing fault classification for agricultural machinery. Compared with SVD-EDS-GST-CNN, SVD-EDS-GST-LSTM, STFT-ViT, GST-ViT, and SVD-EDS-GST-ViT, the accuracy is higher by 3.5%, 3.84%, 4.8%, 8.02%, and 0.56%, respectively, and the standard deviation is the smallest.
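
As an illustration of the SVD-EDS denoising step described above, the following minimal Python sketch shows one common way such a scheme can be realized. It is not the exact procedure used in this work, and it rests on several assumptions the abstract does not specify: the signal is embedded in a Hankel matrix, the energy spectrum is taken as the normalized squared singular values, and the reconstruction order is chosen at the largest peak of the energy difference spectrum. The function names (`hankel_matrix`, `diagonal_average`, `svd_eds_denoise`) and the default row count are hypothetical.

```python
import numpy as np


def hankel_matrix(x, rows):
    """Embed a 1-D signal into a Hankel (trajectory) matrix with `rows` rows."""
    cols = len(x) - rows + 1
    idx = np.arange(rows)[:, None] + np.arange(cols)[None, :]
    return x[idx]


def diagonal_average(H):
    """Map a matrix back to a 1-D signal by averaging each anti-diagonal."""
    rows, cols = H.shape
    n = rows + cols - 1
    total = np.zeros(n)
    count = np.zeros(n)
    idx = np.arange(rows)[:, None] + np.arange(cols)[None, :]
    np.add.at(total, idx, H)    # sum the entries that map to the same sample
    np.add.at(count, idx, 1.0)  # count how many entries map to each sample
    return total / count


def svd_eds_denoise(x, rows=None):
    """Denoise `x` by truncated SVD, with the order picked from the EDS.

    Assumption: the order k is the position of the largest drop between
    consecutive normalized singular-value energies.
    """
    x = np.asarray(x, dtype=float)
    rows = rows or len(x) // 2  # assumed default: roughly square embedding
    H = hankel_matrix(x, rows)
    U, s, Vt = np.linalg.svd(H, full_matrices=False)

    energy = s**2 / np.sum(s**2)   # normalized energy of each component
    eds = energy[:-1] - energy[1:]  # energy difference spectrum
    k = int(np.argmax(eds)) + 1     # keep components before the largest drop

    H_k = (U[:, :k] * s[:k]) @ Vt[:k]  # rank-k approximation of H
    return diagonal_average(H_k), k
```

Calling `svd_eds_denoise(signal)` returns the reconstructed signal and the selected order `k`. In practice the row count and the peak-selection rule would have to be tuned to the bearing signals at hand; a noisy single-tone signal, for example, is typically reconstructed from its two dominant components.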