Accurate detection of electrical equipment states and faults is crucial for the reliable operation of the equipment and for the health of the overall power system. Deep learning-based visual inspection can monitor the state of power equipment effectively and provides essential information for diagnosing and predicting equipment failures. However, two significant challenges remain. First, electrical equipment typically operates in complex environments, so captured images contain environmental noise, which significantly reduces the accuracy of vision-based state recognition and, in turn, limits the comprehensiveness of the power system's situational awareness. Second, visual perception captures only the appearance characteristics of the equipment; without logical reasoning, purely visual analysis struggles to analyze and diagnose complex equipment states in depth. To address these two issues, we first design an image super-resolution reconstruction method based on a Generative Adversarial Network (GAN) to filter environmental noise. The pixel information is then analyzed with a deep learning-based method to obtain the spatial features of the equipment. Finally, by constructing a logic diagram for electrical equipment clusters, we propose an interpretable fault diagnosis method that integrates the spatial features and temporal states of the electrical equipment. To verify the effectiveness of the proposed method, extensive experiments are conducted on six datasets. The results demonstrate that the proposed method achieves high accuracy in diagnosing electrical equipment faults.