Abstract
Recent years have witnessed a surge in the development of intelligent fault diagnosis (IFD), mostly based on deep learning methods, offering increasingly accurate and autonomous solutions. However, these methods largely overlook interpretability: most are black-box models with unclear internal mechanisms, which reduces users’ confidence in the decision-making process. This is particularly problematic for critical decisions, where a lack of clarity regarding the diagnostic rationale poses substantial risks. To address these challenges, more reliable, transparent, and interpretable systems are urgently needed. Research on the interpretability of IFD has gained momentum and stands today as a vibrant area of study. To promote in-depth research and advance this field, a thorough examination of existing journal articles on interpretable fault diagnosis models is essential; such a review demystifies current technologies for readers and provides a foundation for future investigation. This article presents a systematic review of state-of-the-art interpretability research in IFD, categorizing recent scholarly work on interpretable models according to their methodologies and structural attributes. In addition, we discuss the challenges and future research directions for the interpretability of IFD.