Abstract

Anomaly detection has become a critical task in industry. Data-driven models are widely used for this purpose because they learn patterns from historical data and flag behaviors that deviate from those patterns. They are also simple to implement, as they do not rely on complex physical models to make predictions. However, a major limitation of these models is their lack of explainability, which hinders the diagnosis of detected anomalies.
Explainability provides transparency and interpretability, allowing stakeholders to understand the reasons for a detected deviation. Without it, it is challenging to determine why a particular instance was classified as abnormal, and without an understanding of the underlying cause, it becomes difficult to prescribe a reliable diagnostic. This can result in missed opportunities to prevent or mitigate the damage caused by the anomaly. Explainability also helps in detecting false positives and false negatives, especially in distinguishing abnormal behaviors from sensor failures.
Hydro-Quebec is the principal actor in electricity management in Quebec, Canada. The overwhelming majority of its production comes from hydroelectric generating units, so power grid sustainability strongly depends on the efficient health supervision of these assets. In this study, we introduce a data-driven semi-supervised algorithm for anomaly detection, with an emphasis on statistical explainability. This feature must be distinguished from traditional explainable models, which build upon physics to interpret observations; here, the purpose is to track the sources of deviations through statistics. The model is not a diagnosis tool in itself, because its output alone is not sufficient to find the root causes of a problem. However, it builds a bridge toward such tools by providing clues about the origin of failures.
The algorithm operates in two stages. First, a model is trained to learn the normal behavior of the generating unit for a given set of operating conditions; this stage involves clustering for data reduction and kriging for regression. Second, the algorithm compares the multidimensional prediction with the actual realization, quantifies the deviation of the asset from its expected behavior, and provides an explainable indicator for anomaly detection.
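As a concrete illustration, the sketch below implements the two stages with scikit-learn: k-means clustering for data reduction and Gaussian-process regression (a standard form of kriging) for the normal-behavior model, followed by a standardized residual as the deviation score. The variable names, kernel choice, cluster count, and scoring rule are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of the two-stage approach, under assumed names and parameters.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Stage 1: learn normal behavior under given operating conditions.
# X_train: operating conditions, Y_train: a monitored signal (normal data only).
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 1.0, size=(2000, 3))        # e.g. load, head, temperature
Y_train = np.sin(X_train @ np.array([2.0, 1.0, 0.5]))  # stand-in for a sensor signal
Y_train += 0.05 * rng.standard_normal(len(X_train))

# Clustering for data reduction: keep one representative point per cluster
# so that kriging remains tractable on large operating histories.
n_clusters = 100
km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X_train)
centers = km.cluster_centers_
# Average the response within each cluster as the reduced training target.
y_reduced = np.array([Y_train[km.labels_ == k].mean() for k in range(n_clusters)])

# Kriging (Gaussian-process regression) on the reduced data.
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(centers, y_reduced)

# Stage 2: compare the prediction with the actual realization and quantify
# the deviation as a standardized residual.
x_new = rng.uniform(0.0, 1.0, size=(1, 3))   # current operating conditions
y_obs = 1.5                                  # hypothetical new measurement
y_pred, y_std = gp.predict(x_new, return_std=True)
score = abs(y_obs - y_pred[0]) / y_std[0]
print(f"deviation score: {score:.2f} (large values flag an anomaly)")
```

In this sketch the score is computed per monitored signal, so a large value points directly at the measurement driving the deviation; in a multidimensional setting, one such score per signal yields the kind of interpretable indicator the abstract describes.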
After introducing the theoretical foundations of the method, we present examples that demonstrate the advantage of interpretability for operational support and diagnosis. We also show how such an algorithm can be deployed in an operational environment and how it should be combined with other tools to improve asset health management.
