Abstract

Context: Given the data-driven paradigm inherent to Deep Learning (DL), it is inevitable that DL software will exhibit incorrect behavior in real-world applications. DL programs have been identified as a primary source of DL faults. To tackle this, researchers have devised a framework that treats fault diagnosis as a learning task, leveraging runtime data as metrics to construct predictive models for effective fault diagnosis. Objective: In this paper, we propose new metrics, particularly from the coverage perspective, to enhance the performance of fault diagnosis models. Method: We combine coverage criteria with statistical operators to propose 80 coverage metrics, which summarize the trend of coverage values during model training. We construct hybrid prediction models by combining our new coverage metrics with existing runtime metrics under four widely used classifiers. Results: To examine whether adding our new coverage metrics improves DL program fault diagnosis, we conduct experiments on six widely used datasets under four indicators (i.e., accuracy, F1 score, AUC, and MCC). Through the experiments, we observe that (a) the coverage metrics are not redundant with respect to the original runtime metrics, and (b) adding the extra coverage metrics significantly enhances the performance of fault diagnosis models. Conclusions: Our study shows that the proposed coverage metrics help construct effective fault diagnosis models for DL programs.
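To make the idea of "coverage criteria combined with statistical operators" concrete, the sketch below shows one plausible construction: a simple neuron-coverage criterion computed once per training epoch, with the resulting per-epoch series summarized by a handful of statistical operators (mean, std, min, max, last value, slope) to yield trend features. The specific coverage definition, operator set, and threshold here are illustrative assumptions, not the paper's exact 80-metric design.

```python
import numpy as np

def neuron_coverage(activations, threshold=0.0):
    """Fraction of neurons activated above `threshold` on at least one
    input (a common neuron-coverage definition; the paper's exact
    criteria may differ)."""
    # activations: (num_inputs, num_neurons) post-activation values
    activated = (activations > threshold).any(axis=0)
    return float(activated.mean())

def summarize_trend(per_epoch_values):
    """Summarize a per-epoch coverage series with simple statistical
    operators, producing one trend feature per operator."""
    v = np.asarray(per_epoch_values, dtype=float)
    epochs = np.arange(len(v))
    # least-squares slope captures whether coverage rises or falls
    slope = float(np.polyfit(epochs, v, 1)[0]) if len(v) > 1 else 0.0
    return {
        "mean": float(v.mean()),
        "std": float(v.std()),
        "min": float(v.min()),
        "max": float(v.max()),
        "last": float(v[-1]),
        "slope": slope,
    }

# toy usage: turn a 3-epoch coverage series into trend features
features = summarize_trend([0.1, 0.2, 0.3])
```

Features produced this way could then be concatenated with existing runtime metrics and fed to standard classifiers (e.g., scikit-learn estimators) as the hybrid fault-diagnosis model the abstract describes.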

