Abstract

Fault localization is tedious and costly work during software maintenance. Studies indicate that combining the structural features and behavioral characteristics of a program can improve the efficiency of fault localization. In this paper, we propose a framework, called Mulr4FL, for fault localization using a multivariate logistic regression model that combines static and dynamic features collected from the program under debugging. First, a hybrid metrics data set, combining program structural features and behavioral characteristics, is constructed through static program analysis and dynamic tracing of program runs against a designed metric set. Meanwhile, the fault information of the legacy program is obtained from the bug tracking system. Second, bivariate logistic analysis is performed to select the metrics that are significantly related to faults, and a multivariate logistic regression model is then constructed and trained with the selected metrics and their measurements. Finally, based on the trained logistic model, we conduct multivariate logistic analysis on the features of the evolved software and predict the buggy class methods. An empirical study was conducted on a set of benchmarks that are widely used in program debugging research. The results indicate that Mulr4FL can significantly improve the effectiveness of fault localization compared with five baseline techniques.

Highlights

  • Software is a complex artifact that often undergoes multiple version evolutions during its life cycle due to changes in requirements or in the software operating environment

  • With the measured structural features and the detected fault details of the old version of the program, a bivariate logistic regression analysis is applied to select the metrics that are significantly related to faults

  • We propose a novel framework, Mulr4FL, for locating source code faults based on multivariate logistic regression analysis


Summary

INTRODUCTION

Software is a complex artifact that often undergoes multiple version evolutions during its life cycle due to changes in requirements or in the software operating environment. We hypothesize that it is beneficial to combine measurements of both static and dynamic information of the program and to build an effective multivariate logistic model for fault localization. The multivariate logistic regression model can combine different behaviors of the software to predict fault-prone entities and improve the performance of fault localization. We propose an efficient fault localization framework, called Mulr4FL, that uses a multivariate logistic regression model in the maintenance of evolving software. With the measured structural features and the detected fault details of the old version of the program, a bivariate logistic regression analysis is applied to select the metrics that are significantly related to faults. Mulr4FL is a three-stage framework for fault localization at the class method level that applies the multivariate logistic regression model.
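The three stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the metric names (cyclomatic, exec_count, parity), the method names, the synthetic data, the 0.05 significance threshold, the likelihood-ratio test used for the bivariate filter, and the plain gradient-ascent fit are all assumptions chosen so that the example runs with only the Python standard library.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_scaler(values):
    # Mean and population standard deviation (falls back to 1.0 if constant).
    m = sum(values) / len(values)
    s = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return m, (s if s > 0 else 1.0)

def predict(w, row):
    # w[0] is the intercept, w[1:] are the metric coefficients.
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], row)))

def fit_logistic(X, y, lr=0.5, epochs=1500):
    # Batch gradient ascent on the mean log-likelihood (assumed optimizer).
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, label in zip(X, y):
            err = label - predict(w, row)
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wi + lr * g / n for wi, g in zip(w, grad)]
    return w

def log_likelihood(probs, y):
    eps = 1e-12
    return sum(math.log(max(p if t else 1.0 - p, eps)) for p, t in zip(probs, y))

def lr_test_p_value(x, y):
    # Likelihood-ratio test: one-metric model versus intercept-only model.
    base = sum(y) / len(y)
    ll0 = log_likelihood([base] * len(y), y)
    X = [[v] for v in x]
    w = fit_logistic(X, y)
    ll1 = log_likelihood([predict(w, r) for r in X], y)
    stat = max(0.0, 2.0 * (ll1 - ll0))
    return math.erfc(math.sqrt(stat / 2.0))  # chi-square tail, 1 degree of freedom

# Stage 1 (hypothetical data): metrics of 40 methods in the old version.
ids = range(40)
old_metrics = {
    "cyclomatic": [float(i) for i in ids],              # static metric
    "exec_count": [float(i // 4 + i % 3) for i in ids],  # dynamic metric
    "parity": [float(i % 2) for i in ids],               # unrelated to faults
}
faulty = [1 if i >= 20 else 0 for i in ids]
faulty[17], faulty[23] = 1, 0  # two exceptions so the classes overlap

# Stage 2: keep only metrics significantly related to faults.
scalers = {m: fit_scaler(v) for m, v in old_metrics.items()}
scaled = {m: [(v - scalers[m][0]) / scalers[m][1] for v in vals]
          for m, vals in old_metrics.items()}
p_values = {m: lr_test_p_value(col, faulty) for m, col in scaled.items()}
selected = [m for m in old_metrics if p_values[m] < 0.05]

# Stage 3: train the multivariate model, then rank methods of the new version.
X_train = [[scaled[m][i] for m in selected] for i in range(len(faulty))]
weights = fit_logistic(X_train, faulty)
new_methods = {  # hypothetical methods of the evolved program
    "Parser.parse": {"cyclomatic": 35.0, "exec_count": 10.0, "parity": 1.0},
    "Util.trim": {"cyclomatic": 5.0, "exec_count": 3.0, "parity": 1.0},
}
scores = {
    name: predict(weights, [(vals[m] - scalers[m][0]) / scalers[m][1] for m in selected])
    for name, vals in new_methods.items()
}
ranking = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
for name, prob in ranking:
    print(f"{name}: {prob:.3f}")
```

The ranked list plays the role of the predicted buggy class methods: methods with the highest estimated fault probability would be inspected first. In practice a statistics library would likely replace the hand-written fit and test.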

PRELIMINARIES
MODEL CONSTRUCTING
EXPERIMENTAL STUDY
STUDIED PROJECTS
THREATS TO VALIDITY
RELATED WORK
Findings
CONCLUSION AND FUTURE WORK