Abstract

Defect prediction models are proposed to help a team prioritize the areas of source code files that need Software Quality Assurance (SQA) based on their likelihood of having defects. However, developers may waste effort inspecting a whole file when only a small fraction of its source code lines are defective. Indeed, we find that as little as 1-3 percent of the lines of a file are defective. Hence, in this work, we propose a novel framework (called Line-DP) to identify defective lines using a model-agnostic technique, i.e., an Explainable AI technique that provides information about why the model makes a given prediction. Broadly speaking, Line-DP first builds a file-level defect model using code token features. Then, Line-DP uses a state-of-the-art model-agnostic technique (i.e., LIME) to identify risky tokens, i.e., code tokens that lead the file-level defect model to predict that the file will be defective. Finally, the lines that contain risky tokens are predicted as defective lines. Through a case study of 32 releases of nine Java open source systems, our evaluation results show that Line-DP achieves an average recall of 0.61, a false alarm rate of 0.47, a top 20%LOC recall of 0.27, and an initial false alarm of 16, which are statistically better than six baseline approaches. Our evaluation shows that Line-DP requires an average computation time of 10 seconds, including model construction and defective line identification time.
In addition, we find that 63 percent of the defective lines that can be identified by Line-DP are related to common defects (e.g., argument change, condition change). These results suggest that Line-DP can effectively identify defective lines that contain common defects while requiring a smaller amount of inspection effort and a manageable computation cost. This paper contributes an important step towards line-level defect prediction by leveraging a model-agnostic technique.
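
The last step of the pipeline described above, mapping risky tokens to lines and scoring the result, can be illustrated with a minimal sketch. It is not the authors' implementation. The token scores are hard-coded, hypothetical values standing in for the output of a model-agnostic explainer such as LIME. The function names, the `top_k` cutoff, and the simple identifier tokenizer are assumptions made for illustration. Recall and false alarm rate use their standard definitions, TP / (TP + FN) and FP / (FP + TN), computed over lines.

```python
import re
from typing import Dict, Iterable, List, Set, Tuple

# Assumption: a simple identifier tokenizer; the paper's tokenizer may differ.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def risky_tokens(token_scores: Dict[str, float], top_k: int) -> Set[str]:
    """Keep the top_k tokens with a positive score.

    A positive score is assumed to mean the token pushes the file-level
    model towards the 'defective' prediction.
    """
    positive = [(tok, s) for tok, s in token_scores.items() if s > 0]
    positive.sort(key=lambda pair: pair[1], reverse=True)
    return {tok for tok, _ in positive[:top_k]}


def flag_lines(source_lines: List[str], risky: Set[str]) -> List[int]:
    """Return 1-based numbers of lines containing at least one risky token."""
    flagged = []
    for number, line in enumerate(source_lines, start=1):
        if risky & set(TOKEN_RE.findall(line)):
            flagged.append(number)
    return flagged


def recall_and_false_alarm(
    flagged: Iterable[int], defective: Iterable[int], total_lines: int
) -> Tuple[float, float]:
    """Line-level recall and false alarm rate (standard definitions)."""
    flagged_set, defective_set = set(flagged), set(defective)
    tp = len(flagged_set & defective_set)
    fn = len(defective_set - flagged_set)
    fp = len(flagged_set - defective_set)
    tn = total_lines - tp - fn - fp
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, false_alarm


# Hypothetical Java snippet; line 2 holds an off-by-one defect.
source = [
    "int total = 0;",
    "for (int i = 0; i <= items.size(); i++) {",
    "    total += items.get(i).price();",
    "}",
    "return total;",
]
# Hypothetical explainer output: token -> contribution to 'defective'.
scores = {"size": 0.31, "get": 0.12, "total": -0.05, "price": 0.02}

risky = risky_tokens(scores, top_k=2)
flagged = flag_lines(source, risky)
recall, false_alarm = recall_and_false_alarm(
    flagged, defective=[2], total_lines=len(source)
)
print(flagged, recall, false_alarm)
```

In this toy input the defective line is flagged along with one clean line, which shows how a line-level approach can trade some false alarms for less inspection effort than reviewing the whole file.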

Highlights

  • Software Quality Assurance (SQA) is one of the software engineering practices for ensuring the quality of a software product [26]

  • We propose a novel line-level defect prediction framework which leverages a model-agnostic technique to predict defective lines, i.e., the source code lines that will be changed by bug-fixing commits to fix post-release defects

  • This result suggests that, compared with the traditional approach of predicting defects at the file level, Line-DP could potentially help developers avoid the Software Quality Assurance (SQA) effort that would be spent on 52% of clean lines, while 62% of defective lines would still be examined

Summary

Introduction

Software Quality Assurance (SQA) is one of the software engineering practices for ensuring the quality of a software product [26]. When changed files from the cutting-edge development branches are merged into the release branch, where quality is strictly controlled, an SQA team needs to carefully analyze and identify software defects in those changed files [1]. Defect prediction models are proposed to help SQA teams prioritize their effort by analyzing post-release software defects that occurred in the previous release [16, 26, 55, 59, 77, 80]. SQA is embedded as a quality culture throughout the life cycle, from planning and development to release preparation, so that teams can follow best practices to prevent software defects.
