Locating bugs without looking back

Tezcan Dilshener, Michel Wermelinger, Yijun Yu

doi:10.1007/s10515-017-0226-1


Open Access

https://doi.org/10.1007/s10515-017-0226-1


Journal: Automated Software Engineering
Publication Date: Oct 10, 2017
Citations: 11
License type: open-access

Affiliation: The Open University

Abstract

Bug localisation is a core program comprehension task in software maintenance: given the observation of a bug, e.g. via a bug report, where is it located in the source code? Information retrieval (IR) approaches see the bug report as the query, and the source code files as the documents to be retrieved, ranked by relevance. Such approaches have the advantage of not requiring expensive static or dynamic analysis of the code. However, current state-of-the-art IR approaches rely on project history, in particular previously fixed bugs or previous versions of the source code. We present a novel approach that directly scores each current file against the given report, thus not requiring past code and reports. The scoring method is based on heuristics identified through manual inspection of a small sample of bug reports. We compare our approach to eight others, using their own five metrics on their own six open source projects. Out of 30 performance indicators, we improve 27 and equal 2. Over the projects analysed, on average we find one or more affected files in the top 10 ranked files for 76% of the bug reports. These results show the applicability of our approach to software projects without history.
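As an illustration of the IR framing described in the abstract (bug report as query, source files as documents, ranked by relevance), here is a minimal sketch. It is not the authors' scoring method: it swaps in a plain TF-IDF-style term overlap, and the function names `tokenize`, `rank_files` and `top_n_hit`, the file contents and the report text are all hypothetical. The last function mirrors the "one or more affected files in the top N" style of metric mentioned above.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split camelCase identifiers, lowercase, and keep alphabetic terms longer than 2 letters."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)  # "FileReader" -> "File Reader"
    return [t for t in re.split(r"[^a-z]+", spaced.lower()) if len(t) > 2]


def rank_files(report, files):
    """Rank file names by a simple TF-IDF-style score against the report (illustrative only)."""
    docs = {name: Counter(tokenize(src)) for name, src in files.items()}
    n = len(docs)
    # Document frequency: in how many files each term occurs.
    df = Counter(term for counts in docs.values() for term in counts)
    query = set(tokenize(report))
    scores = {}
    for name, counts in docs.items():
        length = sum(counts.values()) or 1
        scores[name] = sum(
            (counts[t] / length) * math.log(1 + n / df[t])
            for t in query
            if t in counts
        )
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda name: (-scores[name], name))


def top_n_hit(ranking, affected_files, n=10):
    """True if at least one affected file appears among the first n ranked files."""
    return any(name in affected_files for name in ranking[:n])


# Hypothetical project snapshot and bug report.
files = {
    "FileReader.java": "public class FileReader { void readFile(String path) { open(path); } }",
    "Parser.java": "public class Parser { void parse(String input) { } }",
    "Logger.java": "public class Logger { void log(String message) { } }",
}
report = "Crash when reading a file: FileReader fails to open the path"

ranking = rank_files(report, files)
hit = top_n_hit(ranking, {"FileReader.java"}, n=1)
```

Only the current files and the report are needed here, which is the property the abstract emphasises; no previously fixed bugs or earlier code versions enter the computation.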

Highlights

Current software applications are valuable strategic assets to companies: they play a central role for the business and require continued maintenance.
An evaluation of the approach on only 155 bug reports containing a stack trace reveals that 65% of the bugs were fixed in a file mentioned in the stack trace (ST) submitted in the bug report (BR).
The approach is tested on two open source software (OSS) projects (Tomcat and Birt) from the LtR dataset (Ye et al. 2014); the results reveal that Distributed REpresentation of Words based Bug Localization (DrewBL) by itself performs substantially worse than the combined approach, which is still worse than LtR.
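The stack-trace highlight above relies on recognising file names inside a bug report. A minimal sketch of that step, assuming Java-style stack frames of the form `(FileName.java:123)`, could look as follows. The regular expression, the function name `files_in_stack_trace` and the sample report (including its `org.example` packages) are hypothetical and not taken from the paper.

```python
import re

# Matches the "(FileName.java:123)" part of a Java stack frame.
FRAME = re.compile(r"\(([A-Za-z_$][\w$]*\.java):\d+\)")


def files_in_stack_trace(report_text):
    """Return distinct file names found in stack frames, in order of first appearance."""
    seen = []
    for name in FRAME.findall(report_text):
        if name not in seen:
            seen.append(name)
    return seen


# Hypothetical bug report containing a stack trace.
report_text = """NullPointerException when saving
java.lang.NullPointerException
    at org.example.io.Writer.flush(Writer.java:88)
    at org.example.io.Writer.close(Writer.java:102)
    at org.example.app.Main.run(Main.java:17)
"""

mentioned = files_in_stack_trace(report_text)
```

Files found this way can then be checked against the files actually changed to fix the bug, which is the comparison behind the 65% figure quoted above.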

Summary

Introduction

Current software applications are valuable strategic assets to companies: they play a central role for the business and require continued maintenance. When certain components of the application do not perform according to their predefined functionality, they are classified as being in error. These unexpected and unintended erroneous behaviours, referred to as bugs, are often the product of coding mistakes. Upon discovering such abnormal behaviour of the software, a developer or a user reports it in a document referred to as a bug report (BR). BR documents may provide information that could help in fixing the bug by changing the relevant program elements of the application. The change request is expressed as a BR, and the end goal is to change the existing program elements (e.g. source code files) to correct an undesired behaviour of the software.
