Abstract
ContextSoftware systems are often shipped with defects. Whenever a bug is reported, developers use the information available in the associated report to locate source code fragments that need to be modified in order to fix the bug. However, as software systems evolve in size and complexity, bug localization can become a tedious and time-consuming process. To minimize the manual effort, contemporary bug localization tools utilize Information Retrieval (IR) methods for automated support. IR methods exploit the textual content of bug reports to automatically capture and rank relevant buggy source files. ObjectiveIn this paper, we propose a new paradigm of information-theoretic IR methods to support bug localization tasks in software systems. These methods, including Pointwise Mutual Information (PMI) and Normalized Google Distance (NGD), exploit the co-occurrence patterns of code terms in the software system to reveal hidden textual semantic dimensions that other methods often fail to capture. Our objective is establish accurate semantic similarity relations between source code and bug reports. MethodFive benchmark datasets from different application domains are used to conduct our analysis. The proposed methods are compared against classical IR methods that are commonly used in bug localization research. ResultsThe results show that information-theoretic IR methods significantly outperform classical IR methods, providing a semantically aware, yet, computationally efficient solution for bug localization in large and complex software systems. (A replication package is available at: http://seel.cse.lsu.edu/data/ist17.zip). ConclusionsInformation-theoretic co-occurrence methods provide “just enough semantics” necessary to establish relations between bug reports and code artifacts, achieving a balance between simple lexical methods and computationally-expensive semantic IR methods that require substantial amounts of data to function properly.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.