The forgotten role of search queries in IR-based bug localization: an empirical study

Mohammad Masudur Rahman, Foutse Khomh, Shamima Yeasmin, Chanchal K Roy

doi:10.1007/s10664-021-10022-4

Abstract

Being lightweight and cost-effective, IR-based approaches for bug localization have shown promise in finding software bugs. However, the accuracy of these approaches depends heavily on the bug reports they use. A significant number of bug reports contain only plain natural language text. According to existing studies, IR-based approaches cannot perform well when they use these bug reports as search queries. On the other hand, recent evidence suggests that even these natural language-only reports contain enough good keywords to localize the bugs successfully. These findings suggest that natural language-only bug reports might be a sufficient source of good query keywords, but they also cast serious doubt on the query selection practices in IR-based bug localization. In this article, we attempt to clarify this issue by conducting an in-depth empirical study that critically examines the state-of-the-art query selection practices in IR-based bug localization. In particular, we use a dataset of 2,320 bug reports, employ ten existing approaches from the literature, exploit a Genetic Algorithm-based approach to construct optimal and near-optimal search queries from these bug reports, and then answer three research questions. We confirm that the state-of-the-art query construction approaches are indeed not sufficient for constructing appropriate queries (for bug localization) from certain natural language-only bug reports. However, these bug reports do contain high-quality search keywords in their texts even though they might not contain explicit hints for localizing bugs (e.g., stack traces). We also demonstrate that optimal and non-optimal queries chosen from bug report texts differ significantly in several keyword characteristics (e.g., frequency, entropy, position, part of speech). Such an analysis has led us to four actionable insights on how to choose appropriate keywords from a bug report. Furthermore, we demonstrate a 27%–34% improvement in the performance of non-optimal queries through the application of our actionable insights to them. Finally, we summarize our study findings with future research directions (e.g., machine intelligence in keyword selection).
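To make the idea of searching a bug report for a near-optimal query more concrete, the following is a minimal sketch of a genetic algorithm over keyword subsets. It is not the authors' implementation: the term-overlap retrieval score, the reciprocal-rank fitness, the function names (`rank_of_target`, `fitness`, `evolve_query`), the parameter values, and the toy corpus are all assumptions made for illustration.

```python
import random


def rank_of_target(query, docs, target):
    """1-based rank of `target` under a simple term-overlap score.

    Assumption: a toy retrieval model (sum of term counts), not a real IR engine.
    Ties are resolved pessimistically, so the target ranks below equal scores.
    """
    scores = {
        name: sum(text.split().count(term) for term in query)
        for name, text in docs.items()
    }
    better_or_equal = sum(
        1 for name in docs if name != target and scores[name] >= scores[target]
    )
    return better_or_equal + 1


def fitness(mask, keywords, docs, target):
    """Reciprocal rank of the known buggy file for the query encoded by `mask`."""
    query = [k for k, bit in zip(keywords, mask) if bit]
    if not query:
        return 0.0
    return 1.0 / rank_of_target(query, docs, target)


def evolve_query(keywords, docs, target, pop_size=20, generations=30, seed=0):
    """Evolve a bit mask over `keywords` (needs at least two keywords)."""
    rng = random.Random(seed)
    n = len(keywords)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half unchanged (elitism) and refill with children.
        pop.sort(key=lambda m: fitness(m, keywords, docs, target), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)      # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:         # occasional single-bit mutation
                i = rng.randrange(n)
                child[i] = 1 - child[i]
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda m: fitness(m, keywords, docs, target))
    return [k for k, bit in zip(keywords, best) if bit]


# Hypothetical toy corpus: file name -> bag of words from the source file.
docs = {
    "Parser.java": "parser token parse syntax error token stream",
    "Renderer.java": "render window paint screen error crash",
    "Cache.java": "cache entry evict memory crash error",
}
# Hypothetical keywords extracted from a natural language-only bug report.
keywords = ["crash", "error", "token", "window", "syntax", "screen"]

best_query = evolve_query(keywords, docs, target="Parser.java")
print(best_query, rank_of_target(best_query, docs, "Parser.java"))
```

Note that such a search relies on knowing the buggy file in advance, so it can only characterize what a good query looks like after the fact; it is not a way to localize a new bug.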

Journal: Empirical Software Engineering	Publication Date: Aug 23, 2021
Citations: 7

Similar Papers

Structured information in bug report descriptions—influence on IR-based bug localization and developers
Michael Rath ... Patrick Mäder
Software Quality Journal | VOL. 27
08 May 2019

Improving bug localization with report quality dynamics and query reformulation
Mohammad Masudur Rahman ... Chanchal K Roy
27 May 2018

How Does Execution Information Help with Information-Retrieval Based Bug Localization?
Tung Dao ... Lingming Zhang
01 May 2017

Comparing learning to rank techniques in hybrid bug localization
Zhendong Shi ... Xingjun Zhang
Applied Soft Computing | VOL. 62
08 Nov 2017
