Abstract

As software gains more users, programmers rely on issue tracking systems to manage bug reports, and researchers have proposed bug-triage approaches that assign bug reports to programmers. Programmers often assign bug reports according to their descriptions. Based on this observation, prior approaches typically use classic natural language processing (NLP) techniques to analyze bug reports. Although this technical choice is straightforward, its true effectiveness is largely unknown. Taking a state-of-the-art approach as an example, we explore the impact of textual features in bug triage. By enabling and disabling the textual features of this approach, we analyze their impact on assigning thousands of bug reports from six widely used open source projects. Our results show that textual features in fact reduce effectiveness rather than improve it. In particular, after we turn off its textual features, the f-scores of the baseline approach improve by 8%. After manual inspection, we find two reasons for this result: (1) classic NLP techniques are insufficient to analyze bug reports, because bug reports are not pure natural language texts and contain other elements (e.g., code samples); and (2) some bug reports are poorly written. Our findings reveal a strong need and ample opportunities to explore more advanced techniques to handle these complicated elements in bug reports.

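For readers unfamiliar with the setup, the following minimal sketch illustrates the kind of classic-NLP bug-triage pipeline the abstract refers to: bug-report descriptions are turned into TF-IDF textual features and a standard classifier assigns each report to a developer. The toy reports, developer names, and the use of scikit-learn are illustrative assumptions, not the approach evaluated in the paper.

# Minimal sketch of a classic-NLP bug-triage pipeline (illustrative assumption,
# not the paper's evaluated approach): TF-IDF over report text + a linear classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical bug reports (free-text descriptions) and the developers who fixed them.
reports = [
    "NullPointerException when opening the settings dialog",
    "UI freezes after resizing the main window",
    "Crash in parser when input file contains BOM",
    "Settings dialog loses focus on second monitor",
]
developers = ["alice", "bob", "carol", "alice"]

# Textual features: TF-IDF vectors built from the report descriptions.
triager = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
triager.fit(reports, developers)

# Assign a new report to a developer based on its description alone.
print(triager.predict(["settings dialog crashes with NullPointerException"]))

In such a pipeline, "disabling textual features" (as the study does) amounts to withholding the description text and relying only on non-textual attributes of the report, which is what makes the reported 8% f-score difference meaningful.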