Abstract

Software bug analysis based on information retrieval (IR) is widely studied and used for bug understanding, localization, and fixing. IR-based approaches use various textual feature extraction methods to formulate the textual information in a new bug report (i.e., its title and description) as an initial query. However, because new bug reports often contain low-quality content and are poorly represented as queries, the retrieval results are usually unsatisfactory. To alleviate these problems, we propose a novel knowledge-aware bug report reformulation approach (KABR) that leverages multi-level embeddings from the bug data. First, we construct a bug-specific knowledge graph (KG) to manage and reuse prior knowledge extracted from historical bug reports. Then, we extract word embeddings from the original bug data, and entity embeddings and context embeddings from the bug-specific KG, to enhance the initial query. Finally, a new query representation is generated from the multi-level embeddings by a Convolutional Neural Network (CNN) with a self-attention mechanism. We evaluate KABR on the duplicate bug report detection task, and the experimental results show that KABR achieves a 6%–11% improvement in F1-measure over state-of-the-art approaches.
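
The idea of fusing several embedding levels into one query representation and then comparing reports can be illustrated with a minimal sketch. This is not the KABR implementation: the CNN is omitted, the fusion is plain scaled dot-product self-attention followed by mean pooling, and the function names (`attention_fuse`, `is_duplicate`), the toy vectors, and the 0.9 threshold are all hypothetical choices made for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(levels):
    """Fuse embedding levels (e.g. word, entity, context) into one vector.

    Each level attends to all levels via scaled dot-product self-attention;
    the attended vectors are then mean-pooled. Assumption: all levels share
    the same dimensionality.
    """
    dim = len(levels[0])
    scale = math.sqrt(dim)
    attended = []
    for query in levels:
        weights = softmax([dot(query, key) / scale for key in levels])
        attended.append(
            [sum(w * value[i] for w, value in zip(weights, levels)) for i in range(dim)]
        )
    return [sum(vec[i] for vec in attended) / len(attended) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    return dot(a, b) / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_duplicate(query_levels, candidate_levels, threshold=0.9):
    """Flag two reports as duplicates if their fused vectors are similar enough.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    fused_query = attention_fuse(query_levels)
    fused_candidate = attention_fuse(candidate_levels)
    return cosine(fused_query, fused_candidate) >= threshold


# Toy example: three embedding levels per report, two dimensions each.
report_a = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2]]
report_b = [[0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
print(is_duplicate(report_a, report_a))
print(is_duplicate(report_a, report_b))
```

In a real system the per-level vectors would come from trained word and knowledge-graph embedding models, and the decision would typically be learned rather than set by a fixed similarity threshold.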
