Abstract

Handling bug reports is an important part of software maintenance. Detecting duplicate bug reports has recently received much attention, for two main reasons. First, processing redundant reports wastes human resources. Second, duplicate reports may provide abundant information for further software maintenance. Previous studies have proposed many detection schemes based on information retrieval and natural language processing techniques. In this thesis, we propose a novel detection scheme based on the BM25 term weighting scheme. We conducted empirical experiments on three open source projects: Apache, ArgoUML, and SVN. The experimental results show that the BM25-based scheme effectively improves detection performance in nearly all cases.
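
For readers unfamiliar with BM25, the following is a minimal sketch of how a new bug report could be scored against existing reports using the standard Okapi BM25 formula. It is not the scheme proposed in the thesis, whose details are not given in this abstract. The function names, the whitespace tokenizer, the non-negative IDF variant, and the parameter values k1 = 1.2 and b = 0.75 are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real system would
    # likely also apply stemming and stop-word removal.
    return text.lower().split()


def bm25_scores(query, reports, k1=1.2, b=0.75):
    """Score each existing report against a new report (the query).

    Uses the standard Okapi BM25 formula with a smoothed, non-negative IDF.
    k1 and b are commonly used defaults, not values taken from the thesis.
    """
    docs = [tokenize(r) for r in reports]
    n = len(docs)
    if n == 0:
        return []
    avgdl = sum(len(d) for d in docs) / n

    # Document frequency: number of reports containing each term.
    df = Counter()
    for d in docs:
        df.update(set(d))

    q_terms = set(tokenize(query))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in q_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
            score += idf * norm
        scores.append(score)
    return scores


# Hypothetical example reports, invented for illustration only.
existing = [
    "crash when saving file to disk",
    "toolbar icons look blurry on startup",
    "application crash on saving a large file",
]
scores = bm25_scores("crash while saving file", existing)
# Rank existing reports from most to least similar to the new report.
ranking = sorted(range(len(existing)), key=lambda i: scores[i], reverse=True)
```

A report with a high score shares many distinctive terms with the new report and would be shown to a triager as a candidate duplicate. Reports that share no terms with the query score zero.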
