Abstract

Handling duplicate bug reports in a software bug repository is a tedious and time-consuming task. The effort of identifying duplicates can delay the development and deployment of software, and it is a major problem to be addressed in the development of open-source software. Duplicate bug report identification has been one of the popular research topics in mining software repositories over the last decade. Timely identification of duplicate bug reports is essential for faster software development. Most software bug repositories are managed online, so retrieving bug reports and finding duplicates is a challenging task. Researchers have recently proposed numerous solutions to the problem of identifying duplicate bug reports in a software bug repository, and the majority of the presented works use machine learning techniques effectively. The key information in a software bug report is textual. The important textual attributes are the summary and the description. The summary, also known as the title of a bug, gives condensed information about the bug, whereas the description gives detailed information, including the steps by which the bug can be reproduced. By analyzing these two attributes alone, one can identify whether a bug report duplicates an earlier reported bug. In a few exceptional works, only one attribute, the summary (or title), is considered for detecting duplicates in bug repositories. Because both key attributes are textual, text mining and natural language processing play a major role in identifying duplicate bug reports. This chapter presents a systematic literature review of existing techniques for identifying and managing duplicate bug reports in a software bug repository. Findings on existing machine learning and other advanced techniques for duplicate bug report identification are summarized, and challenges and future scope in duplicate bug report identification are also discussed.
