Abstract
This work computes the similarity between sentences of software bug reports in order to summarize them. Two methods are used: Latent Semantic Analysis (LSA) and TextRank. LSA computes the semantic similarity between sentences of a bug report, capturing deeper, hidden relations between words. Sentence pairs whose semantic similarity exceeds a set threshold are identified, and only one sentence from each such pair is retained. The remaining sentences are passed to the TextRank algorithm, which further selects sentences with high similarity to generate a coherent summary. The proposed approach is evaluated on a newly constructed Apache Project Bug Report Corpus and on the existing Bug Report Corpus. It is also compared with baseline approaches that focus mainly on lexical similarity. On the Apache Project Bug Report Corpus, the approach attains average precision, recall, F-score, and pyramid precision of 80%, 72.57%, 76.05%, and 76.57%, respectively.
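The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the raw term-count weighting, the latent rank `k=2`, the threshold of 0.9, the damping factor of 0.85, and the reuse of the LSA cosine as the TextRank edge weight are all assumptions made for illustration, since the abstract does not specify them. The sketch assumes NumPy is available.

```python
import re
import numpy as np


def tokenize(sentence):
    # Lowercase alphanumeric tokens; a real system would also remove stop words.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def lsa_similarity(sentences, k=2):
    """Cosine similarity between sentences in a rank-k latent space (k is assumed)."""
    vocab = sorted({w for s in sentences for w in tokenize(s)})
    index = {w: i for i, w in enumerate(vocab)}
    # Term-by-sentence matrix of raw counts.
    A = np.zeros((len(vocab), len(sentences)))
    for j, s in enumerate(sentences):
        for w in tokenize(s):
            A[index[w], j] += 1.0
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(S))
    # Each row is one sentence projected onto the top-k latent dimensions.
    vecs = (np.diag(S[:k]) @ Vt[:k]).T
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    unit = vecs / norms
    return unit @ unit.T


def drop_redundant(sim, threshold=0.9):
    """Keep the earlier sentence of any pair whose similarity exceeds the threshold."""
    kept = []
    for i in range(sim.shape[0]):
        if all(sim[i, j] <= threshold for j in kept):
            kept.append(i)
    return kept


def textrank(sim, damping=0.85, iterations=50):
    """Weighted PageRank-style scores over a sentence similarity graph."""
    n = sim.shape[0]
    W = np.clip(sim, 0.0, None)       # negative cosines are treated as no edge
    np.fill_diagonal(W, 0.0)          # no self-loops
    col_sums = W.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    W = W / col_sums                  # normalize each node's outgoing weight
    scores = np.full(n, 1.0 / n)
    for _ in range(iterations):
        scores = (1 - damping) / n + damping * (W @ scores)
    return scores


def summarize(sentences, threshold=0.9, top_n=2, k=2):
    sim = lsa_similarity(sentences, k)
    kept = drop_redundant(sim, threshold)
    scores = textrank(sim[np.ix_(kept, kept)])
    best = sorted(range(len(kept)), key=lambda i: -scores[i])[:top_n]
    # Return the selected sentences in their original order.
    return [sentences[kept[i]] for i in sorted(best)]
```

Calling `summarize(report_sentences, threshold=0.9, top_n=3)` on a list of sentences would first discard near-duplicates and then return up to three of the highest-ranked remaining sentences in their original order. The redundancy filter here keeps the earlier sentence of each similar pair, which is one possible reading of "only one sentence is selected".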