Abstract
Feedback from software users is vital for engineering better software requirements. One technique for extracting requirements from online user feedback is clustering, which surfaces the most frequently mentioned topics by grouping similar feedback together. To make these topics understandable, previous work has summarized clusters using characterizing phrases or sentences. This work evaluates which characterization method (unigrams, bigrams, trigrams, or sentences) most effectively conveys the semantic meaning of a whole cluster, using feedback from multiple feedback sources. We evaluate each method's ability to create distinct, descriptive characterizations, and we further evaluate the number of requirements-relevant characterizations each method creates. We find that unigrams, bigrams, trigrams, and full sentences all perform similarly in distinguishing clusters from each other. However, fewer but more expressive characterizations, such as full sentences, capture more requirements-relevant information from a feedback cluster than more numerous but less expressive unigrams, meaning that a sentence better summarizes the important requirements-relevant information in a cluster. Our findings inform the future development of user feedback clustering tools and provide the first quantitative comparison of different cluster characterization methods.
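To make the idea of n-gram characterization concrete, the following is a minimal sketch of how a cluster might be summarized by its most widespread unigrams, bigrams, or trigrams. It is an illustration under stated assumptions, not the method evaluated in this work: the function names `ngrams` and `characterize`, the toy cluster, the simple regex tokenizer, and the choice to rank n-grams by how many feedback items contain them are all hypothetical.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Split text into lowercase word tokens and return its n-grams as strings."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def characterize(cluster, n, k=3):
    """Return the k n-grams that appear in the most feedback items of a cluster.

    Assumption: ranking by item frequency is one plausible selection rule;
    ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter()
    for item in cluster:
        counts.update(set(ngrams(item, n)))  # count each n-gram once per item
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [gram for gram, _ in ranked[:k]]


# Hypothetical feedback cluster, invented for illustration only.
cluster = [
    "The app crashes when I upload a photo",
    "App crashes every time I upload a photo",
    "Upload a photo and the app crashes",
]

for n in (1, 2, 3):
    print(n, characterize(cluster, n))
```

On this toy cluster, the top-ranked trigram is "upload a photo", while the top unigrams include less informative tokens such as "a", which loosely mirrors the contrast between numerous, less expressive characterizations and fewer, more expressive ones.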