Abstract
[Background] Recent research on mining app reviews for software evolution has indicated that the elicitation and analysis of user requirements can benefit from supplementing user reviews with data from other sources. However, only a few studies have reported results of leveraging app changelogs together with app reviews. [Aims] Motivated by these findings, this exploratory experimental study investigates the role of app changelogs in the classification of requirements derived from app reviews. We aim to understand whether using app changelogs leads to more accurate identification and classification of functional and non-functional requirements from app reviews. We also want to know which classification technique works best in this context. [Method] We conducted a case study on the effect of app changelogs on the automatic classification of app reviews. Specifically, we applied manual labeling, text preprocessing, and four supervised machine learning algorithms in a series of experiments that varied the number of app changelogs in the experimental data. [Results] We compared the accuracy of requirements classification by training the four classifiers on varying combinations of app reviews and changelogs. Among the four algorithms, Naïve Bayes was the most accurate at categorizing app reviews. [Conclusions] The results show that official app changelogs did not contribute to more accurate identification and classification of requirements from app reviews. In addition, Naïve Bayes appears to be the most suitable algorithm for our further research on this topic.
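For illustration, below is a minimal sketch of the kind of supervised pipeline the [Method] describes: text preprocessing followed by a Naïve Bayes classifier. It assumes a scikit-learn setup; the review texts, labels, and parameters are hypothetical placeholders, not the study's actual data, preprocessing, or configuration.

```python
# Sketch of a text-preprocessing + Naive Bayes pipeline for classifying
# app reviews into requirement categories. All data below is illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical manually labeled app reviews (functional vs. non-functional).
reviews = [
    "The app crashes every time I open the camera",   # functional
    "Please add an option to export notes as PDF",    # functional
    "Startup is painfully slow on older phones",      # non-functional
    "The login screen drains the battery quickly",    # non-functional
]
labels = ["functional", "functional", "non-functional", "non-functional"]

# TF-IDF vectorization stands in for the text-preprocessing step;
# MultinomialNB is the Naive Bayes variant commonly used for text.
clf = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
    ("nb", MultinomialNB()),
])
clf.fit(reviews, labels)

# Classify an unseen review.
print(clf.predict(["The app freezes when uploading photos"]))
```

In the study's design, the training set would additionally be augmented with varying numbers of app changelogs to test their effect on classification accuracy.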