Abstract
Automated sentiment analysis of software engineering textual artifacts has long suffered from the inaccuracy of the few tools available for the purpose. We conduct an in-depth qualitative study to identify the difficulties responsible for this low accuracy. We then address the majority of the exposed difficulties by building a domain dictionary and appropriate heuristics. These domain-specific techniques are realized in SentiStrength-SE, a tool we have developed for improved sentiment analysis of text, designed specifically for the software engineering domain. Using a benchmark dataset of 5,600 manually annotated JIRA issue comments, we carry out both qualitative and quantitative evaluations of our tool. We also separately evaluate the contributions of its individual major components (i.e., the domain dictionary and the heuristics). The empirical evaluations confirm that the domain specificity exploited in SentiStrength-SE enables it to substantially outperform the existing domain-independent tools/toolkits (SentiStrength, NLTK, and Stanford NLP) in detecting sentiments in software engineering text.