Abstract

Sentiment analysis is increasingly used to study software developers' emotions by mining crowd-generated content in software repositories and other information sources. With a few notable exceptions [1][5], empirical software engineering studies have relied on off-the-shelf sentiment analysis tools. However, such tools were trained on non-technical domains and general-purpose social media, which leads them to misclassify technical jargon and problem reports [2][4]. In particular, Jongeling et al. [2] show that the choice of sentiment analysis tool may affect the conclusion validity of empirical studies: these tools not only disagree with human annotation of developers' communication channels, but also disagree among themselves. Our goal is to move beyond the limitations of off-the-shelf sentiment analysis tools when applied in the software engineering domain. Accordingly, we present Senti4SD, a sentiment polarity classifier for software developers' communication channels. Senti4SD exploits a suite of lexicon-based, keyword-based, and semantic features to deal appropriately with the domain-dependent use of a lexicon. To derive the semantic features, we built a Distributional Semantic Model (DSM). Specifically, we ran word2vec [3] on a collection of over 20 million documents from Stack Overflow, obtaining word vectors that are representative of developers' communication style. The classifier is trained and validated on a gold standard of 4,423 Stack Overflow posts, including questions, answers, and comments, which were manually annotated for sentiment polarity. We release the full lab package, which includes both the gold standard and the emotion annotation guidelines, to ease replications as well as new studies on emotion awareness in software engineering. To inform future research on word embedding for text categorization and information retrieval in software engineering, the replication kit also includes the DSM.

We assess the contribution of the lexicon-based, keyword-based, and semantic features through an empirical evaluation that compares different feature settings. Compared with SentiStrength [6], a mainstream off-the-shelf tool that we use as a baseline, Senti4SD reduces the misclassification of neutral and positive posts as emotionally negative. Furthermore, we provide empirical evidence that it performs better even when only a minimal set of training documents is available.
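
As a rough illustration of how word vectors from a DSM could be turned into semantic features, the sketch below averages the vectors of the words in a post and compares the result, by cosine similarity, with prototype vectors built from seed words for each polarity. This is a minimal sketch under stated assumptions, not the Senti4SD implementation: the abstract does not specify how the semantic features are computed, and the toy three-dimensional vectors, the seed word lists, and the function names (`mean_vector`, `cosine`, `semantic_features`) are all hypothetical stand-ins for a trained word2vec model.

```python
import math

# Hypothetical toy word vectors standing in for a DSM trained on Stack Overflow.
TOY_VECTORS = {
    "great": [0.9, 0.1, 0.0],
    "thanks": [0.8, 0.2, 0.1],
    "awful": [0.0, 0.9, 0.1],
    "hate": [0.1, 0.8, 0.0],
    "kill": [0.1, 0.2, 0.9],
    "process": [0.0, 0.1, 0.9],
}

# Hypothetical seed words used to build one prototype vector per polarity.
SEEDS = {"positive": ["great", "thanks"], "negative": ["awful", "hate"]}


def mean_vector(words, vectors):
    """Average the vectors of the known words; return None if none are known."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_features(post, vectors, seeds):
    """Similarity of the post vector to each polarity prototype vector."""
    post_vec = mean_vector(post.lower().split(), vectors)
    features = {}
    for label, seed_words in seeds.items():
        proto = mean_vector(seed_words, vectors)
        features[label] = cosine(post_vec, proto) if post_vec and proto else 0.0
    return features


# Technical jargon such as "kill" sits far from both polarity prototypes here.
features = semantic_features("kill the process", TOY_VECTORS, SEEDS)
print(features)
```

In this toy setup, a post containing the jargon word "kill" ends up with low similarity to both prototypes, which is the kind of domain-dependent usage that a general-purpose lexicon would tend to flag as negative. In practice, such similarity scores would be only one group of inputs to a supervised classifier, alongside the lexicon-based and keyword-based features.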

