Multi-view approach to suggest moderation actions in community question answering sites

Issa Annamoradnejad,Jafar Habibi,Mohammadamin Fazli

doi:10.1016/j.ins.2022.03.085

Abstract

With thousands of new questions posted every day on popular Q&A websites, there is a need for automated and accurate software solutions to replace manual moderation. In this paper, we address the critical drawbacks of crowdsourcing moderation actions in Q&A communities and demonstrate the ability to automate moderation using the latest machine learning models. From a technical point, we propose a multi-view approach that generates three distinct feature groups that examine a question from three different perspectives: 1) question-related features extracted using a BERT-based regression model; 2) context-related features extracted using a named-entity-recognition model; and 3) general lexical features derived using statistical and analytical methods. As a last step, we train a gradient boosting classifier to predict a moderation action. For evaluation purposes, we created a new dataset consisting of 60,000 Stack Overflow questions classified into three choices of moderation actions. Based on cross-validation on the novel dataset, our approach reaches 95.6% accuracy as a multiclass task and outperforms all state-of-the-art and previously-published models. Our results clearly demonstrate the high influence of our feature generation components on the overall success of the classifier.

Full Text