Abstract

Community question answering sites receive a huge number of questions and answers every day. Many of these questions are later marked as closed by the site moderators. Such questions increase the moderators' workload and cause user dissatisfaction. This paper aims to predict whether a newly posted question will be marked as closed in the future and, if so, to give a tentative reason for the closure. Two approaches are compared: (1) a baseline model based on traditional machine learning techniques and (2) deep learning models, namely a convolutional neural network (CNN) and a long short-term memory (LSTM) network. Each classifies a question into one of five classes: (1) open, (2) off-topic, (3) not a real question, (4) not constructive and (5) too localized. The baseline model relies on handcrafted features and hence does not preserve semantics. In contrast, the CNN and LSTM networks are capable of preserving the semantics of a question's words and of extracting hidden features from the textual content using multiple hidden layers. The LSTM network performs better than both the CNN and the traditional machine learning models. The proposed model can be used as an initial filter to screen questions likely to be closed at the time of posting, which reduces the overhead of site moderators. To the best of our knowledge, this is the first work that predicts a closed question along with the reason the question will be closed. This helps the questioner to modify the question before posting. Experimental results on a Stack Overflow dataset demonstrate the effectiveness of the proposed model.

