Quora Question Pairs Identification and Insincere Questions Classification

Sai Surya Teja Gontumukkala, Bhanu Rama Ravi Teja Gonugunta, Suja Palaniswamy, Yogeshwara Sai Varun Godavarthi, Deepa Gupta

doi:10.1109/icccnt54827.2022.9984492

Publication Date: Oct 3, 2022

Citations: 2

Affiliation: Amrita Vishwa Vidyapeetham University

Abstract

Quora is a question-answering site where people ask questions and reply to existing ones. This makes it a highly interactive platform, but it also faces two challenges: duplicate questions, which lead to ambiguity, and insincere questions, which degrade the value of the site. In this work, we propose a method to address both challenges using Natural Language Processing (NLP) and Deep Learning (DL) techniques. Five different word embeddings were used for both problems. A Bidirectional Long Short-Term Memory (BiLSTM) and Bidirectional Gated Recurrent Unit (BiGRU) architecture with an attention mechanism was used for insincere question classification, and a Siamese Manhattan Long Short-Term Memory (MaLSTM) architecture was used for question pair identification. The models were evaluated in terms of accuracy, precision, recall, and F1 score. For Quora Question Pairs Identification, the highest accuracy (90%) and highest F1 score (0.89) were achieved with Paraphrase-MiniLM-L6-v2 + Siamese MaLSTM. For Insincere Questions Classification, the highest accuracy (95%) and highest F1 score (0.82) were achieved with FastText + BiLSTM + BiGRU. Compared with results reported in the literature, our models outperformed the baseline models.
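The abstract does not spell out how the Siamese MaLSTM compares two questions. In the commonly cited MaLSTM formulation, the two encoder outputs are scored with exp(-L1 distance), which yields a value in (0, 1]. The sketch below illustrates only that scoring step, assuming the two question encodings are already available as plain lists of floats; the function names and the 0.5 decision threshold are hypothetical and are not taken from the paper.

```python
import math


def manhattan_similarity(h_left, h_right):
    """Return exp(-L1 distance) between two equal-length encodings.

    The result lies in (0, 1]; identical encodings score exactly 1.0.
    """
    if len(h_left) != len(h_right):
        raise ValueError("encodings must have the same length")
    l1_distance = sum(abs(a - b) for a, b in zip(h_left, h_right))
    return math.exp(-l1_distance)


def is_duplicate(h_left, h_right, threshold=0.5):
    """Label a pair as duplicate when similarity reaches the threshold.

    The threshold value is an assumption for illustration only.
    """
    return manhattan_similarity(h_left, h_right) >= threshold
```

In a full model the encodings would come from a shared LSTM applied to both questions; here they are stand-ins so that the similarity step can be read in isolation.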
