A Trade-off between ML and DL Techniques in Natural Language Processing

Bhavesh Singh, Neha Katre, Parth Tank, Himanshu Ashar, Rahil Desai

doi:10.1088/1742-6596/1831/1/012025

Abstract

Natural Language Processing covers a range of tasks, such as classification, text generation, and language modeling. Text is first converted into numerical form using word embeddings or vectorizers, and models are then trained on that representation using Machine Learning and Deep Learning algorithms. To observe the trade-off between these two types of algorithms with respect to the data available, the accuracy obtained, and other factors, a binary classification task is undertaken to distinguish between insincere and regular questions on Quora. The Quora Insincere Questions Classification dataset was used to train various machine learning and deep learning models. A Bidirectional Long Short-Term Memory (LSTM) network was trained, with the text processed using Global Vectors for Word Representation (GloVe). Machine Learning algorithms such as the Extreme Gradient Boosting classifier, Gaussian Naive Bayes, and the Support Vector Classifier (SVC) were trained on text processed with the TF-IDF vectorizer. This paper also presents an evaluation of the above algorithms on the basis of the precision, recall, and F1 score metrics.
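To make the TF-IDF weighting and the evaluation metrics named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' pipeline: the function names `tfidf` and `precision_recall_f1`, the toy questions, and the toy labels are all hypothetical and do not come from the Quora dataset. The weighting is the textbook form, term frequency times log(N / document frequency); library vectorizers such as scikit-learn's `TfidfVectorizer` apply smoothing and normalization by default, so their values will differ.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document (textbook tf * log(N / df))."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the class labeled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy questions (illustrative only, not taken from the dataset).
questions = [
    "how do i learn python",
    "why are people so stupid",
    "how do people learn",
]
vectors = tfidf(questions)

# Hypothetical labels: 1 = insincere, 0 = regular.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Reporting precision, recall, and F1 for the insincere class is a reasonable choice when one class may be much rarer than the other, since accuracy alone can look high for a classifier that rarely predicts the minority class.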

Full Text