Abstract

News categorization is the task of automatically assigning news articles or headlines to a particular class. The growing use of social media and Web 2.0 platforms has produced a substantial volume of online text. Most of this text is unstructured, which makes it hard and time-consuming to organize, manipulate, and manage. Because it is fast and cost-effective, automatic news classification has attracted increasing attention from news agencies in recent years. This paper introduces a deep learning-based framework that uses a multilayer perceptron (MLP) to classify Bengali news articles and headlines into four categories: accident, crime, entertainment, and sports. Because no Bengali news corpus was available, this work also developed a dataset containing 76,343 news articles and 76,343 headlines. Additionally, this work investigates the performance of the proposed classifier with five word embedding techniques. The comparative analysis reveals that the MLP with a Keras embedding layer outperformed the other embedding models, achieving the highest accuracy of 98.18% (news articles) and 94.53% (news headlines).

Keywords: Natural language processing, Text classification, News categorization, Deep learning, News corpus
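As a rough illustration of the kind of pipeline the abstract describes, the sketch below runs a single forward pass through a trainable-style embedding lookup followed by a small MLP and a softmax over the four categories. It is a minimal sketch under stated assumptions, not the authors' implementation: the vocabulary size, layer widths, mean pooling of token vectors, random untrained weights, and the example token ids are all hypothetical, and plain Python stands in for the Keras layers named in the abstract.

```python
import math
import random

# The four target classes named in the abstract.
CATEGORIES = ["accident", "crime", "entertainment", "sports"]


def make_matrix(rows, cols, rng):
    """Small random weight matrix (stand-in for trained parameters)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def dense(vec, weights, bias):
    """Fully connected layer: vec (n) times weights (n x m) plus bias (m)."""
    return [
        sum(v * weights[i][j] for i, v in enumerate(vec)) + bias[j]
        for j in range(len(bias))
    ]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(vocab_size=1000, embed_dim=16, hidden_dim=8, seed=0):
    """Hypothetical sizes; the abstract does not state the real ones."""
    rng = random.Random(seed)
    return (
        make_matrix(vocab_size, embed_dim, rng),        # embedding table
        make_matrix(embed_dim, hidden_dim, rng),        # hidden layer weights
        [0.0] * hidden_dim,                             # hidden layer bias
        make_matrix(hidden_dim, len(CATEGORIES), rng),  # output layer weights
        [0.0] * len(CATEGORIES),                        # output layer bias
    )


def forward(token_ids, params):
    """Embedding lookup, mean pooling (an assumption), ReLU hidden layer, softmax."""
    embedding, w1, b1, w2, b2 = params
    vectors = [embedding[t] for t in token_ids]
    pooled = [sum(col) / len(vectors) for col in zip(*vectors)]
    hidden = [max(0.0, h) for h in dense(pooled, w1, b1)]
    return softmax(dense(hidden, w2, b2))


# Hypothetical token ids standing in for a tokenized Bengali headline.
params = init_params()
probs = forward([12, 407, 33, 5], params)
predicted = CATEGORIES[probs.index(max(probs))]
```

With untrained weights the predicted label is arbitrary; the point is only the data flow from token ids to a probability distribution over the four classes. In a trained model, the embedding table and both dense layers would be learned jointly from the labeled articles or headlines.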
