Parts-of-Speech (PoS) Analysis and Classification of Various Text Genres

Akshay Mendhakar, Darshan H S

doi:10.1515/csh-2023-0002

Abstract

Natural language processing (NLP) has made significant leaps over the past two decades owing to advances in machine learning algorithms. Text classification has become pivotal because of the wide range of digital documents now available, and researchers have proposed many feature classes for it. Genre classification underpins more advanced tasks such as native language identification, readability assessment, and author identification, all of which depend on the linguistic composition and complexity of the text. Rather than extracting hundreds of variables, this study presents a simple premise: text classification using parts-of-speech (PoS) as the only text feature. A new dataset gathered from Project Gutenberg is introduced. PoS analysis was carried out on each text in the dataset, and the texts were then grouped into fictional and non-fictional texts to measure classification accuracy with an artificial neural network (ANN) classifier. The results indicate an overall classification accuracy of 98 % for genre classification and 35 % for sub-genre classification. These results highlight the importance of PoS not only as a feature for text processing but also as a sole feature for text classification.
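As a rough illustration of what using PoS as the only feature could look like, the sketch below turns one already-tagged text into a vector of relative tag frequencies. It is not the authors' pipeline: the abstract does not name the tagger, the tagset, or the exact feature encoding, so the tag inventory `TAGSET`, the function `pos_profile`, and the sample sentence are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical tag inventory (a universal-style coarse tagset).
# The tagset actually used in the study is not stated in the abstract.
TAGSET = ["NOUN", "VERB", "ADJ", "ADV", "PRON", "DET",
          "ADP", "CONJ", "NUM", "PRT", "X", "."]


def pos_profile(tagged_tokens, tagset=TAGSET):
    """Return the relative frequency of each tag in `tagset` for one text.

    `tagged_tokens` is an iterable of (word, tag) pairs, as produced by
    any PoS tagger. Tags outside `tagset` are ignored.
    """
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts[tag] for tag in tagset)
    if total == 0:
        # Empty text or no recognised tags: return an all-zero vector.
        return [0.0] * len(tagset)
    return [counts[tag] / total for tag in tagset]


# Hand-tagged toy sentence (illustrative only, not from the dataset).
sample = [("The", "DET"), ("old", "ADJ"), ("ship", "NOUN"),
          ("sailed", "VERB"), ("slowly", "ADV"), (".", ".")]

profile = pos_profile(sample)
```

One such fixed-length vector per text could then serve as the input to a classifier such as an ANN, with the genre (fiction or non-fiction) as the label.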

Journal: Corpus-based Studies across Humanities	Publication Date: Dec 21, 2023
Citations: 1	License type: CC BY 4.0
