Readability Analysis of Bengali Literary Texts

Shanta Phani, Shibamouli Lahiri, Arindam Biswas

doi:10.1080/09296174.2018.1499456

Journal: Journal of Quantitative Linguistics
Publication Date: Sep 24, 2018
Citations: 4

Affiliation: Indian Institute of Engineering Science and Technology, Shibpur

#Model For Readability #Mean Squared Error

Abstract

In this paper we propose a set of novel regression models for readability scoring in the Bengali language; the models can also be used for Hindi. They make use of several lexical, surface-level, syntactic and semantic features. We perform 5-fold and leave-one-out cross-validation on a human-annotated gold-standard dataset of 30 passages written by four eminent Bengali litterateurs. On this dataset, our best model achieves a mean squared error (MSE) of 57%, which is better than state-of-the-art results (73% MSE). We further perform feature analysis to identify potentially useful features for learning a regression model of Bengali readability. Ablation studies indicate the importance of compound characters (Juktakkhors) in readability assessment.
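
The evaluation protocol described in the abstract (5-fold and leave-one-out cross-validation, scored by MSE) can be illustrated with a minimal sketch. This is not the authors' model or data: the regressor is plain least squares, the 30 "passages" are synthetic, and the three feature columns (mean word length, proportion of compound characters, mean sentence length) are hypothetical stand-ins chosen only to make the example concrete.

```python
import random

def fit_ols(X, y, ridge=1e-8):
    """Fit least-squares weights (intercept first) via the normal equations.
    The tiny ridge term is an assumption added only for numerical stability."""
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    d = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(d)]
    # Gaussian elimination with partial pivoting
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, d):
            f = A[r][col] / A[col][col]
            for c in range(col, d):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution
    w = [0.0] * d
    for i in reversed(range(d)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, d))) / A[i][i]
    return w

def predict(w, x):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def cross_val_mse(X, y, k, seed=0):
    """Pooled mean squared error over k folds; k == len(y) is leave-one-out."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    sq_errors = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        w = fit_ols([X[i] for i in train], [y[i] for i in train])
        sq_errors.extend((predict(w, X[i]) - y[i]) ** 2 for i in test)
    return sum(sq_errors) / len(sq_errors)

# Synthetic placeholder data: 30 passages, 3 hypothetical features each.
rng = random.Random(42)
X = [[rng.uniform(3, 8), rng.uniform(0, 0.3), rng.uniform(5, 30)]
     for _ in range(30)]
# Hypothetical readability scores: a linear signal plus Gaussian noise.
y = [1.0 + 0.5 * a + 4.0 * b + 0.1 * c + rng.gauss(0, 0.5) for a, b, c in X]

mse_5fold = cross_val_mse(X, y, k=5)
mse_loo = cross_val_mse(X, y, k=len(y))
print(f"5-fold MSE: {mse_5fold:.3f}")
print(f"Leave-one-out MSE: {mse_loo:.3f}")
```

With only 30 passages, leave-one-out uses 29 passages for training in every round, while 5-fold trains on 24 and tests on 6. An ablation in the sense of the abstract would repeat the same procedure with one feature column removed and compare the resulting MSE.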

Full Text