Text Difficulty Classification by Combining Machine Learning and Language Features

Han Ding, Qiyu Zhong, Liu Yang, Shaohong Zhang

doi:10.1007/978-3-030-89698-0_108

Abstract

In recent years, deep learning algorithms have dominated research on document readability prediction. Traditional methods rely excessively on manual feature extraction, while modern deep learning algorithms are time-consuming when extracting deep features. In addition, common datasets for readability studies in English language education are scarce. To address these problems, we propose an improved approach to corpus construction and build a more reasonable dataset with five difficulty levels (English Teaching Texts in China, CETT) by purposefully integrating existing datasets and adding missing ones. Based on CETT, we extract three kinds of text features: word frequency features, linguistic difficulty features, and in-depth features. We compare the effectiveness of different feature combinations for difficulty prediction under multiple classifiers. The final experimental results show that fusing linguistic difficulty features with in-depth features reaches an accuracy of 0.9011 on the text difficulty assessment task, and that the fused features yield a significant overall improvement over individual features.

Keywords: Machine learning, Readability assessment, Transformer, Difficulty classification
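The feature fusion described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the three surface features, the placeholder numbers standing in for Transformer-derived in-depth features, fusion by plain concatenation, and the nearest-centroid classifier are not taken from the paper, which does not specify these details in the abstract.

```python
import math
import re
from collections import defaultdict


def linguistic_features(text):
    """Hand-crafted surface features (hypothetical, not the paper's feature set)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    avg_sentence_len = len(words) / max(len(sentences), 1)
    avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
    type_token_ratio = len(set(words)) / max(len(words), 1)
    return [avg_sentence_len, avg_word_len, type_token_ratio]


def fuse(linguistic, deep):
    """Fuse two feature vectors by simple concatenation (an assumed fusion rule)."""
    return list(linguistic) + list(deep)


class NearestCentroid:
    """Tiny stand-in classifier: assigns the label of the closest class mean."""

    def fit(self, X, y):
        sums = {}
        counts = defaultdict(int)
        for x, label in zip(X, y):
            if label not in sums:
                sums[label] = [0.0] * len(x)
            sums[label] = [a + b for a, b in zip(sums[label], x)]
            counts[label] += 1
        self.centroids = {
            label: [v / counts[label] for v in total]
            for label, total in sums.items()
        }
        return self

    def predict(self, X):
        return [
            min(self.centroids, key=lambda lab: math.dist(x, self.centroids[lab]))
            for x in X
        ]


easy = "The cat sat. The dog ran. I like it."
hard = ("Notwithstanding considerable methodological heterogeneity, contemporary "
        "investigations consistently demonstrate substantial improvements.")

# Placeholder "deep" vectors; a real system would use learned embeddings here.
X_train = [
    fuse(linguistic_features(easy), [0.0, 0.0]),
    fuse(linguistic_features(hard), [1.0, 1.0]),
]
y_train = [1, 5]  # difficulty levels on an assumed 1-to-5 scale

clf = NearestCentroid().fit(X_train, y_train)
query = fuse(linguistic_features("A bird flew. It sang."), [0.0, 0.0])
print(clf.predict([query]))  # prints [1]
```

Concatenation keeps each feature family intact, so the same fused vectors can be passed to different classifiers and compared against vectors built from a single family.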


