English Word Difficulty Classifier Based on Random Forest Model

doi:10.25236/ajcis.2023.060622

Open Access

Journal: Academic Journal of Computing & Information Science
Publication Date: Jan 1, 2023
Citations: 1

#Difficulty Of Words #Random Forest Model

Abstract

Wordle, a daily puzzle game published by The New York Times, has recently become popular worldwide. Players try to solve the puzzle by guessing a five-letter word in six tries or fewer. Using Wordle's statistical data, this paper first applies the K-means algorithm to cluster solution words by difficulty, which quantifies the difficulty of English words, and then analyzes the accuracy and validity of the clustering results. Next, the paper uses a Random Forest model to classify word difficulty into three categories: ‘easy’, ‘normal’ and ‘hard’. The results show that the classification accuracy reaches 0.972 on the training set and 0.978 on the test set.
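
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say which features were clustered, so the sketch assumes a single feature, the mean number of tries per word, with a failed game counted as 7 tries. The function names (`mean_tries`, `kmeans_1d`, `label_words`) and the sample distributions are hypothetical and chosen for illustration only; the numbers are not real Wordle statistics.

```python
# Minimal sketch: 1-D K-means over an assumed difficulty score (mean tries).
# All data below is made up for illustration; it is not real Wordle data.

def mean_tries(dist):
    """Average number of guesses for one word.

    dist holds the share of players finishing in 1..6 tries, followed by
    the share that failed (assumption: a failure is counted as 7 tries).
    """
    total = sum(dist)
    return sum((i + 1) * share for i, share in enumerate(dist)) / total


def kmeans_1d(values, k=3, max_iters=100):
    """Plain Lloyd's algorithm on scalars; returns centroids in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start: evenly spaced points of the sorted data.
    centroids = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        updated = [
            sum(c) / len(c) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def label_words(scores, names=("easy", "normal", "hard")):
    """Map each word to the name of its nearest centroid (low score = easy)."""
    centroids = kmeans_1d(list(scores.values()), k=len(names))
    labels = {}
    for word, score in scores.items():
        nearest = min(range(len(centroids)), key=lambda j: abs(score - centroids[j]))
        labels[word] = names[nearest]
    return labels


# Hypothetical percentage distributions: 1..6 tries, then failures.
SAMPLE = {
    "train": [1, 9, 30, 35, 18, 6, 1],
    "point": [1, 8, 28, 36, 19, 7, 1],
    "woken": [0, 4, 18, 33, 28, 14, 3],
    "gloat": [0, 5, 20, 34, 26, 12, 3],
    "parer": [0, 1, 6, 18, 30, 30, 15],
    "mummy": [0, 1, 8, 20, 31, 28, 12],
}

scores = {word: mean_tries(dist) for word, dist in SAMPLE.items()}
labels = label_words(scores)
for word in SAMPLE:
    print(word, round(scores[word], 2), labels[word])
```

Labels produced this way could then serve as training targets for a classifier over word-level features, for example scikit-learn's `RandomForestClassifier`. The choice of features for that step is left open here because the abstract does not specify them.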

Full Text