Abstract

As the most basic element of English learning, vocabulary has always been a focus of college English teaching, yet the results are often unsatisfactory. In this paper, a genetic algorithm is integrated into the fitness-function design of the K-medoids algorithm to form K-GA-medoids, which is then combined with KNN into an algorithmic framework for English vocabulary classification. Classification proceeds in two stages, clustering followed by classification, so that the training set presented to KNN is reduced and the computational overhead drops accordingly. Experiments show that K-GA-medoids clusters markedly better than traditional K-medoids, and that the combination of K-GA-medoids and KNN improves the efficiency of English vocabulary classification over the traditional KNN algorithm while preserving classification accuracy. We also found that students in college English courses regard word memorization as a difficult task, that traditional vocabulary teaching methods are not very effective, and that etymological knowledge is little known and rarely covered in classroom lectures. The article therefore explores new ideas and strategies for teaching vocabulary in college English from the perspective of etymology.
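
The two-stage idea sketched in the abstract can be illustrated with a minimal sketch, not the authors' implementation: the function names (`reduce_training_set`, `knn_classify`) and the choice of keeping the `top_m` nearest clusters are assumptions made purely for illustration. The training set is first clustered (here the medoids and cluster labels are taken as given), the samples belonging to clusters far from the query word are dropped, and plain KNN voting runs on what remains.

```python
import numpy as np

def reduce_training_set(X, y, medoids, cluster_labels, query, top_m=2):
    """Keep only samples whose clusters have the top_m medoids
    closest to the query word vector (illustrative names and logic)."""
    dist_to_medoids = np.linalg.norm(medoids - query, axis=1)
    nearest_clusters = np.argsort(dist_to_medoids)[:top_m]
    mask = np.isin(cluster_labels, nearest_clusters)
    return X[mask], y[mask]

def knn_classify(X, y, query, k=5):
    """Majority vote among the k nearest samples of the (reduced) training set."""
    dist = np.linalg.norm(X - query, axis=1)
    nearest = np.argsort(dist)[:k]
    values, counts = np.unique(y[nearest], return_counts=True)
    return values[np.argmax(counts)]
```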

Highlights

  • Vocabulary is the most fundamental element in the learning of almost any language

  • K-medoids is an optimization of the K-means algorithm that greatly reduces the influence of noise and outliers on the final clustering result, but it remains sensitive to the selection of the initial cluster centers. The proposed framework clusters the training sample set in a first stage and, during classification, reduces the training set according to the distance between the English word to be classified and the cluster centers, overcoming the severe loss of efficiency that the traditional KNN algorithm suffers when the sample size is large [26]

  • Here Ci is the ith cluster of the clustering result, Ni is the size of the ith cluster, Mi denotes its cluster center, |P − Mi| is the distance from sample P to the cluster center Mi, and Si is the average of the distances from all sample points in the ith cluster to that center [27] (a short sketch of this quantity follows the list)
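
A minimal sketch of the quantity named in the last highlight, under the assumption that words are represented as numeric feature vectors; the function name is illustrative. Si is simply the mean distance from the members of cluster Ci to its center Mi, so smaller values indicate tighter clusters, which is the kind of term a genetic algorithm's fitness function can reward.

```python
import numpy as np

def average_intra_cluster_distance(cluster_points, medoid):
    """S_i: mean of |P - M_i| over all samples P in cluster C_i."""
    distances = np.linalg.norm(cluster_points - medoid, axis=1)
    return distances.mean()
```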

Introduction

Vocabulary is the most fundamental element in the learning of almost any language. For second language learners, vocabulary directly determines their language proficiency and ability [1]. Siregar [15] systematically proposes vocabulary teaching strategies and argues that understanding the etymology of words aids comprehension and acquisition. To address the heavy computational overhead that traditional KNN incurs when the training set is large or the sample dimension is high, the authors of [17] proposed CKNN, a convolutional neural network-based English vocabulary classification algorithm that first extracts more abstract feature values from short English words and then carries out the classification. Kumar et al. [19] proposed a rough-set-based KNN algorithm that reduces the dimensionality of the sample-space vectors through attribute reduction and improves the efficiency of categorizing English words. In the standard KNN scheme, a word to be classified is assigned to the category with the largest weight among the K nearest samples in the training set [25].
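
For reference, the baseline decision rule mentioned above, assigning a word to the category with the largest weight among its K nearest training samples [25], can be written with scikit-learn's distance-weighted KNN. The toy vectors and category names below are stand-ins; a real system would obtain word feature vectors from a separate extraction step.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy stand-ins for English word feature vectors and their categories
X_train = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
y_train = np.array(["everyday", "everyday", "academic", "academic"])

# Distance-weighted vote among the K nearest training samples
knn = KNeighborsClassifier(n_neighbors=3, weights="distance")
knn.fit(X_train, y_train)
print(knn.predict([[0.85, 0.85]]))  # -> ['academic']
```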
