Feature extraction of public English vocabulary text based on wavelet transform

Shengyan Qiu

doi:10.1504/ijris.2022.125453

Abstract

To address the low extraction accuracy and long processing time of existing feature extraction methods for public English vocabulary text, a feature extraction method based on the wavelet transform is proposed. The space in which the public English vocabulary text exists is defined, the text is converted into vector form, and the text vectors are clustered using a clustering and fusion method. The distances between vocabulary text vectors in this space are calculated to determine the similarity between vectors, and noise in the text data is removed using the information gain method. The data are then divided into several small bands of different sizes, the change frequency of the vocabulary text data in each band is determined, the eigenvalues of the different bands are integrated, and a wavelet transform is applied to the integrated data features. The result of the transform is taken as the text feature. The results show that the accuracy of the proposed method is 96%.
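
The vectorisation, similarity, and wavelet steps described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the bag-of-words vectors, the cosine similarity measure, the Haar wavelet, and all function names are assumptions chosen for illustration, since the abstract does not specify them. The clustering and information gain steps are omitted.

```python
import math
from collections import Counter


def bag_of_words(text, vocab):
    """Map a text to a term-count vector over a fixed vocabulary (assumed representation)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]


def cosine_similarity(u, v):
    """Similarity between two text vectors (assumed measure; the abstract only says 'distance')."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def haar_step(signal):
    """One level of the orthonormal Haar transform; len(signal) must be even."""
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_features(signal):
    """Full Haar decomposition, zero-padding the input to a power-of-two length."""
    n = 1
    while n < len(signal):
        n *= 2
    current = list(signal) + [0.0] * (n - len(signal))
    details = []
    while len(current) > 1:
        current, detail = haar_step(current)
        details = detail + details  # coarser detail bands go first
    return current + details


# Hypothetical toy documents, used only to exercise the functions.
docs = ["the cat sat on the mat", "the dog sat on the log"]
vocab = sorted({word for doc in docs for word in doc.split()})
vectors = [bag_of_words(doc, vocab) for doc in docs]
similarity = cosine_similarity(vectors[0], vectors[1])
features = [haar_features(vec) for vec in vectors]
```

Because the Haar transform used here is orthonormal, it preserves the energy (sum of squares) of each input vector, so the coefficients re-express the same information across coarse and fine bands rather than discarding any of it.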

Full Text