Abstract
This paper studies feature extraction algorithms for Web text. Building on an analysis of the mutual information (MI), document frequency (DF), information gain (IG), and χ² statistic (CHI) methods, and exploiting their complementary strengths, it proposes a multiple combination feature extraction algorithm based on principal component analysis (PCA-MCFEA). First, the orthogonal transformation of PCA is used to quickly reduce the dimensionality of the text feature space. Then, in the resulting lower-dimensional feature space, the multiple combination feature extraction algorithm quickly extracts the more representative features and filters out weakly representative feature items. Finally, an SVM classifier is used to classify the text. The experimental results show that PCA-MCFEA effectively improves both text classification accuracy and running efficiency.
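As a rough illustration of the four scoring methods named in the abstract, the sketch below computes DF, MI, IG, and CHI for one term and one class from a 2×2 document count table. It uses the common textbook definitions of these measures, which may differ in detail from the paper's formulation. The abstract does not specify how PCA-MCFEA combines the scores or how the PCA step is applied, so neither is shown here. The function names (`contingency`, `entropy`, `scores`) and the toy corpus are assumptions made for this example only.

```python
import math

def contingency(docs, labels, term, cls):
    """Count documents by (term present?, in class?).

    Returns (a, b, c, d):
      a = in class, contains term      b = not in class, contains term
      c = in class, lacks term         d = not in class, lacks term
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == cls
        if has_term and in_class:
            a += 1
        elif has_term:
            b += 1
        elif in_class:
            c += 1
        else:
            d += 1
    return a, b, c, d

def entropy(counts):
    """Shannon entropy (base 2) of a list of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts if n > 0)

def scores(a, b, c, d):
    """Textbook DF, MI, IG, and CHI scores from the 2x2 counts above."""
    n = a + b + c + d
    df = a + b  # number of documents containing the term
    # Pointwise mutual information between the term and the class.
    mi = math.log2(a * n / ((a + c) * (a + b))) if a > 0 else float("-inf")
    # Chi-square statistic for independence of term and class.
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    chi = n * (a * d - c * b) ** 2 / denom if denom else 0.0
    # Information gain: class entropy minus entropy conditioned on the term.
    ig = entropy([a + c, b + d]) - (
        (a + b) / n * entropy([a, b]) + (c + d) / n * entropy([c, d])
    )
    return {"DF": df, "MI": mi, "IG": ig, "CHI": chi}

# Toy corpus (hypothetical): each document is a set of tokens.
docs = [
    {"goal", "match"},
    {"goal", "team"},
    {"stock", "market"},
    {"market", "trade"},
]
labels = ["sport", "sport", "finance", "finance"]

counts = contingency(docs, labels, "goal", "sport")
result = scores(*counts)
print(counts, result)
```

In this toy corpus the term "goal" appears in every "sport" document and in no "finance" document, so all four measures rank it as highly discriminative. On real data the measures often disagree on rare or very frequent terms, which is the motivation the abstract gives for combining them.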
More From: DEStech Transactions on Computer Science and Engineering