Abstract

As the volume of text data grows daily, the need for feature reduction in text classification has increased sharply. This paper presents two text classification systems: the concept-based mining model using threshold (CMMT) and the fuzzy similarity-based concept mining model using feature clustering (FSCMM-FC). Both systems classify English text documents into pre-defined, mutually exclusive categories. Each system preprocesses the documents at the sentence, document, and integrated corpora levels; applies feature extraction and reduction; trains the classifier; and finally classifies the documents using a support vector machine. CMMT discards less frequent features by applying a threshold to the extracted features, whereas FSCMM-FC reduces the features by finding feature points using fuzzy C-means. In the experiments, CMMT and FSCMM-FC achieved feature reduction of 95.8% and 94.695%, respectively, and classification accuracy of 85.41% and 93.43%, respectively. These results indicate that FSCMM-FC substantially outperforms CMMT in classification accuracy while reducing the feature set, and hence memory usage, by a comparable amount.
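
The thresholding idea behind CMMT can be illustrated with a minimal sketch. It assumes that a feature is a whitespace-separated term and that its score is its corpus-wide count; the function name `reduce_features_by_threshold`, the cutoff `min_count`, and the toy documents are hypothetical, and the abstract does not specify the actual concept-based features or threshold value used in the paper.

```python
from collections import Counter


def reduce_features_by_threshold(documents, min_count):
    """Keep terms whose corpus-wide count is at least min_count.

    Returns the kept terms (sorted) and the fraction of features removed.
    Plain term counts are an assumption made for illustration only.
    """
    counts = Counter(term for doc in documents for term in doc.lower().split())
    kept = sorted(term for term, n in counts.items() if n >= min_count)
    reduction = 1.0 - len(kept) / len(counts) if counts else 0.0
    return kept, reduction


# Toy corpus (hypothetical): 10 distinct terms, 2 of which occur twice.
docs = [
    "fuzzy clustering reduces features",
    "threshold reduces features",
    "support vector machine classifies documents",
]
kept, reduction = reduce_features_by_threshold(docs, min_count=2)
print(kept)                 # terms that survive the cutoff
print(f"{reduction:.1%}")   # share of features removed
```

Raising `min_count` removes more features at the risk of discarding rare but discriminative terms, which is the trade-off that a clustering-based reduction such as the one in FSCMM-FC approaches differently.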
