Abstract

The Centroid-Based Classifier (CBC) is one of the most widely used text classification methods due to its theoretical simplicity and computational efficiency. However, its accuracy is unsatisfactory on data with a skewed distribution. In this paper, we propose a new classification model, the Gravitation Model (GM), to address this model misfit of CBC. In the proposed model, each category is given a mass factor that indicates its distribution in vector space; this factor can be learned from training data. We compare GM with CBC and its improved variants in experiments on twelve real datasets, which show that the proposed gravitation model consistently outperforms CBC. Furthermore, it matches the performance of the best centroid-based classifier while being more stable.
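
The abstract does not state the scoring formula or the learning rule, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a hypothetical score of mass times cosine similarity to the class centroid, and a simple mistake-driven update for the masses; the function names, the `step` and `epochs` parameters, and the update rule are all illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def centroids(X, y):
    """Mean vector of the training samples of each class, as in a plain CBC."""
    sums, counts = {}, {}
    for vec, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def predict(vec, cents, mass):
    """Pick the class with the largest mass-weighted similarity.

    ASSUMPTION: score = mass * cosine similarity. With all masses equal
    this reduces to an ordinary centroid-based classifier.
    """
    return max(cents, key=lambda c: mass[c] * cosine(vec, cents[c]))


def learn_masses(X, y, cents, epochs=20, step=0.05):
    """Learn one mass factor per class from the training data.

    ASSUMPTION: a mistake-driven rule. On each training error, the true
    class's mass grows and the wrongly predicted class's mass shrinks.
    """
    mass = {c: 1.0 for c in cents}
    for _ in range(epochs):
        errors = 0
        for vec, label in zip(X, y):
            guess = predict(vec, cents, mass)
            if guess != label:
                mass[label] *= 1 + step
                mass[guess] *= 1 - step
                errors += 1
        if errors == 0:
            break  # every training sample is classified correctly
    return mass
```

Under these assumptions, a class whose samples are spread widely around its centroid tends to end up with a larger mass, which enlarges its decision region relative to a plain CBC.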
