Abstract

In recent years, text classification based on K-Nearest Neighbor (KNN) has been an active research topic, and many improved KNN text classification algorithms have been proposed. Many scholars have combined rough sets with text classification, but few have studied KNN text classification based on three-way decisions. Because the distribution of a class can be ambiguous, some articles in some categories are difficult to classify accurately. To address this problem of ambiguous label determination, this paper proposes a text classification algorithm based on Three-Way Decisions with KNN (TWDKNN). The minimum risk cost model of three-way decisions theory is used to set the thresholds, the three-way decisions are transformed into two-way decisions, and the membership function is redefined. This definition narrows the search range of the K nearest neighbors and resolves the fuzzy label judgment. Experimental results show that TWDKNN improves classification accuracy, recall, and F-value compared with the traditional KNN text classification algorithm.
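
The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' TWDKNN. It assumes the standard decision-theoretic rough set thresholds derived from a minimum-risk loss table, and it uses the fraction of the k nearest neighbors carrying a label as a stand-in membership value. The function names, the loss values, and the Euclidean distance are all illustrative assumptions.

```python
import math
from collections import Counter

def thresholds(l_pp, l_bp, l_np, l_pn, l_bn, l_nn):
    """Thresholds (alpha, beta) from a loss table, using the standard
    decision-theoretic rough set formulas (assumed, not taken from the paper).
    l_xp: loss of action x (P=accept, B=defer, N=reject) when the text
    belongs to the class; l_xn: loss when it does not."""
    alpha = (l_pn - l_bn) / ((l_pn - l_bn) + (l_bp - l_pp))
    beta = (l_bn - l_nn) / ((l_bn - l_nn) + (l_np - l_bp))
    return alpha, beta

def three_way(p, alpha, beta):
    """Map a membership value p to one of the three decisions."""
    if p >= alpha:
        return "accept"
    if p <= beta:
        return "reject"
    return "defer"

def knn_membership(query, train, k):
    """Hypothetical membership: share of the k nearest neighbors per label.
    train is a list of (feature_vector, label) pairs."""
    nearest = sorted(train, key=lambda item: math.dist(query, item[0]))[:k]
    counts = Counter(label for _, label in nearest)
    return {label: n / len(nearest) for label, n in counts.items()}

# Illustrative loss values only
alpha, beta = thresholds(l_pp=0, l_bp=1, l_np=4, l_pn=4, l_bn=1, l_nn=0)
train = [((0.0, 0.0), "sports"), ((0.0, 1.0), "sports"), ((5.0, 5.0), "finance")]
membership = knn_membership((0.0, 0.2), train, k=2)
decisions = {label: three_way(p, alpha, beta) for label, p in membership.items()}
```

In this sketch a text whose membership falls between the two thresholds is deferred; how the paper collapses the deferred region into a two-way decision is not stated in the abstract and is left out here.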
