Abstract

In recent years, text classification based on K-Nearest Neighbor (KNN) has been an active research topic, and many improved KNN text classifiers have been proposed. Most scholars have combined rough sets with text classification, but few have studied KNN text classification based on three-way decisions. Because the distribution of a class can be ambiguous, some articles in some categories are difficult to categorize accurately. To solve this problem of ambiguous label determination, this paper proposes a text classification algorithm based on Three-Way Decisions with KNN (TWDKNN). The minimum risk cost model from three-way decisions theory is used to set the thresholds, the three-way decisions are transformed into two-way decisions, and the membership function is redefined. These definitions narrow the search range of the K nearest neighbors and resolve the fuzzy label judgment. Experimental results show that TWDKNN clearly improves classification accuracy, recall, and F value compared with the traditional KNN text classification algorithm.
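The threshold step can be illustrated with a minimal sketch. It assumes the standard minimum-risk formulation of three-way decisions, in which six loss values determine an acceptance threshold alpha and a rejection threshold beta. The function names, the loss values, and the use of a plain neighbor-vote fraction as the membership estimate are all assumptions made for illustration; the abstract does not give the paper's redefined membership function or its rule for reducing three-way decisions to two-way decisions.

```python
from collections import Counter


def thresholds(l_pp, l_bp, l_np, l_pn, l_bn, l_nn):
    """Return (alpha, beta) from six loss values (hypothetical names).

    l_pp, l_bp, l_np: cost of accepting, deferring, rejecting a document
                      that DOES belong to the class.
    l_pn, l_bn, l_nn: cost of accepting, deferring, rejecting a document
                      that does NOT belong to the class.
    """
    alpha = (l_pn - l_bn) / ((l_pn - l_bn) + (l_bp - l_pp))
    beta = (l_bn - l_nn) / ((l_bn - l_nn) + (l_np - l_bp))
    return alpha, beta


def vote_membership(neighbor_labels, label):
    """Fraction of the k neighbors carrying `label`.

    Assumption: a simple stand-in, not the membership function
    defined in the paper.
    """
    return Counter(neighbor_labels)[label] / len(neighbor_labels)


def three_way(p, alpha, beta):
    """Accept, reject, or defer a label given membership estimate p."""
    if p >= alpha:
        return "accept"
    if p <= beta:
        return "reject"
    return "defer"  # boundary region: the label is still ambiguous


# Illustrative loss values only; they are not taken from the paper.
alpha, beta = thresholds(l_pp=0, l_bp=2, l_np=6, l_pn=8, l_bn=2, l_nn=0)
neighbors = ["sports", "sports", "finance", "sports", "finance"]
p = vote_membership(neighbors, "sports")
decision = three_way(p, alpha, beta)
```

With these illustrative losses, a document whose neighbors split 3 to 2 falls between beta and alpha and is deferred. This boundary region is the ambiguous case that the abstract says TWDKNN resolves.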
