TF-IDF combined rank factor Naive Bayesian algorithm for intelligent language classification recommendation systems

Yonglian Luo, Cailin Lu

doi:10.1016/j.sasc.2024.200136

Abstract

With the continuous improvement of smart language systems, a large amount of language text data has emerged, and processing this data efficiently and accurately has become an important challenge. Therefore, a language classification recommendation system based on an improved Naive Bayesian algorithm is proposed. The system first adopts the traditional Bayesian algorithm to classify language texts. Term Frequency-Inverse Document Frequency (TF-IDF) is then combined with a rank factor to increase the weight of feature terms. Next, the classified language texts are passed to the improved algorithm for language classification recommendation. Finally, performance testing and simulation applications are conducted on the system. In the Gutenberg corpus, the accuracy and completeness of the proposed algorithm peaked at 98.5 % and 91.6 %, respectively, with minimum values of 92.6 % and 89.4 %. The average values were 95.5 % and 91.1 %, with an F1 value of about 92.6 %. In the Brown corpus, the average accuracy, completeness, and F1 value of the proposed algorithm were 96.2 %, 91.2 %, and 93.2 %, respectively. With 1000 online users, the Chinese system had a response time of 1.15 s, a classification recommendation accuracy of 95 %, and an average system stability of about 83 %. The English system had a response time of 0.64 s, a classification recommendation accuracy of 96 %, and an average system stability of about 90 %. These results show that the proposed method can significantly improve the operating accuracy of the classification recommendation system.
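The general idea of weighting Naive Bayes features by TF-IDF can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the rank factor, so `rank_factor` is a hypothetical pluggable function that defaults to 1.0 (no effect), and the smoothed IDF formula, the Laplace smoothing, and the names `train` and `classify` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train(docs, labels, rank_factor=lambda term, label: 1.0):
    """Fit a multinomial Naive Bayes model on TF-IDF-weighted term counts.

    docs is a list of token lists; labels is a parallel list of class names.
    rank_factor is a hypothetical hook: the abstract does not define it, so
    the default returns 1.0 and leaves the TF-IDF weights unchanged.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumed, commonly used variant).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    class_docs = Counter(labels)
    weights = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        for term, tf in Counter(doc).items():
            # Each term contributes TF * IDF * rank factor instead of a raw count.
            weights[label][term] += tf * idf[term] * rank_factor(term, label)

    vocab = set(idf)
    model = {}
    for label, count in class_docs.items():
        total = sum(weights[label].values())
        log_prior = math.log(count / n_docs)
        # Laplace (add-one) smoothing over the weighted counts.
        log_like = {
            t: math.log((weights[label][t] + 1.0) / (total + len(vocab)))
            for t in vocab
        }
        model[label] = (log_prior, log_like)
    return model


def classify(model, doc):
    """Return the class with the highest log-posterior; unseen terms are skipped."""
    scores = {
        label: log_prior + sum(log_like[t] for t in doc if t in log_like)
        for label, (log_prior, log_like) in model.items()
    }
    return max(scores, key=scores.get)
```

A rank factor greater than 1.0 for selected terms would raise their weighted counts within a class and therefore their smoothed likelihoods, which is one plausible reading of "increasing the weight of feature terms".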

Full Text