Abstract
The rapid growth of online textual data has made text classification increasingly important in information management. Although the chi-square test is widely used in text classification, few recent studies have examined its specific applications in depth. This report therefore summarizes research from the last five years on the use of the chi-square test in text classification. It reviews the application of the chi-square statistic to Arabic text classification, social media data analysis, and medical literature classification, and analyzes its effectiveness in feature selection and in improving classification performance. Drawing on the academic literature, the report also summarizes how improved chi-square feature selection methods have been applied to different types of text data and examines how effectively they improve classification accuracy. The findings indicate that chi-square offers significant advantages for text classification across domains, especially for linguistically complex texts and user-generated content.