Abstract
With the rapid growth of information retrieval technology, Chinese text classification, a foundation of information content security, has become a widely discussed topic. Compared with English, Chinese text poses greater complexity in representing semantic information. However, most existing Chinese text classification approaches treat feature representation and feature selection as the key points, but fail to account for a learning strategy adapted to the task. Moreover, these approaches compress each Chinese word into a representation vector without considering the distribution of the term across the categories of interest. To improve Chinese text classification, this paper proposes a unified method called Supervised Contrastive Learning with Term Weighting (SCL-TW). Supervised contrastive learning makes full use of a large amount of unlabeled data to improve model stability. In SCL-TW, a term-weighting score is computed to optimize the data-augmentation process for Chinese text. The transformed features are then fed into a temporal convolutional network for feature representation. Experiments on two Chinese benchmark datasets demonstrate that SCL-TW outperforms other advanced Chinese text classification approaches by a substantial margin.
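The supervised contrastive objective underlying this family of methods pulls same-class representations together and pushes different-class ones apart. A minimal numerical sketch, assuming the standard supervised contrastive loss formulation (Khosla et al., 2020) with a temperature parameter; the function name and toy data below are illustrative assumptions, not drawn from the paper:

```python
import numpy as np

def sup_con_loss(embeddings, labels, temperature=0.1):
    """Sketch of a supervised contrastive loss.

    embeddings: (N, D) array, assumed L2-normalized.
    labels:     (N,) integer class labels.
    For each anchor i, positives are all other samples with the
    same label; the loss is the mean negative log-probability of
    the positives under a softmax over all other samples.
    """
    n = len(labels)
    sim = embeddings @ embeddings.T / temperature   # pairwise similarities
    not_self = ~np.eye(n, dtype=bool)               # exclude self-pairs
    # log-softmax over all other samples (numerically stabilized)
    sim_max = sim.max(axis=1, keepdims=True)
    exp_sim = np.exp(sim - sim_max) * not_self
    log_prob = sim - sim_max - np.log(exp_sim.sum(axis=1, keepdims=True))
    loss = 0.0
    for i in range(n):
        positives = (labels == labels[i]) & not_self[i]
        if positives.any():
            loss += -log_prob[i, positives].mean()
    return loss / n

# Toy usage: two classes with nearly aligned within-class embeddings.
z = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
z = z / np.linalg.norm(z, axis=1, keepdims=True)
y = np.array([0, 0, 1, 1])
print(sup_con_loss(z, y))
```

The loss is strictly positive because the softmax denominator always contains more terms than the single positive in the numerator; minimizing it increases within-class similarity relative to between-class similarity.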