Abstract
Pre-trained language models combined with contrastive learning have been shown to be effective for text classification. Despite this success, contrastive learning still has limitations. First, external linguistic knowledge has been shown to improve the performance of pre-trained language models, but how to incorporate it into contrastive learning remains unclear. Second, standard contrastive learning generates training samples with a fixed data augmentation throughout training, even though different augmentation methods suit different downstream tasks; a fixed augmentation can therefore lead to suboptimal results. In this paper, we propose a contrastive learning method based on linguistic knowledge and adaptive augmentation, which yields high-quality sentence representations that improve text classification. Specifically, we construct word-level positive and negative sample pairs using WordNet and propose a novel word-level contrastive loss to inject linguistic knowledge. We then dynamically select the augmentation policy based on alignment and uniformity. This adaptive augmentation policy produces more generalized sentence representations with little computational overhead. Experiments on multiple public datasets demonstrate that our method outperforms state-of-the-art methods.
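The abstract does not specify the exact selection rule, but a minimal sketch of how an augmentation policy could be scored with the standard alignment and uniformity metrics is shown below. Here `encoder`, `policies`, and the sum-based selection criterion are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def alignment(x, y, alpha=2):
    # Mean distance between embeddings of positive pairs (lower is better).
    # x, y: L2-normalized embeddings of shape (N, D).
    return (x - y).norm(p=2, dim=1).pow(alpha).mean()

def uniformity(x, t=2):
    # Log of the average pairwise Gaussian potential over the batch
    # (lower means embeddings are spread more uniformly on the hypersphere).
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()

def select_policy(encoder, batch, policies):
    """Hypothetical helper: pick the augmentation policy whose two views of
    the batch give the best alignment + uniformity score."""
    best_policy, best_score = None, float("inf")
    for policy in policies:
        view1 = F.normalize(encoder(policy(batch)), dim=1)
        view2 = F.normalize(encoder(policy(batch)), dim=1)
        score = alignment(view1, view2) + uniformity(view1)
        if score < best_score:
            best_policy, best_score = policy, score
    return best_policy
```

In this sketch, each candidate policy is applied twice to produce two views, and the policy minimizing the combined alignment and uniformity score on the current batch is kept, which keeps the overhead to a few extra forward passes per selection step.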