Abstract

The orthogonal projection method has made significant progress in text classification, especially in generating discriminative features. It obtains purer, more classification-suitable features by projecting text features onto the direction orthogonal to common features, i.e. features that do not help classification and in fact degrade performance. However, this approach requires an additional branch network to generate the common features, which makes it less flexible than representation-optimisation methods such as the self-attention mechanism, because the base network structure must be substantially modified. To address this issue, this paper proposes the Inversed Attention Orthogonal Projection Module (IAOPM). IAOPM uses inversed attention (IA) to iteratively invert the attention map over text features, encouraging the network to remove discriminative features and thereby expose the latent common features. Unlike the original orthogonal projection method, IAOPM extracts common features within a single network, without any branch network, which increases the flexibility of the orthogonal projection method. An orthogonal loss is also applied during training to ensure the quality of the common features, so IAOPM achieves better feature purity than the original method. Experiments show that text classification models based on IAOPM outperform the baseline models, the self-attention mechanism, and the original orthogonal projection method on multiple text classification datasets, with average accuracy gains of 1.02%, 0.44%, and 0.52%, respectively.
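To make the described mechanism concrete, the following is a minimal PyTorch-style sketch of how inverted attention and orthogonal projection could be combined in a single network. It is an illustration based only on this abstract, not the authors' implementation; the module name `IAOPMSketch`, the linear scoring layer, the number of inversion iterations, and the mean pooling are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IAOPMSketch(nn.Module):
    """Hypothetical sketch: inverted attention extracts common features,
    then text features are projected orthogonally to them."""
    def __init__(self, dim, num_iters=2):
        super().__init__()
        self.attn_score = nn.Linear(dim, 1)  # assumed token scoring layer
        self.num_iters = num_iters

    def forward(self, h):
        # h: (batch, seq_len, dim) token features from any base encoder
        common = h
        for _ in range(self.num_iters):
            # attention over tokens, then invert it so highly attended
            # (discriminative) tokens are down-weighted
            a = torch.softmax(self.attn_score(common), dim=1)   # (B, L, 1)
            inv_a = torch.softmax(1.0 - a, dim=1)                # inverted attention map
            common = inv_a * common                              # emphasise common features

        c = common.mean(dim=1)   # pooled common feature  (B, dim)
        x = h.mean(dim=1)        # pooled text feature    (B, dim)

        # project x onto the direction orthogonal to c:
        # x_orth = x - (x . c / ||c||^2) * c
        coeff = (x * c).sum(dim=-1, keepdim=True) / (c.pow(2).sum(dim=-1, keepdim=True) + 1e-8)
        x_orth = x - coeff * c

        # orthogonal loss pushing the projected feature away from the common feature
        ortho_loss = F.cosine_similarity(x_orth, c, dim=-1).abs().mean()
        return x_orth, ortho_loss
```

Under these assumptions, `x_orth` would be fed to the classifier head and `ortho_loss` added to the classification loss; because the sketch touches only the pooled encoder output, it can be attached to an existing model without a branch network, which is the flexibility argument the abstract makes.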
