Abstract

Text classification is an important task in natural language processing. Multilayer attention networks have achieved excellent performance on text classification tasks, but they face challenges such as high time and space complexity and the low-rank bottleneck problem. This paper incorporates spatial attention into a neural network architecture that uses fewer encoder layers. The proposed model aims to enhance the spatial information of semantic features while reducing the high time and space demands of traditional multilayer attention networks. Spatial attention selectively weights the spatial locations in the input feature maps, enabling the model to focus on the most informative regions while ignoring less important ones. By incorporating spatial attention into a shallower encoder network, the proposed model achieves improved performance on spatially oriented tasks while reducing the computational overhead associated with deeper attention-based models. To alleviate the low-rank bottleneck problem of multihead attention, this paper proposes a variable multihead attention mechanism, which varies the number of attention heads from layer to layer across the encoder, achieving a balance between expressive power and computational efficiency. We use two Chinese text classification datasets and an English sentiment classification dataset to verify the effectiveness of the proposed model.
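
The following is a minimal PyTorch sketch of the two ideas described above: a 1D spatial attention module that scores sequence positions, and a shallow encoder whose head count changes layer by layer. The module names, CBAM-style pooling, head schedule, and tensor shapes are illustrative assumptions, not the authors' exact implementation.

```python
# Illustrative sketch only; shapes, names, and the head schedule are assumptions.
import torch
import torch.nn as nn


class SpatialAttention1D(nn.Module):
    """Weights sequence positions: pool across channels, score each position."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv1d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, seq_len)
        avg = x.mean(dim=1, keepdim=True)             # (batch, 1, seq_len)
        mx, _ = x.max(dim=1, keepdim=True)            # (batch, 1, seq_len)
        score = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * score                              # emphasize informative positions


class VariableHeadEncoder(nn.Module):
    """Stacks encoder layers whose head count varies layer by layer.
    Fewer heads imply a larger per-head dimension (d_model / h), which raises
    the rank of each head's attention map and eases the low-rank bottleneck."""
    def __init__(self, d_model: int = 256, head_schedule=(8, 4, 2)):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=h, batch_first=True)
            for h in head_schedule
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)
        return x


# Usage: spatial attention on embedded text, then the shallow variable-head encoder.
emb = torch.randn(4, 256, 128)                        # (batch, d_model, seq_len)
feat = SpatialAttention1D()(emb).transpose(1, 2)      # -> (batch, seq_len, d_model)
out = VariableHeadEncoder()(feat)                     # (batch, seq_len, d_model)
```

With the assumed schedule (8, 4, 2) and d_model = 256, the per-head dimension grows from 32 to 64 to 128 across the three layers, trading head count for per-head rank while keeping the overall parameter budget comparable.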
