Abstract

In text classification tasks, encoding context with only character-level or only word-level information in a single parameter space fails to extract sufficient textual features, and the resulting representations of semantic information are comparatively weak. To address this problem, this paper adopts an embedding method that fuses character- and word-level information and builds a text classification model that combines a hybrid self-attention mechanism with an RCNN. The model first segments the text into character sequences and word sequences and fuses them, then uses a Transformer encoder to extract preliminary features; the self-attention mechanism in the Transformer attends effectively to both semantic and positional information. An RCNN then extracts deep features from the fused character- and word-level representation, the resulting feature vector is scored by a softmax function, and the classification of Chinese short texts is produced. Experimental results show that the proposed method improves text classification accuracy, to varying degrees, over both single-embedding methods and single neural networks.
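To make the described pipeline concrete, the following is a minimal PyTorch sketch of the architecture as the abstract outlines it, not the authors' implementation. All dimensions, vocabulary sizes, the positional-embedding choice, and fusion by element-wise summation are assumptions; the class name `CharWordRCNN` and its parameters are hypothetical.

```python
# Sketch of the abstract's pipeline: fused char/word embeddings ->
# Transformer encoder -> RCNN -> softmax. Hyperparameters are illustrative.
import torch
import torch.nn as nn

class CharWordRCNN(nn.Module):
    def __init__(self, char_vocab=5000, word_vocab=50000, emb_dim=128,
                 hidden=128, num_classes=10, n_heads=4, n_layers=2,
                 max_len=64):
        super().__init__()
        self.char_emb = nn.Embedding(char_vocab, emb_dim, padding_idx=0)
        self.word_emb = nn.Embedding(word_vocab, emb_dim, padding_idx=0)
        self.pos_emb = nn.Embedding(max_len, emb_dim)
        # Transformer encoder for preliminary feature extraction
        layer = nn.TransformerEncoderLayer(d_model=emb_dim, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # RCNN stage: BiLSTM context, concatenated with its input,
        # followed by max-over-time pooling (Lai et al.-style RCNN)
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.proj = nn.Linear(emb_dim + 2 * hidden, hidden)
        self.out = nn.Linear(hidden, num_classes)

    def forward(self, char_ids, word_ids):
        # Fuse character- and word-level embeddings; the paper does not
        # specify the fusion operator, so element-wise sum is assumed.
        positions = torch.arange(char_ids.size(1), device=char_ids.device)
        x = (self.char_emb(char_ids) + self.word_emb(word_ids)
             + self.pos_emb(positions))
        x = self.encoder(x)                    # (B, T, emb_dim)
        ctx, _ = self.bilstm(x)                # (B, T, 2*hidden)
        h = torch.tanh(self.proj(torch.cat([x, ctx], dim=-1)))
        h = h.max(dim=1).values                # max-over-time pooling
        return torch.log_softmax(self.out(h), dim=-1)

model = CharWordRCNN()
chars = torch.randint(1, 5000, (2, 64))   # toy batch: 2 texts, 64 positions
words = torch.randint(1, 50000, (2, 64))  # word ids assumed aligned to chars
print(model(chars, words).shape)           # torch.Size([2, 10])
```

Aligning one word id to each character position is itself an assumption; in practice the word embedding for a segmented word would be repeated across the characters it spans, or the two sequences would be fused by concatenation along the feature dimension instead.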
