Large-scale text classification with deeper and wider convolution neural network

Wei Huang, Min Huang

doi:10.1504/ijspm.2020.10028731

Abstract

The dominant approaches to most natural language processing (NLP) tasks, such as text classification, are recurrent neural networks (RNNs) and convolutional neural networks (CNNs). These architectures are usually shallow, with only one or two layers, and therefore cannot easily extract the inner patterns of natural language. Unlike image pixels, which are raw features with regular structure, words and phrases are highly abstract products of human knowledge with no direct correlation between them. Shallow models capture only the surface relations between words and phrases, while deep models cannot be applied to them directly. Therefore, a shuffle convolution neural network (SCNN) is proposed to address the shallow learning problem by introducing wider inception cells and deeper residual connections. In this paper, the difficulty of applying deep models to NLP problems is overcome by shuffling the channel input and reshaping the output dimension in the first layer. The experimental results demonstrate that the proposed SCNN substantially improves accuracy and efficiency compared with shallow models.

Full Text