Multi-label text classification based on semantic-sensitive graph convolutional network

Delong Zeng, Enze Zha, Jiayi Kuang, Ying Shen

doi:10.1016/j.knosys.2023.111303

Abstract

Multi-Label Text Classification (MLTC) is an important but challenging task in natural language processing. In this paper, we propose a novel method, the Semantic-sensitive Graph Convolutional Network (S-GCN), which simultaneously considers semantic and word-global associations. More specifically, we first use texts, words, and labels to construct a global graph, which helps mine the relevance between similar documents. We then design and pre-train an encoder that extracts the semantic features of documents, and use it to initialize the text nodes in the graph. Next, we employ a graph convolutional network to classify the text nodes, which effectively fuses node information. Finally, we normalize the adjacency matrix and store the hidden-layer representations of word nodes, which addresses a limitation of conventional graph-based methods: they cannot make predictions for texts that did not appear during training. We conduct experiments on three public datasets, AAPD, RMSC-V2, and Reuters-21578, and demonstrate the superiority of our model over the baselines on the MLTC task. Source code is available at https://github.com/sysu18364004/SGCN.
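
To make the graph-convolution step concrete, the following is a minimal sketch of adjacency normalization followed by a single propagation layer. It is not taken from the S-GCN source code. It assumes the standard symmetric normalization D^{-1/2}(A + I)D^{-1/2}, which may differ from the normalization the paper actually uses, and the toy graph layout, feature values, and function names (normalize_adjacency, gcn_layer) are hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Assumed scheme: symmetric normalization D^{-1/2} (A + I) D^{-1/2}."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own representation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Degrees are at least 1 because of the self-loops.
    inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt[i] * a_hat[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(a_norm, features, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    h = matmul(matmul(a_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in h]


# Hypothetical toy graph: node 0 is a text, nodes 1-2 are words, node 3 is a label.
adj = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, 1.0]]  # illustrative values
weight = [[1.0, -1.0], [0.5, 0.5]]                           # illustrative values

a_norm = normalize_adjacency(adj)
hidden = gcn_layer(a_norm, features, weight)
```

In this sketch, each row of hidden is the updated representation of one node after it has aggregated information from its neighbors. In a setup like the one the abstract describes, the rows belonging to word nodes are the kind of hidden-layer representation that could be stored and reused when a new text arrives.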

Full Text