Abstract

With the rapid growth of the internet, the volume of short-text data has increased dramatically. Because short texts are brief, have sparse features, and carry limited contextual information, short-text classification remains a challenging task in natural language processing. Current methods primarily capture semantic information from locally sequenced words in a short text and ignore the intricate feature relationships that exist both within and across texts. To address this issue, this paper proposes a novel Edge-Enhanced Minimum-Margin Graph Attention Network (EMGAN) for short-text classification. Specifically, we construct a Heterogeneous Information Graph (HIG) to represent the complex relationships among short-text features. The HIG mainly models the relationships between document features and three types of attribute features (entities, topics, and keywords), and can therefore represent short-text features along multiple dimensions and at multiple levels. Then, to enhance the connectivity and expressiveness of the HIG so that feature information propagates through it more effectively, we present a novel X-shaped structure edge-enhancement method, which enriches the relationships among features by reconstructing the edge structures. Furthermore, we design a Minimum-Margin Graph Attention Network (MMGAN) for short-text classification. This method aims to explore the minimum margin between high-order neighbors and central nodes at minimum cost, efficiently extracting and aggregating feature information. Extensive experiments show that the proposed EMGAN model outperforms existing methods on five datasets, validating its effectiveness for short-text classification. Our code is available at https://github.com/w123yy/EMGAN.
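
As a rough illustration of the graph described above, the following minimal sketch builds a toy heterogeneous graph that links document nodes to entity, topic, and keyword nodes. It is not the authors' implementation: the input format, the `build_hig` helper, and the example data are all hypothetical, and the sketch covers neither the X-shaped edge enhancement nor MMGAN, whose details the abstract does not specify.

```python
from collections import defaultdict

# Hypothetical toy input: short texts with attributes assumed to be
# pre-extracted by some upstream entity/topic/keyword extractor.
docs = {
    "d1": {"entity": ["Apple"], "topic": ["technology"], "keyword": ["iphone", "launch"]},
    "d2": {"entity": ["Apple"], "topic": ["finance"], "keyword": ["stock", "launch"]},
}

def build_hig(docs):
    """Return typed node sets and an undirected adjacency map (illustrative only)."""
    nodes = defaultdict(set)  # node type -> set of (type, id) nodes
    adj = defaultdict(set)    # node -> set of neighbouring nodes
    for doc_id, attrs in docs.items():
        doc_node = ("document", doc_id)
        nodes["document"].add(doc_node)
        for attr_type, values in attrs.items():
            for value in values:
                attr_node = (attr_type, value)
                nodes[attr_type].add(attr_node)
                # Connect the document to each of its attribute nodes.
                adj[doc_node].add(attr_node)
                adj[attr_node].add(doc_node)
    return nodes, adj

nodes, adj = build_hig(docs)

# Attribute nodes shared by two documents give an inter-text path
# (document -> attribute -> document) through the graph.
shared = adj[("document", "d1")] & adj[("document", "d2")]
```

In this toy example the two documents are connected through the attribute nodes they share, which is the kind of inter-text relationship that a purely sequential model of each text would not see.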
