Managing complex AI systems requires insight into a model's decision-making processes, and understanding how these systems arrive at their conclusions is essential for ensuring reliability. In the field of explainable natural language processing, many approaches have been developed and evaluated. However, experimental analyses of explainability for text classification have largely been constrained to short texts and binary classification. In this applied work, we study explainability for a real-world task in which the goal is to assess the technological suitability of standards. This prototypical use case is characterized by large documents, technical language, and a multi-label setting, making it a complex modeling challenge. We provide an analysis of approximately 1,000 documents with human-annotated evidence. We then present experimental results for two explanation methods, evaluating the plausibility and runtime of the explanations. We find that the average runtime for generating an explanation is at least five minutes and that the model explanations do not overlap with the ground truth. These findings reveal limitations of current explanation methods. In a detailed discussion, we identify possible reasons and ways to address them along three dimensions: task, model, and explanation method. We conclude with risks and recommendations for the use of feature attribution methods in similar settings.