Semi-supervised Text Classification Research Articles

The goal of semi-supervised text classification (SSTC) is to train a model on both a small amount of labeled data and a large amount of unlabeled data, such that the learned semi-supervised classifier performs better than a supervised classifier trained solely on the labeled samples. Pseudo-labeling is one of the most widely used SSTC techniques: a teacher classifier is trained on the small set of labeled examples and then predicts pseudo labels for the unlabeled data. The resulting pseudo-labeled examples are used to train a student classifier, with the aim that the student outperforms the teacher. However, the predicted pseudo labels may be inaccurate, which degrades the student classifier's performance; the student may even perform worse than the teacher. To alleviate this issue, in this paper we introduce a dual meta-learning (DML) technique for semi-supervised text classification, which improves the teacher and student classifiers simultaneously in an iterative manner. Specifically, we propose a meta-noise correction method that improves the student classifier by learning a Noise Transition Matrix (NTM) with meta-learning to rectify the noisy pseudo labels. In addition, we devise a meta pseudo supervision method to improve the teacher classifier: we use the student classifier's performance as feedback to guide the teacher classifier toward producing more accurate pseudo labels for the unlabeled data. In this way, the teacher and student classifiers co-evolve during the iterative training process. Extensive experiments on four benchmark datasets highlight the effectiveness of our DML method against existing state-of-the-art methods for semi-supervised text classification. The code and data for this paper are publicly available at https://github.com/GRIT621/DML.
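To make the role of a noise transition matrix more concrete, the following is a minimal sketch of the standard forward loss correction idea, not the DML implementation. The abstract does not describe how its matrix is parameterized or learned, so the fixed matrix `T`, its values, and the function names `noisy_label_probs` and `corrected_loss` are all assumptions made for illustration. In the paper the matrix is learned with meta-learning; here it is simply given.

```python
import math

# Hypothetical 2-class noise transition matrix (assumed values, not from the paper):
# T[i][j] = P(pseudo label = j | true label = i). Each row sums to 1.
T = [
    [0.9, 0.1],
    [0.3, 0.7],
]


def noisy_label_probs(clean_probs, transition):
    """Map a distribution over clean labels to a distribution over noisy pseudo labels."""
    num_classes = len(transition)
    return [
        sum(clean_probs[i] * transition[i][j] for i in range(num_classes))
        for j in range(num_classes)
    ]


def corrected_loss(clean_probs, transition, pseudo_label):
    """Negative log-likelihood of the pseudo label under the noise-adjusted distribution."""
    noisy = noisy_label_probs(clean_probs, transition)
    return -math.log(max(noisy[pseudo_label], 1e-12))


# The student believes the example is class 1, but the teacher's pseudo label says class 0.
student_probs = [0.2, 0.8]
pseudo_label = 0

plain_loss = -math.log(student_probs[pseudo_label])
adjusted_loss = corrected_loss(student_probs, T, pseudo_label)

print("noise-adjusted distribution:", noisy_label_probs(student_probs, T))
print("plain loss:", plain_loss, "adjusted loss:", adjusted_loss)
```

In this toy case the adjusted loss is smaller than the plain cross-entropy, because the matrix says that true class 1 examples are often pseudo-labeled as class 0. The student is therefore penalized less for disagreeing with a pseudo label that is likely to be wrong.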

In recent years, satellite Earth observation (EO) data have been used increasingly in dengue research, in particular to identify landscape factors affecting dengue transmission. Summarizing these landscape factors and satellite EO data sources, and making the information public, helps guide future research and improve health decision-making. A literature review is an appropriate tool for this purpose, but it is not easy to carry out: the review process includes defining the topic, searching, screening at both the title/abstract and full-text levels, and data extraction, all of which require consistent expert knowledge and are time-consuming and labor-intensive. In this context, this study integrates the review process, text scoring, an active learning (AL) mechanism, and bidirectional long short-term memory (BiLSTM) networks, and proposes a semi-supervised text classification framework that enables efficient and accurate selection of relevant articles. Specifically, text scoring replaces title/abstract screening and BiLSTM-based active learning replaces full-text screening, which greatly reduces the human workload. In this study, 101 relevant articles were selected from 4 bibliographic databases, and a catalogue of essential dengue landscape factors was identified and divided into four categories: land use (LU), land cover (LC), topography, and continuous land surface features. Moreover, the various satellite EO sensors and products used for identifying landscape factors were tabulated. Finally, possible future directions for applying satellite EO data in dengue research were proposed in terms of landscape patterns, satellite sensors, and deep learning. The proposed semi-supervised text classification framework was successfully applied to research evidence synthesis and could easily be applied to other topics, particularly in an interdisciplinary context.
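The two screening steps described above can be sketched roughly as follows. This is an illustrative sketch only: the abstract does not specify the scoring formula or the active learning query strategy, so the weighted keyword count in `keyword_score`, the uncertainty-sampling rule in `most_uncertain`, the example terms, and the example probabilities are all assumptions. The relevance probabilities would come from a trained classifier (a BiLSTM in the study); here they are hard-coded.

```python
def keyword_score(text, weighted_terms):
    """Score a title/abstract by summing weighted occurrences of each term (assumed scheme)."""
    lowered = text.lower()
    return sum(weight * lowered.count(term) for term, weight in weighted_terms.items())


def most_uncertain(probabilities, batch_size):
    """Return indices of the articles whose predicted relevance is closest to 0.5.

    This is plain uncertainty sampling for a binary classifier; these are the
    articles a human reviewer would be asked to label next.
    """
    ranked = sorted(range(len(probabilities)), key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:batch_size]


# Hypothetical term weights for the title/abstract screening step.
terms = {"dengue": 2.0, "land cover": 1.0}
abstract = "Dengue and land cover; dengue vectors"
print("text score:", keyword_score(abstract, terms))

# Hypothetical relevance probabilities from a classifier for four unlabeled articles.
predicted_relevance = [0.95, 0.52, 0.10, 0.45]
print("query next:", most_uncertain(predicted_relevance, batch_size=2))
```

In a full loop, the reviewer's labels for the queried articles would be added to the training set, the classifier retrained, and the selection repeated until a stopping criterion is met.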

Articles published on Semi-supervised Text Classification

Improving Semi-Supervised Text Classification with Dual Meta-Learning

Towards Robust Learning with Noisy and Pseudo Labels for Text Classification

CDGAN-BERT: Adversarial constraint and diversity discriminator for semi-supervised text classification

Rank-Aware Negative Training for Semi-Supervised Text Classification

Commonsense knowledge powered heterogeneous graph attention networks for semi-supervised short text classification

An embedding-based text classification approach for understanding micro-geographic housing dynamics

Multi-MCCR: Multiple models regularization for semi-supervised text classification with few labels

A multi-semantic passing framework for semi-supervised long text classification

Accelerating Semi-Supervised Text Classification by K-Way Projecting Networks

LMGAN: Linguistically Informed Semi-Supervised GAN with Multiple Generators

Ptr4BERT: Automatic Semisupervised Chinese Government Message Text Classification Method Based on Transformer-Based Pointer Generator Network

Self-training method based on GCN for semi-supervised short text classification

Contrast-Enhanced Semi-supervised Text Classification with Few Labels

Generative Adversarial learning with Negative Data Augmentation for Semi-supervised Text Classification

Graph Convolutional Network Based on Multi-Head Pooling for Short Text Classification

SALNet: Semi-supervised Few-Shot Text Classification with Attention-based Lexicon Construction

HGAT: Heterogeneous Graph Attention Networks for Semi-supervised Short Text Classification

Semi-Supervised Text Classification Framework: An Overview of Dengue Landscape Factors and Satellite Earth Observation

Adversarial Dropout for Recurrent Neural Networks

Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function
