Semi-supervised multi-label classification using an extended graph-based manifold regularization

Ding Li, Scott Dick

doi:10.1007/s40747-021-00611-7

Abstract

Graph-based algorithms are known to be effective approaches to semi-supervised learning. However, there has been relatively little work on extending these algorithms to the multi-label classification case. We derive an extension of the Manifold Regularization algorithm to multi-label classification, which is significantly simpler than the general Vector Manifold Regularization approach. We then augment our algorithm with a weighting strategy that allows instances with ground-truth labels and instances with induced labels to influence the model differently. Experiments on four benchmark multi-label data sets show that the resulting algorithm performs better overall than existing semi-supervised multi-label classification algorithms at various levels of label sparsity. Comparisons with state-of-the-art supervised multi-label approaches (which are trained on fully labeled data) also show that our algorithm outperforms all of them, even when a substantial number of its training examples are unlabeled.

Highlights

In many real-world applications, such as bioinformatics and video annotation, obtaining labeled data is often difficult, expensive and time-consuming.
This paper studies the semi-supervised multi-label classification problem and extends graph-based manifold regularization to the multi-label case.
Extensive experiments are conducted on four public data sets from different categories to evaluate the performance of the proposed Multi-Label Manifold Regularization (ML-MR), both with and without the Reliance Weighting (RW) strategy.

Summary

Introduction

In many real-world applications, such as bioinformatics and video annotation, obtaining labeled data is often difficult, expensive and time-consuming. We investigate a multi-label extension of the Manifold Regularization (MR) algorithm, augmented with a reliance weighting strategy to further improve classification performance. Seven algorithms are compared: (1) Multi-Label k Nearest Neighbors (MLkNN) [57], (2) Multi-Label Gaussian Fields and Harmonic Functions (ML-GFHF) [56], (3) Multi-Label Local and Global Consistency (ML-LGC) [56], (4) Fixed-Size Multi-Label Regularized Kernel Spectral Clustering (MLFSKSC) [33], (5) the Semi-Supervised Weak-Label approach (SSWL) [18], (6) Multi-Label Manifold Regularization (ML-MR), and (7) ML-MR with the Reliance Weighting strategy (ML-MRRW), described in the section “Reliance weighted kernel for performance improvement”.
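To make the idea of graph-based manifold regularization with several labels concrete, the following is a minimal sketch only. It applies the standard Laplacian Regularized Least Squares closed form to a label matrix, so that all labels share one kernel and one neighbourhood graph. It is not the authors' ML-MR derivation and it omits the reliance weighting strategy. The function names, the RBF kernel, the k-nearest-neighbour graph and the hyperparameter values are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian kernel matrix between the rows of X and the rows of Z."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Z**2, axis=1)[None, :]
          - 2.0 * X @ Z.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))


def knn_graph_laplacian(X, k=5, gamma=1.0):
    """Unnormalized Laplacian L = D - W of a symmetrized k-NN graph."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(-K[i])               # most similar first
        nbrs = [j for j in order if j != i][:k]  # drop the point itself
        W[i, nbrs] = K[i, nbrs]
    W = np.maximum(W, W.T)                       # make the graph undirected
    return np.diag(W.sum(axis=1)) - W


def fit_multilabel_laprls(X, Y_labeled, n_labeled,
                          gamma_A=1e-2, gamma_I=1e-2, k=5, gamma=1.0):
    """Fit one expansion-coefficient column per label.

    Assumes the first n_labeled rows of X are the labeled instances and
    that Y_labeled holds +1 / -1 entries, one column per label.
    """
    n = X.shape[0]
    q = Y_labeled.shape[1]
    K = rbf_kernel(X, X, gamma)
    L = knn_graph_laplacian(X, k, gamma)

    # J selects the labeled rows; unlabeled rows of Y are zero.
    J = np.zeros((n, n))
    J[np.arange(n_labeled), np.arange(n_labeled)] = 1.0
    Y = np.zeros((n, q))
    Y[:n_labeled] = Y_labeled

    # Standard LapRLS system, solved for all label columns at once.
    A = (J @ K
         + gamma_A * n_labeled * np.eye(n)
         + (gamma_I * n_labeled / n**2) * (L @ K))
    return np.linalg.solve(A, Y)


def predict(alpha, X_train, X_new, gamma=1.0):
    """Return 0/1 label predictions and the real-valued scores."""
    scores = rbf_kernel(X_new, X_train, gamma) @ alpha
    return (scores > 0).astype(int), scores
```

Because the kernel and graph Laplacian do not depend on the label, the linear system is factorized once and reused for every label column, which is the main practical appeal of treating the labels jointly in this simplified setting.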

Results

Conclusion