Abstract

One of the major factors affecting the performance of classification algorithms is the amount of labeled data available during the training phase. It is widely accepted that labeling vast amounts of data is both expensive and time-consuming, since it requires human expertise. In a wide variety of scientific fields, unlabeled examples are easy to collect but hard to exploit in a way that enriches the information contained in a dataset. In this context, a variety of learning methods have been studied in the literature that aim to efficiently utilize vast amounts of unlabeled data during the learning process. The most common approaches tackle problems of this kind by applying active learning or semi-supervised learning methods individually. In this work, a combination of active learning and semi-supervised learning methods is proposed under a common self-training scheme, in order to efficiently utilize the available unlabeled data. Two effective and robust metrics, the entropy and the distribution of the predicted probabilities over the unlabeled set, are used to select the most suitable unlabeled examples for augmenting the initial labeled set. The superiority of the proposed scheme is validated by comparing it against the baseline approaches of supervised, semi-supervised, and active learning on a wide range of fifty-five benchmark datasets.
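The paper itself gives no code, but the entropy metric mentioned above can be illustrated with a minimal sketch. It assumes a scikit-learn-style classifier exposing predict_proba; the model choice, the toy data, and the selection sizes are illustrative assumptions, not details taken from the paper.

```python
# A minimal sketch of entropy-based example selection, assuming a
# scikit-learn-style classifier with predict_proba; the model, data,
# and selection sizes are illustrative, not taken from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression

def prediction_entropy(probs: np.ndarray) -> np.ndarray:
    """Shannon entropy of each row of class-probability estimates."""
    eps = 1e-12  # guard against log(0)
    return -np.sum(probs * np.log(probs + eps), axis=1)

# Toy data: a small labeled pool and a larger unlabeled pool.
rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(20, 5))
y_labeled = rng.integers(0, 2, size=20)
X_unlabeled = rng.normal(size=(200, 5))

clf = LogisticRegression().fit(X_labeled, y_labeled)
scores = prediction_entropy(clf.predict_proba(X_unlabeled))

# Low entropy -> confident predictions (candidates for pseudo-labeling);
# high entropy -> uncertain predictions (candidates for oracle queries).
confident_idx = np.argsort(scores)[:10]
uncertain_idx = np.argsort(scores)[-10:]
```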

Highlights

  • The most common approach established in machine learning (ML) is supervised learning (SL). Under SL schemes, classifiers are trained using purely labeled data

  • Fifty-five (55) benchmark datasets were extracted from the UCI repository [14], covering a wide range of classification problems

  • The k parameter was set equal to ten, as is commonly done in the majority of the literature

Introduction

Under supervised learning (SL) schemes, classifiers are trained using purely labeled data. Along with the problem complexity, the performance of such schemes is directly related to the amount and the quality of the labeled data used during the training phase. Many research works [7] focus on techniques that exploit the available unlabeled data, especially in favor of classification problems. The most common learning methods incorporating such techniques are active learning (AL) and semi-supervised learning (SSL) [8]. Both AL and SSL share an iterative learning nature, making them a natural fit for constructing more complex combined learning schemes.
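As a hedged illustration of how such a combination could operate within a single self-training loop (a sketch of the general idea, not the paper's exact algorithm), the snippet below pseudo-labels the most confident unlabeled examples (the SSL step) and queries an oracle for the most uncertain ones (the AL step). The oracle callback, query budget, confidence threshold tau, and number of rounds are all hypothetical choices.

```python
# A hedged sketch of one way AL and SSL could be combined under a single
# self-training loop; the oracle callback, query budget, confidence
# threshold tau, and number of rounds are hypothetical choices.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def prediction_entropy(probs):
    return -np.sum(probs * np.log(probs + 1e-12), axis=1)

def self_training_al_ssl(X_l, y_l, X_u, oracle, rounds=5, budget=5, tau=0.1):
    clf = RandomForestClassifier(random_state=0).fit(X_l, y_l)
    for _ in range(rounds):
        if len(X_u) == 0:
            break
        probs = clf.predict_proba(X_u)
        h = prediction_entropy(probs)
        # SSL step: self-label the most confident unlabeled examples.
        ssl_idx = np.where(h < tau)[0]
        # AL step: ask the oracle to label the most uncertain examples.
        al_idx = np.setdiff1d(np.argsort(h)[-budget:], ssl_idx)
        pseudo_y = clf.classes_[probs[ssl_idx].argmax(axis=1)]
        queried_y = oracle(X_u[al_idx])
        # Augment the labeled set and shrink the unlabeled pool.
        taken = np.concatenate([ssl_idx, al_idx])
        X_l = np.vstack([X_l, X_u[taken]])
        y_l = np.concatenate([y_l, pseudo_y, queried_y])
        X_u = np.delete(X_u, taken, axis=0)
        clf = RandomForestClassifier(random_state=0).fit(X_l, y_l)
    return clf
```

In practice the oracle would wrap a human annotator; in benchmark experiments it can simply return the held-out true labels of the queried rows.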
