A self-training algorithm based on the two-stage data editing method with mass-based dissimilarity

Jikui Wang, Yiwen Wu, Shaobo Li, Feiping Nie

doi:10.1016/j.neunet.2023.09.046

Abstract

A self-training algorithm is a classical semi-supervised learning algorithm that uses a small number of labeled samples and a large number of unlabeled samples to train a classifier. However, the existing self-training algorithms consider only the geometric distance between data while ignoring the data distribution when calculating the similarity between samples. In addition, misclassified samples can severely affect the performance of a self-training algorithm. To address the above two problems, this paper proposes a self-training algorithm based on data editing with mass-based dissimilarity (STDEMB). First, the mass matrix with the mass-based dissimilarity is obtained, and then the mass-based local density of each sample is determined based on its k nearest neighbors. Inspired by density peak clustering (DPC), this study designs a prototype tree based on the prototype concept. In addition, an efficient two-stage data editing algorithm is developed to edit misclassified samples and efficiently select high-confidence samples during the self-training process. The proposed STDEMB algorithm is verified by experiments using accuracy and F-score as evaluation metrics. The experimental results on 18 benchmark datasets demonstrate the effectiveness of the proposed STDEMB algorithm.
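For orientation, the baseline that the abstract starts from can be sketched in a few lines. This is a minimal illustration of plain self-training, not the STDEMB method: it assumes a 1-nearest-neighbor base classifier, Euclidean distance, and "closest to the labeled set" as a stand-in for high confidence. The names `nearest_label`, `self_train`, and `batch_size` are hypothetical. The sketch deliberately shows both weaknesses the abstract names: similarity is purely geometric, and a wrong pseudo-label is never revisited.

```python
import math


def nearest_label(x, labeled):
    """Return (label, distance) of the labeled point closest to x."""
    best_label, best_dist = None, math.inf
    for point, label in labeled:
        d = math.dist(x, point)  # plain Euclidean distance (an assumption)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist


def self_train(labeled, unlabeled, batch_size=1):
    """Plain self-training with a 1-NN base classifier.

    Each round, the unlabeled points nearest to the current labeled set
    receive pseudo-labels and are moved into the labeled set. Nothing here
    edits or removes a pseudo-label once it has been assigned.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        # Score every remaining point by its distance to the labeled set.
        scored = sorted(
            ((nearest_label(x, labeled), i) for i, x in enumerate(pool)),
            key=lambda item: item[0][1],
        )
        picked = scored[:batch_size]
        for (label, _dist), i in picked:
            labeled.append((pool[i], label))
        picked_idx = {i for _, i in picked}
        pool = [x for i, x in enumerate(pool) if i not in picked_idx]
    return labeled


if __name__ == "__main__":
    seeds = [((0.0, 0.0), "a"), ((10.0, 10.0), "b")]
    points = [(1.0, 1.0), (2.0, 2.0), (9.0, 9.0), (8.0, 8.0)]
    for point, label in self_train(seeds, points):
        print(point, label)
```

According to the abstract, STDEMB changes this baseline in two places: the distance is replaced by a mass-based dissimilarity that reflects the data distribution, and a two-stage data editing step corrects misclassified samples during training.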

Journal: Neural Networks
Publication Date: Sep 29, 2023
Citations: 1

Similar Papers

On the characterization of noise filters for self-training semi-supervised in nearest neighbor classification
Isaac Triguero ... Francisco Herrera
Neurocomputing | VOL. 132
12 Nov 2013

Nearest neighbors-based adaptive density peaks clustering with optimized allocation strategy
Lin Sun ... Jiucheng Xu
Neurocomputing | VOL. 473
10 Dec 2021

Self-training algorithm based on density peaks combining globally adaptive multi-local noise filter
Shuaijun Li ... Jia Lu
Intelligent Data Analysis | VOL. 27
15 Mar 2023

Density peaks clustering by granular computing with label propagation
Yan Li ... Lingyun Sun
01 Dec 2022
