A multi-average based pseudo nearest neighbor classifier

  • Abstract
  • References
  • Similar Papers
Abstract

The conventional k nearest neighbor (KNN) rule is a simple yet effective classification method, but its performance degrades easily when the training set is small and contains outliers. To address this issue, a multi-average based pseudo nearest neighbor (MAPNN) rule is proposed. In the proposed MAPNN rule, k(k − 1)/2 (k > 1) local mean vectors of each class are obtained by averaging pairs of points drawn from the k nearest neighbors in that class, and k pseudo nearest neighbors are then chosen from the k(k − 1)/2 local mean vectors of every class to determine the category of a query point. The selected k pseudo nearest neighbors reduce the negative impact of outliers to some degree. Extensive experiments are carried out on twenty-one real numerical data sets and four artificial data sets, comparing MAPNN with five other KNN-based methods. The experimental results demonstrate that the proposed MAPNN is effective for classification and achieves better results in small sample size cases than the five related KNN-based classifiers.
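To make the rule concrete, here is a minimal sketch of the procedure described above, assuming Euclidean distance and a PNN-style weighted sum of distances to the selected pseudo nearest neighbors as the class score (the abstract does not state the final decision rule); it is an illustration of the idea, not the authors' implementation.

```python
# Minimal MAPNN-style sketch, not the authors' code. Assumptions: Euclidean distance,
# and a PNN-style weighted sum of distances as the per-class score.
import numpy as np
from itertools import combinations

def mapnn_predict(X_train, y_train, x_query, k=3):
    """Classify one query point with a multi-average pseudo nearest neighbor rule."""
    best_class, best_score = None, np.inf
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        # k nearest neighbors of the query inside class c
        d = np.linalg.norm(Xc - x_query, axis=1)
        nn = Xc[np.argsort(d)[:k]]
        # k(k-1)/2 local mean vectors: averages of every pair of neighbors
        local_means = np.array([(a + b) / 2.0 for a, b in combinations(nn, 2)])
        # k pseudo nearest neighbors: the k local means closest to the query
        dm = np.linalg.norm(local_means - x_query, axis=1)
        pseudo = np.sort(dm)[:k]
        # assumed decision score: PNN-style weights 1/1, 1/2, ..., 1/k
        score = np.sum(pseudo / np.arange(1, len(pseudo) + 1))
        if score < best_score:
            best_class, best_score = c, score
    return best_class
```

For example, `mapnn_predict(X_train, y_train, x_query, k=5)` would return the predicted label of `x_query` under these assumptions.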

References (showing 10 of 20 papers)
  • Cited by 77
  • 10.1016/j.knosys.2019.01.016
Locality constrained representation-based K-nearest neighbor classification
  • Jan 14, 2019
  • Knowledge-Based Systems
  • Jianping Gou + 5 more

  • Cited by 67
  • 10.1016/j.eswa.2016.09.031
A new k-harmonic nearest neighbor classifier based on the multi-local means
  • Sep 20, 2016
  • Expert Systems with Applications
  • Zhibin Pan + 2 more

  • Cited by 4684
  • 10.1016/j.swevo.2011.02.002
A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms
  • Mar 1, 2011
  • Swarm and Evolutionary Computation
  • Joaquín Derrac + 3 more

  • Cited by 53
  • 10.1016/0031-3203(80)90012-6
On the dominance of non-parametric Bayes rule discriminant algorithms in high dimensions
  • Jan 1, 1980
  • Pattern Recognition
  • John Van Ness

  • Cited by 777
  • 10.1016/b978-0-08-047865-4.50007-7
Chapter 1 - INTRODUCTION
  • Jan 1, 1990
  • Introduction to Statistical Pattern Recognition
  • Keinosuke Fukunaga

  • Cited by 94
  • 10.1109/tsmcb.2007.908363
The Nearest Neighbor Algorithm of Local Probability Centers
  • Feb 1, 2008
  • IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
  • Boyu Li + 2 more

  • Open Access
  • Cited by 4819
  • 10.1007/s10115-007-0114-2
Top 10 algorithms in data mining
  • Dec 4, 2007
  • Knowledge and Information Systems
  • Xindong Wu + 13 more

  • Cited by 30
  • 10.1016/j.eswa.2008.10.041
Nonparametric classification based on local mean and class statistics
  • Nov 3, 2008
  • Expert Systems with Applications
  • Yong Zeng + 2 more

  • Cited by 179
  • 10.1016/j.patrec.2005.12.016
A local mean-based nonparametric classifier
  • Feb 23, 2006
  • Pattern Recognition Letters
  • Y Mitani + 1 more

  • Cited by 14
  • 10.1016/j.eswa.2022.117159
Attention-based Local Mean K-Nearest Centroid Neighbor Classifier
  • Apr 12, 2022
  • Expert Systems with Applications
  • Ying Ma + 4 more

Similar Papers
  • Research Article
  • Cited by 67
  • 10.1016/j.knosys.2014.07.020
Improved pseudo nearest neighbor classification
  • Jul 31, 2014
  • Knowledge-Based Systems
  • Jianping Gou + 5 more

  • Research Article
  • 10.1504/ijcse.2018.10015540
A pseudo nearest centroid neighbour classifier
  • Jan 1, 2018
  • International Journal of Computational Science and Engineering
  • Xili Wang + 2 more

In this paper, we propose a new reliable classification approach, called the pseudo nearest centroid neighbour (PNCN) rule, which is based on the pseudo nearest neighbour rule (PNN) and nearest centroid neighbourhood (NCN). In the proposed PNCN, the nearest centroid neighbours rather than the nearest neighbours per class are first searched by means of NCN. Then, we calculate k categorical local mean vectors corresponding to the k nearest centroid neighbours and assign a weight to each local mean vector. Using the weighted k local mean vectors for each class, PNCN designs the corresponding pseudo nearest centroid neighbour and decides the class label of the query pattern according to the closest pseudo nearest centroid neighbour among all classes. The classification performance of the proposed PNCN is evaluated on real and artificial datasets in terms of classification accuracy. The experimental results demonstrate the effectiveness and robustness of PNCN over the competing methods in many practical classification problems.
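A rough sketch of the PNCN idea, under two assumptions the abstract does not pin down (a greedy nearest-centroid-neighbour search and PNN-style 1/i weights for the categorical local mean vectors), might look as follows.

```python
# Hedged PNCN-style sketch; the NCN search and the 1/i weighting are assumptions.
import numpy as np

def nearest_centroid_neighbors(Xc, x, k):
    """Greedy NCN search: pick points whose running centroid stays closest to x."""
    chosen, remaining = [], list(range(len(Xc)))
    for _ in range(min(k, len(Xc))):
        best_j, best_d = None, np.inf
        for j in remaining:
            d = np.linalg.norm(np.mean(Xc[chosen + [j]], axis=0) - x)
            if d < best_d:
                best_j, best_d = j, d
        chosen.append(best_j)
        remaining.remove(best_j)
    return Xc[chosen]

def pncn_predict(X_train, y_train, x, k=3):
    best_class, best_dist = None, np.inf
    for c in np.unique(y_train):
        ncn = nearest_centroid_neighbors(X_train[y_train == c], x, k)
        # categorical local mean vectors: means of the first i centroid neighbors
        local_means = np.array([ncn[:i].mean(axis=0) for i in range(1, len(ncn) + 1)])
        dists = np.linalg.norm(local_means - x, axis=1)
        # assumed weights 1/i, as in the pseudo nearest neighbour (PNN) rule
        pseudo_dist = np.sum(dists / np.arange(1, len(dists) + 1))
        if pseudo_dist < best_dist:
            best_class, best_dist = c, pseudo_dist
    return best_class
```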

  • Research Article
  • Cited by 279
  • 10.1016/j.eswa.2018.08.021
A generalized mean distance-based k-nearest neighbor classifier
  • Aug 13, 2018
  • Expert Systems with Applications
  • Jianping Gou + 5 more

  • Research Article
  • Cited by 67
  • 10.1016/j.eswa.2016.09.031
A new k-harmonic nearest neighbor classifier based on the multi-local means
  • Sep 20, 2016
  • Expert Systems with Applications
  • Zhibin Pan + 2 more

  • Research Article
  • Cited by 64
  • 10.1016/j.knosys.2021.107604
A new two-layer nearest neighbor selection method for kNN classifier
  • Oct 19, 2021
  • Knowledge-Based Systems
  • Yikun Wang + 2 more

  • Book Chapter
  • Cited by 5
  • 10.1007/978-981-10-0539-8_12
Pseudo Nearest Centroid Neighbor Classification
  • Jan 1, 2016
  • Hongxing Ma + 2 more

In this paper, we propose a new reliable classification approach, called the pseudo nearest centroid neighbor (PNCN) rule, which is based on the pseudo nearest neighbor rule (PNN) and nearest centroid neighborhood (NCN). In the proposed PNCN, the nearest centroid neighbors rather than the nearest neighbors per class are first searched by means of NCN. Then, we calculate k categorical local mean vectors corresponding to the k nearest centroid neighbors and assign a weight to each local mean vector. Using the weighted k local mean vectors for each class, PNCN designs the corresponding pseudo nearest centroid neighbor and decides the class label of the query pattern according to the closest pseudo nearest centroid neighbor among all classes. The classification performance of the proposed PNCN is evaluated on real data sets in terms of classification accuracy. The experimental results demonstrate the effectiveness of PNCN over the competing methods in many practical classification problems.

  • Conference Article
  • Cited by 1
  • 10.1145/3457784.3457828
Local Mean k-General Nearest Neighbor Classifier
  • Feb 23, 2021
  • Nordiana Mukahar + 1 more

The well-known k-Nearest Neighbor (kNN) classifier is a simple and flexible algorithm that has sparked wide interest in pattern classification. In spite of its straightforward implementation, kNN is sensitive to noisy training samples and to the variance of the distribution. The local mean based k-nearest neighbor rule was developed to overcome the negative effect of noisy training samples. In this article, the local mean rule is implemented with general nearest neighbors, which are selected in a more generalized way, and a new local mean based nearest neighbor classifier, termed Local Mean k-General Nearest Neighbor (LMkGNN), is proposed. The proposed LMkGNN classifier finds the local mean vector of the general nearest neighbors in each class and classifies the test sample based on the distances between the test sample and these local mean vectors. Fifteen real-world datasets from the UCI machine learning repository are used to assess the classification performance of the proposed classifier, and comparisons are made with five benchmark classifiers (kNN, PNN, LMkNN, LMPNN and kGNN) in terms of classification accuracy. Experimental results demonstrate that the proposed LMkGNN classifier performs significantly well and obtains the best classification accuracy compared to the five competing classifiers.
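As a loose illustration only: the sketch below assumes that the "general nearest neighbors" of a query within a class are the union of its k nearest neighbors and the class samples that would count the query among their own k nearest neighbors, and that the class with the closest local mean vector wins; both choices are assumptions, since the abstract does not define them precisely.

```python
# Hypothetical local-mean-over-general-nearest-neighbors rule; the GNN definition
# (kNN union reverse-kNN) and the decision rule are illustrative assumptions.
import numpy as np

def lmkgnn_predict(X_train, y_train, x, k=3):
    best_class, best_dist = None, np.inf
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        d_query = np.linalg.norm(Xc - x, axis=1)
        knn_idx = set(np.argsort(d_query)[:k])
        # reverse kNN: class samples that would place the query among their own
        # k nearest points if the query joined the class (assumed interpretation)
        reverse_idx = set()
        for i in range(len(Xc)):
            d_i = np.linalg.norm(Xc - Xc[i], axis=1)
            d_i[i] = np.inf
            kth = np.sort(d_i)[:k][-1] if len(Xc) > 1 else np.inf
            if d_query[i] <= kth:
                reverse_idx.add(i)
        gnn = Xc[sorted(knn_idx | reverse_idx)]
        dist = np.linalg.norm(gnn.mean(axis=0) - x)   # distance to class local mean
        if dist < best_dist:
            best_class, best_dist = c, dist
    return best_class
```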

  • Research Article
  • Cited by 14
  • 10.1016/j.eswa.2022.117159
Attention-based Local Mean K-Nearest Centroid Neighbor Classifier
  • Apr 12, 2022
  • Expert Systems with Applications
  • Ying Ma + 4 more

  • Research Article
  • Cited by 50
  • 10.1109/tnnls.2019.2920864
A Training Data Set Cleaning Method by Classification Ability Ranking for the k-Nearest Neighbor Classifier
  • Jun 28, 2019
  • IEEE Transactions on Neural Networks and Learning Systems
  • Yidi Wang + 2 more

The k-nearest neighbor (KNN) rule is a successful technique in pattern classification due to its simplicity and effectiveness. As a supervised classifier, KNN classification performance usually suffers from low-quality samples in the training data set. Thus, training data set cleaning (TDC) methods are needed to enhance classification accuracy by cleaning out noisy, or even wrong, samples from the original training data set. In this paper, we propose a classification ability ranking (CAR)-based TDC method to improve the performance of a KNN classifier, namely the CAR-based TDC method. The proposed classification ability function ranks a training sample in terms of its contribution to correctly classifying other training samples as a nearest neighbor under the leave-one-out (LV1) strategy in the cleaning stage. A training sample that tends to misclassify other samples during the KNN classifications under the LV1 strategy is considered to have lower classification ability and is cleaned out from the original training data set. Extensive experiments on ten real-world data sets show that the proposed CAR-based TDC method can significantly reduce the classification error rates of KNN-based classifiers, while reducing computational complexity thanks to the smaller cleaned training data set.
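A bare-bones version of the cleaning idea might look like the following, where each training sample is scored by how often it helps rather than hurts leave-one-out neighbor votes on the other samples; the scoring function and the fixed keep ratio are illustrative assumptions, not the paper's exact CAR function.

```python
# Illustrative leave-one-out cleaning pass inspired by the CAR idea; simplified.
import numpy as np

def car_like_clean(X, y, k=3, keep_ratio=0.9):
    n = len(X)
    score = np.zeros(n)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)
    for i in range(n):
        # leave sample i out and look at the neighbors that would vote on it
        for j in np.argsort(D[i])[:k]:
            # neighbor j gains credit if its label matches i's, loses credit otherwise
            score[j] += 1.0 if y[j] == y[i] else -1.0
    keep = np.argsort(-score)[: int(keep_ratio * n)]   # drop the lowest-ability samples
    return X[keep], y[keep]
```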

  • Research Article
  • Cited by 21
  • 10.1007/s00500-020-05311-x
A new globally adaptive k-nearest neighbor classifier based on local mean optimization
  • Oct 3, 2020
  • Soft Computing
  • Zhibin Pan + 3 more

The k-nearest neighbor (KNN) rule is a simple and effective nonparametric classification algorithm in pattern classification. However, it suffers from several problems, such as sensitivity to outliers and an inaccurate classification decision rule. Thus, a local mean-based k-nearest neighbor classifier (LMKNN) was proposed to address these problems; it assigns the query sample the class label of the closest local mean vector among all classes. It has been shown that the LMKNN classifier achieves better classification performance and is more robust to outliers than the classical KNN classifier. Nonetheless, the unreliable nearest neighbor selection rule and the single local mean vector strategy in the LMKNN classifier have a severely negative effect on its classification performance. Considering these problems, we propose a globally adaptive k-nearest neighbor classifier based on local mean optimization, which uses a globally adaptive nearest neighbor selection strategy and local mean optimization to obtain more convincing and reliable local mean vectors. Experiments conducted on twenty real-world datasets demonstrate that the proposed classifier achieves better classification performance and is less sensitive to the neighborhood size k compared with other improved KNN-based classification methods.
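For reference, the LMKNN baseline described above can be sketched in a few lines, assuming Euclidean distance: the query is assigned the class whose local mean of its k nearest neighbors lies closest.

```python
# Minimal LMKNN-style rule as described in the paragraph above (Euclidean distance assumed).
import numpy as np

def lmknn_predict(X_train, y_train, x, k=3):
    best_class, best_dist = None, np.inf
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        nn = Xc[np.argsort(np.linalg.norm(Xc - x, axis=1))[:k]]
        dist = np.linalg.norm(nn.mean(axis=0) - x)   # distance to the class local mean
        if dist < best_dist:
            best_class, best_dist = c, dist
    return best_class
```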

  • Research Article
  • 10.1088/1742-6596/890/1/012070
Frog sound identification using extended k-nearest neighbor classifier
  • Sep 1, 2017
  • Journal of Physics: Conference Series
  • Nordiana Mukahar + 3 more

Frog sound identification based on vocalization is important for biological research and environmental monitoring. As a result, different types of feature extraction and classifiers have been employed to evaluate the accuracy of frog sound identification. This paper presents frog sound identification with an Extended k-Nearest Neighbor (EKNN) classifier. The EKNN classifier integrates the nearest neighbors and the mutual sharing of neighborhood concepts, with the aim of improving classification performance. It makes a prediction based on which samples are the nearest neighbors of the testing sample and which samples consider the testing sample as their nearest neighbor. To evaluate the classification performance in frog sound identification, the EKNN classifier is compared with competing classifiers, k-Nearest Neighbor (KNN), Fuzzy k-Nearest Neighbor (FKNN), k-General Nearest Neighbor (KGNN) and Mutual k-Nearest Neighbor (MKNN), on the recorded sounds of 15 frog species obtained in Malaysian forests. The recorded sounds were segmented using Short Time Energy and Short Time Average Zero Crossing Rate (STE+STAZCR), sinusoidal modeling (SM), manual segmentation, and the combination of Energy (E) and Zero Crossing Rate (ZCR) (E+ZCR), while the features were extracted by Mel Frequency Cepstrum Coefficients (MFCC). The experimental results show that the EKNN classifier exhibits the best performance in terms of accuracy compared to the competing classifiers KNN, FKNN, KGNN and MKNN in all cases.
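One possible reading of the "extended" neighborhood described above is a majority vote over the union of the query's k nearest neighbors and the training samples that would count the query among their own k nearest neighbors; both the union construction and the vote are assumptions made for illustration, not details taken from the paper.

```python
# Hypothetical EKNN-style vote over forward plus reverse k nearest neighbors.
import numpy as np
from collections import Counter

def eknn_predict(X_train, y_train, x, k=5):
    d_query = np.linalg.norm(X_train - x, axis=1)
    forward = set(np.argsort(d_query)[:k])
    reverse = set()
    for i in range(len(X_train)):
        d_i = np.linalg.norm(X_train - X_train[i], axis=1)
        d_i[i] = np.inf
        kth = np.sort(d_i)[:k][-1]          # sample i's current k-th nearest distance
        if d_query[i] <= kth:                # the query would enter i's k-neighborhood
            reverse.add(i)
    voters = sorted(forward | reverse)
    return Counter(y_train[voters]).most_common(1)[0][0]   # majority vote (assumed rule)
```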

  • Research Article
  • Cited by 86
  • 10.1093/comjnl/bxr131
A Local Mean-Based k-Nearest Centroid Neighbor Classifier
  • Jan 5, 2012
  • The Computer Journal
  • J Gou + 3 more

The k-nearest neighbor (KNN) rule is a simple and effective algorithm in pattern classification. In this article, we propose a local mean-based k-nearest centroid neighbor classifier that assigns to each query pattern the class label of the nearest local centroid mean vector, so as to improve classification performance. The proposed scheme not only takes into account the proximity and spatial distribution of the k neighbors, but also utilizes the local mean vector of the k neighbors from each class in making the classification decision. In the proposed classifier, a local mean vector of k nearest centroid neighbors from each class for a query pattern is well positioned to sufficiently capture the class distribution information. In order to investigate the classification behavior of the proposed classifier, we conduct extensive experiments on real and synthetic data sets in terms of the classification error. Experimental results demonstrate that our proposed method performs significantly well, particularly in small sample size cases, compared with the state-of-the-art KNN-based algorithms.
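The decision rule described above can be sketched as follows, assuming Euclidean distances and the usual greedy nearest-centroid-neighbour search (the same construction as in the PNCN sketch earlier on this page).

```python
# Sketch of a local mean-based k-nearest centroid neighbor rule (LMKNCN-style).
import numpy as np

def ncn_search(Xc, x, k):
    """Greedy NCN: repeatedly add the point whose running centroid is closest to x."""
    chosen, remaining = [], list(range(len(Xc)))
    for _ in range(min(k, len(Xc))):
        j = min(remaining,
                key=lambda r: np.linalg.norm(Xc[chosen + [r]].mean(axis=0) - x))
        chosen.append(j)
        remaining.remove(j)
    return Xc[chosen]

def lmkncn_predict(X_train, y_train, x, k=3):
    classes = np.unique(y_train)
    # distance from the query to each class's local mean of its k nearest centroid neighbors
    dists = [np.linalg.norm(ncn_search(X_train[y_train == c], x, k).mean(axis=0) - x)
             for c in classes]
    return classes[int(np.argmin(dists))]
```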

  • Research Article
  • 10.3390/math10152743
An Ensemble and Iterative Recovery Strategy Based kGNN Method to Edit Data with Label Noise
  • Aug 3, 2022
  • Mathematics
  • Baiyun Chen + 3 more

Learning with label noise is gaining increasing attention from a variety of disciplines, particularly in supervised machine learning for classification tasks. The k nearest neighbors (kNN) classifier is often used as a natural way to edit training sets because of its sensitivity to label noise. However, a kNN-based editor may remove too many instances if it is not designed to take care of the label noise. In addition, the one-sided nearest neighbor (NN) rule is unconvincing, as it only considers the nearest neighbors from the perspective of the query sample. In this paper, we propose an ensemble and iterative recovery strategy-based kGNN method (EIRS-kGNN) to edit data with label noise. EIRS-kGNN first uses general nearest neighbors (GNN) to expand the one-sided NN rule to a binary-sided NN rule, taking the neighborhood of the queried samples into account. Then, it ensembles the prediction results of a finite set of k values in the kGNN to prudently judge the noise level of each sample. Finally, two loops, an inner loop and an outer loop, are leveraged to iteratively detect label noise. A frequency indicator derived from the iterative process guides the mixture of approaches, including relabeling and removing, used to deal with the detected label noise. The goal of EIRS-kGNN is to recover the distribution of the data set as if it had not been corrupted. Experimental results on both synthetic data sets and UCI benchmarks, including binary and multi-class data sets, demonstrate the effectiveness of the proposed EIRS-kGNN method.

  • Research Article
  • Cited by 38
  • 10.1109/access.2020.2977421
Fault Detection in the Tennessee Eastman Benchmark Process Using Principal Component Difference Based on K-Nearest Neighbors
  • Jan 1, 2020
  • IEEE Access
  • Cheng Zhang + 2 more

Industrial data usually have nonlinear or multimodal characteristics that do not meet the statistical assumptions of principal component analysis (PCA). Therefore, PCA has a lower fault detection rate in industrial processes. Aiming at these limitations of PCA, a fault detection method using principal component difference based on k-nearest neighbors (Diff-PCA) is proposed in this paper. First, find the k nearest neighbors of each sample in the training data set and calculate its mean vector. Second, build an augmented vector from each sample and its corresponding mean vector. Third, calculate the loading matrix and score matrix using PCA. Next, calculate the estimated scores using the mean vector of each sample and a missing-data imputation technique for PCA. Finally, build two new statistics from the difference between the real scores and the estimated scores to detect faults. A fault diagnosis method based on contribution plots of the monitored variables is also proposed. In Diff-PCA, the differencing step can eliminate the impact of nonlinear and multimodal structure on fault detection. Meanwhile, the subspaces monitored by the two new statistics differ from those monitored by T2 and SPE in PCA. The efficiency of the proposed strategy is demonstrated on two numerical cases (nonlinear and multimodal) and the Tennessee Eastman (TE) process. The fault detection results indicate that Diff-PCA outperforms conventional PCA, kernel PCA, dynamic PCA, the principal component-based k nearest neighbor rule and the k nearest neighbor rule.
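A rough sketch of the Diff-PCA construction outlined above is given below; the "missing-data imputation" step is taken to be the standard regression of scores on the known (mean-vector) block of the loadings, and a single squared-difference statistic stands in for the paper's two statistics, so treat the details as assumptions.

```python
# Assumption-laden Diff-PCA sketch: kNN-mean augmentation, PCA loadings via SVD,
# scores estimated from the mean-vector block only, and a squared-difference statistic.
import numpy as np

def diff_pca_fit(X, k=5, n_components=3):
    n = len(X)
    # step 1: mean vector of each training sample's k nearest neighbors (excluding itself)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)
    means = np.array([X[np.argsort(D[i])[:k]].mean(axis=0) for i in range(n)])
    # step 2: augmented vectors [x_i ; knn_mean_i]
    Z = np.hstack([X, means])
    mu, sigma = Z.mean(axis=0), Z.std(axis=0) + 1e-12
    # step 3: PCA loadings from the standardized augmented data
    _, _, Vt = np.linalg.svd((Z - mu) / sigma, full_matrices=False)
    P = Vt[:n_components].T
    return dict(X=X, k=k, mu=mu, sigma=sigma, P=P, d=X.shape[1])

def diff_pca_statistic(model, x):
    X, k, d = model["X"], model["k"], model["d"]
    nn_mean = X[np.argsort(np.linalg.norm(X - x, axis=1))[:k]].mean(axis=0)
    z = (np.hstack([x, nn_mean]) - model["mu"]) / model["sigma"]
    P = model["P"]
    t = P.T @ z                                # real score
    t_hat = np.linalg.pinv(P[d:]) @ z[d:]      # score estimated from the mean-vector block
    return float(np.sum((t - t_hat) ** 2))     # difference-based monitoring statistic
```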

  • Conference Article
  • Cited by 2
  • 10.1109/spac.2017.8304246
A new nearest neighbor classifier based on multi-harmonic mean distances
  • Dec 1, 2017
  • Hongxing Ma + 5 more

The k-nearest neighbor (KNN) rule is a simple and effective classifier in pattern recognition. In this paper, we propose a new nearest neighbor classifier based on multi-harmonic mean distances, in order to overcome the sensitivity to the neighborhood size k and improve classification performance. The proposed method is called the harmonic mean distance-based k-nearest neighbor classifier (HMDKNN). It designs multi-harmonic mean distances based on the multi-local mean vectors calculated from the k nearest neighbors of the given query sample in each class. Using the multi-harmonic mean distances per class, a new nested harmonic mean distance in each class is designed as the classification decision measure, and the query sample is classified into the class with the closest nested harmonic mean distance among all classes. The experimental results on UCI data sets show that the proposed HMDKNN performs better and is less sensitive to k, compared to state-of-the-art KNN-based methods.
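One way to read the construction is: build categorical local mean vectors from the first 1, 2, ..., k nearest neighbors in each class and score the class by the harmonic mean of the query's distances to them; the paper's exact nested harmonic mean distance is not reproduced here, so the snippet below is an assumption-laden sketch.

```python
# Harmonic-mean-of-local-mean-distances sketch loosely following the HMDKNN description.
import numpy as np

def hmdknn_like_predict(X_train, y_train, x, k=5):
    best_class, best_score = None, np.inf
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        nn = Xc[np.argsort(np.linalg.norm(Xc - x, axis=1))[:k]]
        # multi-local mean vectors: mean of the first i nearest neighbors, i = 1..k
        local_means = np.array([nn[:i].mean(axis=0) for i in range(1, len(nn) + 1)])
        d = np.linalg.norm(local_means - x, axis=1)
        # assumed class score: harmonic mean of the distances to the local mean vectors
        score = len(d) / np.sum(1.0 / (d + 1e-12))
        if score < best_score:
            best_class, best_score = c, score
    return best_class
```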

More from: AI Communications
  • Research Article
  • 10.3233/aic-230270
Open-world object detection: A solution based on reselection mechanism and feature disentanglement
  • Sep 18, 2024
  • AI Communications
  • Tian Lin + 3 more

  • Research Article
  • Cited by 1
  • 10.3233/aic-230434
A diversity-aware recommendation system for tutoring
  • Sep 18, 2024
  • AI Communications
  • Laura Achón + 3 more

  • Open Access
  • Research Article
  • Cited by 2
  • 10.3233/aic-230325
The CADE-29 Automated Theorem Proving System Competition – CASC-29
  • Sep 18, 2024
  • AI Communications
  • Geoff Sutcliffe + 1 more

  • Research Article
  • 10.3233/aic-230312
A multi-average based pseudo nearest neighbor classifier
  • Sep 18, 2024
  • AI Communications
  • Dapeng Li + 1 more

  • Research Article
  • 10.3233/aic-230053
Spatio-temporal deep learning framework for pedestrian intention prediction in urban traffic scenes
  • Sep 18, 2024
  • AI Communications
  • Monika + 2 more

  • Research Article
  • Cited by 4
  • 10.3233/aic-220247
Multimodal biometric authentication: A review
  • Sep 18, 2024
  • AI Communications
  • Swimpy Pahuja + 1 more

  • Research Article
  • Cited by 2
  • 10.3233/aic-230340
Residual SwinV2 transformer coordinate attention network for image super resolution
  • Sep 18, 2024
  • AI Communications
  • Yushi Lei + 4 more

  • Research Article
  • 10.3233/aic-230227
Multi-feature fusion dehazing based on CycleGAN
  • Sep 18, 2024
  • AI Communications
  • Jingpin Wang + 3 more

  • Research Article
  • 10.3233/aic-230154
Considerations on sentiment of social network posts as a feature of destructive impacts
  • Sep 18, 2024
  • AI Communications
  • Diana Levshun + 4 more

  • Research Article
  • Cited by 1
  • 10.3233/aic-230217
Second-order Spatial Measures Low Overlap Rate Point Cloud Registration Algorithm Based On FPFH Features
  • Sep 18, 2024
  • AI Communications
  • Zewei Lian + 4 more
