NearCount: Selecting critical instances based on the cited counts of nearest neighbors

Zonghai Zhu, Zhe Wang, Dongdong Li, Wenli Du

doi:10.1016/j.knosys.2019.105196

Abstract

Traditional instance selection algorithms handle imbalanced problems poorly. Moreover, most of them are sensitive to noisy instances and rely on complex selection rules. To address these problems, we propose a concise learning framework named NearCount, which determines the importance of each instance without editing noise. In NearCount, the importance of an instance is derived from its cited count: the number of times the instance is selected as a nearest neighbor of instances in different classes. For instances with nonzero cited counts, importance is inversely proportional to the cited count. To handle classification problems with different data distributions, two NearCount-based algorithms, NearCount-IM and NearCount-IS, are introduced. For imbalanced problems, NearCount-IM selects as many important majority instances as there are minority instances, thus balancing the data distribution. For balanced scenarios, NearCount-IS selects, in every class, the instances whose cited counts are greater than zero and less than or equal to the number of nearest neighbors as critical instances. The proposed NearCount-IM and NearCount-IS algorithms are evaluated against classical instance selection algorithms on benchmark data sets. The experiments validate the effectiveness of the proposed algorithms.
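The counting rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`cited_counts`, `select_is`, `select_im`), the use of Euclidean distance, tie-breaking by index, and the behaviour when too few majority instances are cited are all assumptions made here for illustration, since the abstract does not specify them.

```python
import math


def cited_counts(X, y, k=1):
    """Count how often each instance is one of the k nearest
    different-class neighbors of some other instance."""
    counts = [0] * len(X)
    for i, xi in enumerate(X):
        # Candidates are instances whose label differs from that of instance i.
        others = [j for j in range(len(X)) if y[j] != y[i]]
        # Euclidean distance is an assumption; the sort is stable, so ties
        # are broken by the lower index.
        others.sort(key=lambda j: math.dist(xi, X[j]))
        for j in others[:k]:
            counts[j] += 1
    return counts


def select_is(X, y, k=1):
    """NearCount-IS-style rule: keep instances with 0 < cited count <= k."""
    counts = cited_counts(X, y, k)
    return [i for i, c in enumerate(counts) if 0 < c <= k]


def select_im(X, y, majority_label, k=1):
    """NearCount-IM-style rule: keep every minority instance plus an equal
    number of majority instances, preferring lower nonzero cited counts
    (importance taken as inversely proportional to the count)."""
    counts = cited_counts(X, y, k)
    minority = [i for i in range(len(X)) if y[i] != majority_label]
    cited_major = [i for i in range(len(X))
                   if y[i] == majority_label and counts[i] > 0]
    cited_major.sort(key=lambda i: counts[i])
    # Assumption: if fewer majority instances are cited than there are
    # minority instances, only the cited ones are kept.
    kept_major = cited_major[:len(minority)]
    return sorted(kept_major + minority)


# Toy data: three majority points (label 0) and two minority points (label 1).
X = [[0.0], [1.0], [2.0], [10.0], [11.0]]
y = [0, 0, 0, 1, 1]
print(cited_counts(X, y, k=2))   # [0, 2, 2, 3, 3]
print(select_is(X, y, k=2))      # [1, 2]
print(select_im(X, y, 0, k=2))   # [1, 2, 3, 4]
```

In this toy example the majority point farthest from the class boundary is never cited and is dropped, while the two minority points are cited three times each (more than k = 2) and therefore fall outside the NearCount-IS range.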

Journal: Knowledge-Based Systems
Publication Date: Nov 7, 2019
Citations: 12

Similar Papers

A weighted hybrid ensemble method for classifying imbalanced data
Jiakun Zhao ... Qingfang Liu
Knowledge-Based Systems | VOL. 203
08 Jun 2020

A novel version of k nearest neighbor: Dependent nearest neighbor
Ömer Faruk Ertuğrul ... Mehmet Emin Tağluk
Applied Soft Computing | VOL. 55
24 Feb 2017

Diversity exploration and negative correlation learning on imbalanced data sets
Shuo Wang ... Ke Tang
01 Jun 2009

Application of Genetic Algorithm and K-Nearest Neighbour Method in Real World Medical Fraud Detection Problem
Hongxing He ... Xin Yao
Journal of Advanced Computational Intelligence and Intelligent Informatics | VOL. 4
20 Mar 2000
