Dealing with Inliers in Feature Vector Data

Dheeraj Kumar, Kotagiri Ramamohanarao, Zahra Ghafoori, Christopher Leckie, James C Bezdek, Marimuthu Palaniswami

doi:10.1142/s021848851840010x

Abstract

Inliers (bridge points) between clusters degrade the ability of many algorithms to find clusters in numerical data. We present three new approaches to the detection and removal of inliers. Two approaches are based on Local Outlier Factor (LOF) scores. We also discuss using LOF scores for an isolation Nearest Neighbour Ensemble (iNNE) approach to inlier detection. The third approach uses MaxiMin (MM) sampling to remove both inliers and outliers. We compare the three approaches on one synthetic and two real-life datasets. The failure of single linkage clustering in the presence of bridging points is used as a means of evaluating the relative effectiveness of the three methods. We also show how inliers can degrade the quality of images built by the improved Visual Assessment of Tendency (iVAT) algorithm, which provides a visual representation of potential single linkage clusters in the data.

Full Text