Outlier detection method based on high-density iteration

Yu Zhou, Hao Xia, Dahui Yu, Jiaoyang Cheng, Jichun Li

doi:10.1016/j.ins.2024.120286

Abstract

Conventional outlier detection methods identify global outliers easily, but their efficacy diminishes on local outliers within clusters of varying densities. Conversely, the local outlier factor excels at detecting local anomalies, but its performance falters as the number of outliers increases. To address these limitations and detect both global and local outliers in intricate datasets, this paper introduces a novel outlier detection approach based on high-density iteration (HDIOD). The method first combines the Gaussian kernel function with k-nearest neighbors to compute the local kernel density of each sample. It then compares the local kernel density of a given sample with that of its k-nearest neighbors. If the sample's local kernel density is lower than the maximum density among its neighbors, the neighbor with the highest local kernel density becomes the new object for comparison. This process iterates, and the k-nearest-neighbor sets of all objects visited along the way together constitute the extended k-nearest neighbors of the original sample. Finally, the ratio of the maximum local kernel density within the extended k-nearest neighbors to the local kernel density of the sample serves as the sample's outlier degree. Experiments on 12 synthetic datasets and 19 real-world datasets demonstrate the effectiveness of HDIOD. Comparisons with 13 commonly used outlier detection methods show that HDIOD achieves high detection accuracy and is robust to parameter variations.
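The procedure summarized in the abstract can be sketched in a few lines of Python. This is only an illustrative reading of the abstract, not the authors' implementation. Several details are assumptions: the local kernel density is taken as the mean Gaussian kernel value over the k nearest-neighbor distances, the bandwidth is a free parameter, and the iteration is assumed to stop once no neighbor has a strictly higher density. The function names `knn_indices_and_distances`, `local_kernel_density`, and `hdiod_scores` are hypothetical.

```python
import numpy as np

def knn_indices_and_distances(X, k):
    """Brute-force k-nearest neighbours (Euclidean); suitable for small data only."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)            # a point is not its own neighbour
    idx = np.argsort(dist, axis=1)[:, :k]     # indices of the k closest points
    d = np.take_along_axis(dist, idx, axis=1) # matching distances
    return idx, d

def local_kernel_density(knn_dist, bandwidth):
    """ASSUMED form: mean Gaussian kernel value over the k neighbour distances."""
    rho = np.exp(-(knn_dist ** 2) / (2.0 * bandwidth ** 2)).mean(axis=1)
    # Guard against underflow to zero so the final ratio stays finite.
    return np.maximum(rho, np.finfo(float).tiny)

def hdiod_scores(X, k=5, bandwidth=1.0):
    """Outlier degree: max density in the extended k-neighbourhood / own density."""
    idx, d = knn_indices_and_distances(X, k)
    rho = local_kernel_density(d, bandwidth)
    scores = np.empty(len(X))
    for i in range(len(X)):
        current = i
        extended = set()
        while True:
            neigh = idx[current]
            extended.update(neigh.tolist())    # grow the extended neighbourhood
            best = neigh[np.argmax(rho[neigh])]
            # ASSUMED stopping rule: stop when no neighbour is strictly denser.
            if rho[best] <= rho[current]:
                break
            current = best                     # move to the densest neighbour
        ext = np.fromiter(extended, dtype=int)
        scores[i] = rho[ext].max() / rho[i]
    return scores
```

Because each step moves to a strictly denser object, the walk cannot revisit a point and always terminates. Under these assumptions, a sample far from any cluster has a very low density relative to the dense region its walk reaches, so its score is large, while a sample at a density peak scores about 1.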

Full Text