Abstract

Most unsupervised outlier detection methods that take a global perspective cannot effectively detect local outliers when normal samples form clusters with different densities. Moreover, existing local outlier detectors are not sensitive to clustered outliers. This paper proposes an unsupervised outlier detection method with differential potential spread loss (DPSL) to simultaneously detect global, local, and clustered outliers. DPSL first subsamples the original dataset several times. For each subset, the samples are sequentially linked in reverse, according to their nearest-neighbor relationships, to construct the first-level potential chains; the starting points of these chains are called first-level potential peak points. Next, the lower-level potential peak points are extracted to build higher-level potential chains, and this process is repeated until the global potential peak point is obtained. The potential spread loss of a sample, which measures its local anomaly degree, is defined from the distance and the ratio of isolation radii between the sample and its potential peak point on the same potential chain. Potential circles are established according to the levels of the potential peak points and determine the global anomaly degree of the samples. Finally, DPSL calculates the outlier scores of all samples by combining the potential spread losses and potential circle levels of the subset samples, and integrates the results from every subsampling round. DPSL outperforms 11 typical unsupervised outlier detection methods on 12 artificial datasets with four types of clustered outlier distributions. It also performs well on 12 other artificial datasets with various local outliers and on 35 real-world datasets.
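
The subsample-and-integrate pattern described above can be illustrated with a minimal sketch. This is not the DPSL algorithm: the abstract does not specify the potential chains, potential spread loss, or potential circles in enough detail to reproduce them, so the sketch substitutes a simple stand-in score (the distance from each sample to its nearest neighbor in the subset). The function name `subsample_ensemble_scores` and the parameters `n_rounds`, `subset_size`, and `seed` are hypothetical names chosen for illustration.

```python
import numpy as np


def subsample_ensemble_scores(X, n_rounds=10, subset_size=16, seed=0):
    """Average a stand-in anomaly score over several random subsets.

    Assumption: the per-subset score is the distance from each sample to its
    nearest neighbor in the subset. This is a placeholder, NOT the potential
    spread loss or potential circle level used by DPSL.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    m = min(subset_size, n)
    if m < 2:
        raise ValueError("need at least two samples per subset")

    rng = np.random.default_rng(seed)
    scores = np.zeros(n)
    for _ in range(n_rounds):
        # Draw one subset without replacement.
        idx = rng.choice(n, size=m, replace=False)
        subset = X[idx]
        # Distances from every sample to every subset member, shape (n, m).
        d = np.linalg.norm(X[:, None, :] - subset[None, :, :], axis=2)
        # A sample that belongs to the subset must not match itself.
        d[idx, np.arange(m)] = np.inf
        scores += d.min(axis=1)
    # Integrate the per-subset results by averaging over rounds.
    return scores / n_rounds


# Usage: 50 points around the origin plus one distant point.
data_rng = np.random.default_rng(1)
X_demo = np.vstack([data_rng.normal(size=(50, 2)), [[20.0, 20.0]]])
demo_scores = subsample_ensemble_scores(X_demo)
```

Higher scores indicate samples that are far from every subset; averaging over rounds reduces the dependence on any single random draw, which is the role subsampling plays in the method described above.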
