Abstract

Recent research studies on outlier detection have focused on examining the nearest neighbor structure of a data object to measure its outlierness degree. This leads to two weaknesses: the size of nearest neighborhood, which should be predetermined, greatly affects the final detection results, and the outlierness scores produced by existing methods are not sufficiently diverse to allow precise ranking of outliers. To overcome these problems, in this research paper, a novel outlier detection method involving an iterative random sampling procedure is proposed. The proposed method is inspired by the simple notion that outlying objects are less easily selected than inlying objects in blind random sampling, and therefore, more inlierness scores are given to selected objects. We develop a new measure called the observability factor (OF) by utilizing this idea. In order to offer a heuristic guideline to determine the best size of nearest neighborhood, we additionally propose using the entropy of OF scores. An intensive numerical evaluation based on various synthetic and real-world datasets shows the superiority and effectiveness of the proposed method.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.