Abstract

Detecting anomalies in data is a significant challenge in many applications. The Isolation Forest (IF) has gained attention for its high accuracy, simplicity, and computational efficiency. However, prior research has concentrated primarily on the construction of isolation trees (iTrees), overlooking the considerable influence of the anomaly scoring method on detection performance. This study introduces an anomaly scoring method that integrates fuzzy concepts to enhance detection performance. Fuzzy concepts are well suited to handling ambiguity and uncertainty, and they are more readily applied to anomaly scoring than to iTree training. Unlike conventional scoring, which assigns a sample to a single child node at each step of the decision path, the proposed fuzzy anomaly scoring method assigns a sample to all child nodes, with membership degrees that depend on the target sample. The method therefore aggregates the path lengths of all external nodes in a weighted manner, which reduces the impact of irrelevant splits on the anomaly score. Because even IF algorithms that use informative splits select those splits from the data distribution rather than from label information, introducing fuzziness into the splits themselves can effectively mitigate the performance degradation caused by irrelevant splits. Extensive experiments on 25 benchmark datasets demonstrated that the proposed anomaly scoring method significantly improved both the performance and the stability of anomaly detection with the base IF algorithm, outperforming other IF algorithms and fuzzy rough set-based anomaly detection methods.
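
The weighted aggregation described above can be illustrated with a minimal sketch. The abstract does not specify the membership function, so the sigmoid membership, the `softness` parameter, and every function name below are assumptions made for illustration and are not the authors' implementation. The trees are ordinary random iTrees; only the scoring step is fuzzy, with each sample passed to both children and the path lengths of all external nodes combined by membership weight.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful BST search over n points (standard IF normaliser)."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_itree(X, rng, depth=0, max_depth=8):
    """Standard random iTree: random feature, random split value (training is not fuzzy)."""
    n = len(X)
    if n <= 1 or depth >= max_depth:
        return {"size": n}
    q = rng.randrange(len(X[0]))
    lo = min(x[q] for x in X)
    hi = max(x[q] for x in X)
    if lo == hi:
        return {"size": n}
    p = rng.uniform(lo, hi)
    return {
        "feature": q, "split": p, "scale": hi - lo,
        "left": build_itree([x for x in X if x[q] < p], rng, depth + 1, max_depth),
        "right": build_itree([x for x in X if x[q] >= p], rng, depth + 1, max_depth),
    }


def fuzzy_path_length(x, node, depth=0, softness=0.1):
    """Membership-weighted path length over ALL external nodes of one tree."""
    if "feature" not in node:
        return depth + c_factor(node["size"])
    # ASSUMPTION: sigmoid membership centred on the split, width scaled by the node's range.
    z = (x[node["feature"]] - node["split"]) / (softness * node["scale"])
    mu_right = 0.5 * (1.0 + math.tanh(0.5 * z))  # overflow-safe form of 1 / (1 + exp(-z))
    mu_left = 1.0 - mu_right
    return (mu_left * fuzzy_path_length(x, node["left"], depth + 1, softness)
            + mu_right * fuzzy_path_length(x, node["right"], depth + 1, softness))


def fuzzy_score(x, trees, sample_size, softness=0.1):
    """Usual IF score 2^(-E[h(x)] / c(psi)), with h(x) replaced by the fuzzy path length."""
    mean_len = sum(fuzzy_path_length(x, t, 0, softness) for t in trees) / len(trees)
    return 2.0 ** (-mean_len / c_factor(sample_size))


# Toy usage on synthetic 2-D Gaussian data (hypothetical, not one of the paper's benchmarks).
rng = random.Random(0)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(512)]
sample_size = 64
max_depth = math.ceil(math.log2(sample_size))
trees = [build_itree(rng.sample(data, sample_size), rng, 0, max_depth) for _ in range(50)]
inlier_score = fuzzy_score((0.0, 0.0), trees, sample_size)
outlier_score = fuzzy_score((8.0, 8.0), trees, sample_size)
```

In this sketch the two membership degrees at each internal node sum to one, so the weights over all external nodes of a tree also sum to one and the result is a weighted average of leaf path lengths. As `softness` shrinks toward zero the memberships approach 0 and 1, and the computation approaches the conventional single-path score.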
