Abstract

We propose Finite Boundary (FB) versions of the Isolation Forest (IF), Split Selection Criterion iForest (SciForest), and Extended Isolation Forest (EIF) algorithms, which use a hypersphere as the branching boundary to improve the consistency of anomaly scores. EIF replaces the axis-parallel hyperplanes of IF with slanted hyperplanes, as in SciForest, to remedy inconsistent anomaly scores. EIF is also faster than SciForest because it omits the search for the optimum branching hyperplane. We identify inconsistent anomaly scores from EIF on a synthetic 2-D spiral dataset and from SciForest on a synthetic 2-D Gaussian dataset with a single blob, showing empirically that slanted hyperplanes alone are insufficient. First, we explain the abnormal decrease in anomaly score for anomalous data points: the infinite extension of the hyperplanes unexpectedly increases the number of branchings for those points. Second, we propose the hypersphere as a suitable generalized branching decision boundary. Third, by comparing anomaly scores obtained with finite-boundary hyperspheres against those obtained with slanted hyperplanes, we show empirically that the inconsistency does not stem from the axis parallelism of IF's hyperplanes; the redundant extension of the infinite hyperplanes is its dominant cause. Finally, we apply the FB versions of IF (FBIF), EIF (FBEIF), and SciForest (FBSciForest) to several standard 2-D synthetic datasets to assess robustness and computation speed in comparison with EIF, SciForest, and IF.
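As an illustration of hypersphere branching, the following Python sketch builds isolation trees whose nodes split the data into points inside and outside a random hypersphere. It is not the authors' implementation: the abstract does not say how the sphere is sampled, so the sketch assumes the center is drawn uniformly from the bounding box of the node's data and the radius uniformly between the smallest and largest point-to-center distance. All function names (`sphere_split`, `build_tree`, `path_length`, `mean_path_length`) are hypothetical, and the usual path-length normalization of isolation forests is omitted for brevity.

```python
import numpy as np

def sphere_split(X, rng):
    """Sample a random hypersphere (assumed sampling scheme, see lead-in)."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    center = rng.uniform(lo, hi)                  # center inside the data's bounding box
    dist = np.linalg.norm(X - center, axis=1)     # distance of every point to the center
    radius = rng.uniform(dist.min(), dist.max())  # radius between nearest and farthest point
    return center, radius, dist <= radius

def build_tree(X, rng, depth=0, max_depth=10):
    """Recursively partition X with finite (spherical) boundaries."""
    if len(X) <= 1 or depth >= max_depth:
        return {"size": len(X)}
    center, radius, inside = sphere_split(X, rng)
    if inside.all() or not inside.any():          # degenerate split, e.g. duplicate points
        return {"size": len(X)}
    return {
        "center": center,
        "radius": radius,
        "in": build_tree(X[inside], rng, depth + 1, max_depth),
        "out": build_tree(X[~inside], rng, depth + 1, max_depth),
    }

def path_length(x, node, depth=0):
    """Number of branchings needed to route x to a leaf."""
    if "size" in node:
        return depth
    inside = np.linalg.norm(x - node["center"]) <= node["radius"]
    return path_length(x, node["in" if inside else "out"], depth + 1)

def mean_path_length(x, forest):
    """Average path length of x over all trees in the forest."""
    return float(np.mean([path_length(x, tree) for tree in forest]))

# Usage: one Gaussian blob, one inlier at the origin, one distant outlier.
rng = np.random.default_rng(0)
inlier = np.array([0.0, 0.0])
outlier = np.array([8.0, 8.0])
X = np.vstack([rng.normal(size=(200, 2)), inlier, outlier])

forest = [build_tree(X, rng) for _ in range(50)]
inlier_depth = mean_path_length(inlier, forest)
outlier_depth = mean_path_length(outlier, forest)
print("mean path length, inlier :", inlier_depth)
print("mean path length, outlier:", outlier_depth)
```

As in other isolation-forest variants, a shorter average path corresponds to a higher anomaly score, so the distant point is expected to be isolated in fewer branchings than the point at the center of the blob. Because each boundary is a closed sphere, a split affects only the region near the sampled center and does not extend indefinitely through the rest of the space.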
