Abstract

We show that Isolation Forest (IF) with a Finite Boundary (FB), referred to as Finite Boundary Isolation Forest (FBIF), outperforms both IF and Extended Isolation Forest (EIF) when the dataset consists of continuous variables that fit standard distribution properties. In experiments on a public dataset, EIF consistently performs better than IF, and FBIF in turn outperforms EIF. EIF was proposed as a faster alternative to the SciForest algorithm: it branches on randomly generated slanted hyperplanes and so avoids the additional processing that SciForest incurs in searching for an optimum slanted hyperplane at each split. However, limitations of slanted hyperplanes as branching boundaries in SciForest and EIF have been demonstrated empirically before, in the form of inconsistent anomaly scores on three synthetic 2-D datasets: a spiral distribution, a single Gaussian blob, and two diagonally placed Gaussian blobs. FBIF uses a hypersphere as the branching boundary to eliminate this inconsistency. We show that input features with standard distribution properties are sufficient for FBIF to deliver improved anomaly detection, which implies that testing for standard distribution properties is a simple and effective dimensionality reduction technique for anomaly detection in conjunction with FBIF. We present the combination of a hypersphere as a generalized branching decision boundary and a test for standard distribution of the input features as an effective approach, and support it by comparing the results of FBIF, EIF and IF on a public dataset. The results show that FBIF, given only a small number of features that display a standard distribution profile, can outperform IF and EIF even when IF and EIF are given a greater number of input features.
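To make the idea of a hypersphere as a branching boundary concrete, the following is a minimal Python sketch of an isolation tree that splits on a random hypersphere instead of an axis-parallel or slanted hyperplane. The abstract does not specify how FBIF chooses its hyperspheres, so the split rule used here (a centre drawn uniformly from the bounding box of the node's points, and a radius drawn uniformly between the smallest and largest distance to that centre) is an assumption made for illustration, as are all function names. The scoring follows the usual Isolation Forest convention of normalising the mean path length by the average path length of an unsuccessful binary search tree lookup.

```python
import numpy as np


def c_factor(n):
    """Average path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (np.log(n - 1) + np.euler_gamma) - 2.0 * (n - 1) / n


def build_tree(X, rng, depth, max_depth):
    """Recursively partition X with random hyperspheres (assumed split rule)."""
    n = len(X)
    if n <= 1 or depth >= max_depth:
        return {"size": n}
    # Assumption: centre is uniform in the bounding box of the node's points.
    center = rng.uniform(X.min(axis=0), X.max(axis=0))
    dist = np.linalg.norm(X - center, axis=1)
    lo, hi = dist.min(), dist.max()
    if hi == lo:  # all points equidistant from the centre, cannot split
        return {"size": n}
    # Assumption: radius is uniform between the nearest and farthest point.
    radius = rng.uniform(lo, hi)
    inside = dist < radius
    return {
        "center": center,
        "radius": radius,
        "in": build_tree(X[inside], rng, depth + 1, max_depth),
        "out": build_tree(X[~inside], rng, depth + 1, max_depth),
    }


def path_length(x, node, depth=0):
    """Depth at which x lands in a leaf, plus the usual leaf-size adjustment."""
    if "center" not in node:
        return depth + c_factor(node["size"])
    branch = "in" if np.linalg.norm(x - node["center"]) < node["radius"] else "out"
    return path_length(x, node[branch], depth + 1)


def fit_forest(X, n_trees=50, sample_size=256, seed=0):
    """Build n_trees hypersphere trees, each on a random subsample of X."""
    rng = np.random.default_rng(seed)
    m = min(sample_size, len(X))
    max_depth = int(np.ceil(np.log2(m)))
    trees = []
    for _ in range(n_trees):
        idx = rng.choice(len(X), size=m, replace=False)
        trees.append(build_tree(X[idx], rng, 0, max_depth))
    return {"trees": trees, "sample_size": m}


def anomaly_scores(X, forest):
    """Isolation-Forest-style score in (0, 1]; higher means more anomalous."""
    mean_depth = np.array(
        [np.mean([path_length(x, t) for t in forest["trees"]]) for x in X]
    )
    return 2.0 ** (-mean_depth / c_factor(forest["sample_size"]))
```

Fitting this sketch on a Gaussian blob with one distant point appended and calling `anomaly_scores` on the same array should give the distant point a noticeably higher score than the blob average, because a single sphere is often enough to separate it from the rest.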
