Abstract

Outlier detection in high-dimensional data suffers from the curse of dimensionality, where irrelevant features may prevent the detection of outliers. In this research, we propose a novel, efficient, unsupervised, density-based subspace selection method for outlier detection in the projected subspace. First, the Maximum-Relevance-to-Density (MRD) algorithm is proposed to select the relevant subspace based on mutual information. Then, applying the concept of redundancy among features, we present an efficient relevant subspace selection method called minimum-Redundancy-Maximum-Relevance-to-Density (mRMRD). Finally, the degree of outlierness of data points in the corresponding relevant subspace is computed using the Local Outlier Factor (LOF). Experimental results on both real and synthetic data demonstrate that the proposed algorithms, based on the MRD and mRMRD criteria, increase the accuracy of outlier detection while reducing computational complexity and execution time. Moreover, as the dimensionality increases, the accuracy of outlier detection on the mRMRD-based relevant subspace is higher than on the MRD-based relevant subspace. This verifies that the proposed mRMRD-based subspace selection algorithm can efficiently select the subspace by considering the relevance between features.
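To make the three stages concrete, the following is a minimal sketch of one way such a pipeline could look, not the authors' implementation. The abstract does not define the density estimate, the mutual information estimator, or the exact selection criterion, so everything here is an assumption: density is approximated by the mean distance to the k nearest neighbours, mutual information is estimated from equal-width histograms, and the redundancy-aware variant uses a greedy "relevance minus mean redundancy" score in the style of mRMR. All function names (`select_subspace`, `detect_outliers`, and so on) are hypothetical.

```python
import math
import random
from collections import Counter


def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn(points, k):
    """For each point, its k nearest neighbours as sorted (distance, index) pairs."""
    return [sorted((dist(p, q), j) for j, q in enumerate(points) if j != i)[:k]
            for i, p in enumerate(points)]


def discretize(values, bins=8):
    """Equal-width binning (an assumed estimator choice, not from the paper)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant column: everything lands in bin 0
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(a, b):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(c / n * math.log(c * n / (pa[x] * pb[y])) for (x, y), c in pab.items())


def select_subspace(points, m, k=5, use_redundancy=True):
    """Greedily pick m feature indices.

    use_redundancy=False: rank by relevance to density only (MRD-like).
    use_redundancy=True: also penalise mean redundancy with the features
    already chosen (mRMRD-like). Both scores are assumptions of this sketch.
    """
    cols = [discretize(col) for col in zip(*points)]
    # Assumed density proxy: mean distance to the k nearest neighbours
    # in the full space (a small value means a dense region).
    sparsity = discretize([sum(d for d, _ in nb) / len(nb) for nb in knn(points, k)])
    relevance = [mutual_information(col, sparsity) for col in cols]
    selected = []
    while len(selected) < min(m, len(cols)):
        def score(j):
            if not use_redundancy or not selected:
                return relevance[j]
            redundancy = sum(mutual_information(cols[j], cols[s]) for s in selected)
            return relevance[j] - redundancy / len(selected)
        candidates = [j for j in range(len(cols)) if j not in selected]
        selected.append(max(candidates, key=score))
    return selected


def lof(points, k=5):
    """Local Outlier Factor of every point (values well above 1 suggest outliers)."""
    nbrs = knn(points, k)
    k_dist = [nb[-1][0] for nb in nbrs]
    lrd = []  # local reachability density
    for nb in nbrs:
        reach = [max(k_dist[j], d) for d, j in nb]
        lrd.append(len(reach) / max(sum(reach), 1e-12))  # guard against duplicates
    return [sum(lrd[j] for _, j in nb) / (len(nb) * lrd[i]) for i, nb in enumerate(nbrs)]


def detect_outliers(points, m, k=5, use_redundancy=True):
    """Select a subspace, project the data onto it, and score points with LOF."""
    subspace = select_subspace(points, m, k, use_redundancy)
    projected = [[p[j] for j in subspace] for p in points]
    return subspace, lof(projected, k)


if __name__ == "__main__":
    random.seed(0)
    # Two clustered features plus two uniform-noise features (synthetic data).
    data = [[random.gauss(0, 1), random.gauss(0, 1), random.random(), random.random()]
            for _ in range(60)]
    data.append([8.0, 8.0, 0.5, 0.5])  # planted outlier
    subspace, scores = detect_outliers(data, m=2, k=5)
    print("selected features:", subspace)
    print("most outlying point:", scores.index(max(scores)))
```

The sketch uses brute-force neighbour search, which is quadratic in the number of points and only suitable for small examples. Setting `use_redundancy=False` gives the relevance-only ranking, which allows the two selection criteria to be compared on the same data.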
