Abstract

Like binary classification methods, one-class classification methods can benefit from feature selection. However, feature selection algorithms designed for binary or multi-class classification are not applicable to one-class classification, since instances of only one class are available. Few techniques have been proposed so far for feature selection in one-class classification. This paper focuses on designing a filter-based feature selection method for one-class classification. Our approach is based on the observation that, for tasks such as outlier detection and anomaly detection, the training data (normal data) may contain multiple sub-concepts. These sub-concepts are a source of data complexity. Our approach searches for the features that make the instances of each sub-concept more compact, so as to reduce the data complexity. It first finds the sub-concepts using a clustering algorithm with a fixed number of clusters and then applies combined feature measures to evaluate the relevance between each feature and the sub-concepts. A fixed number of features, those with the highest relevance scores, are selected as a feature subset. During the search, the Davies–Bouldin Index (DBI) is used to assess the data complexity of the sub-concepts obtained with different numbers of clusters. The feature subset with the lowest DBI is selected as the final feature subset. Experiments on UCI benchmark and cyber security datasets demonstrate that our feature selection algorithm can select relevant features and improve the performance of one-class classification on multimodal data.
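
The search loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not name the clustering algorithm or the combined feature measures, so k-means and an ANOVA F-score (scikit-learn's `KMeans` and `f_classif`) are used here as stand-ins, and computing the Davies–Bouldin Index in the reduced feature space is likewise an assumption. The function name `select_features` and its parameters are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_selection import f_classif
from sklearn.metrics import davies_bouldin_score


def select_features(X, n_select, k_values, random_state=0):
    """Return (feature_indices, k, dbi) for the candidate subset with the lowest DBI."""
    best = None
    for k in k_values:
        # Step 1: find k sub-concepts in the one-class (normal) training data.
        # Assumption: k-means stands in for the unspecified clustering algorithm.
        km = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = km.fit_predict(X)
        # Step 2: score each feature against the sub-concept labels.
        # Assumption: the ANOVA F-score replaces the paper's combined measures.
        scores, _ = f_classif(X, labels)
        # Step 3: keep the n_select highest-scoring features.
        subset = np.sort(np.argsort(scores)[::-1][:n_select])
        # Step 4: assess the sub-concepts in the reduced feature space.
        # A lower DBI means more compact, better separated sub-concepts.
        dbi = davies_bouldin_score(X[:, subset], labels)
        if best is None or dbi < best[2]:
            best = (subset, k, dbi)
    return best


# Synthetic one-class data: three sub-concepts that differ only in
# features 0 and 1, plus four pure-noise features.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
informative = np.vstack([c + rng.normal(size=(100, 2)) for c in centers])
noise = rng.normal(size=(300, 4))
X = np.hstack([informative, noise])

features, k, dbi = select_features(X, n_select=2, k_values=range(2, 6))
print("selected features:", features, "clusters:", k, "DBI:", round(dbi, 3))
```

On this toy data the two informative features separate the sub-concepts, so they should receive the highest relevance scores; the candidate cluster numbers and subset size are arbitrary choices made for the illustration.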
