Abstract
Feature selection methods face new challenges in large-scale classification tasks because the many categories are organized in a hierarchical structure. Hierarchical feature selection can take full advantage of the dependencies among hierarchically structured classes. However, most existing hierarchical feature selection methods are not robust to the inevitable data outliers, which results in a serious inter-level error propagation problem in the subsequent classification process. In this paper, we propose a robust hierarchical feature selection method with a capped ℓ2-norm (HFSCN), which can reduce the adverse effects of data outliers and learn relatively robust and discriminative feature subsets for the hierarchical classification process. First, a large-scale global classification task is split into several small local sub-classification tasks according to the hierarchical class structure and the divide-and-conquer strategy, which simplifies feature selection modeling. Second, a capped ℓ2-norm based loss function is used in the feature selection process of each local sub-classification task to eliminate the data outliers, which alleviates their negative effects and improves the robustness of the learned feature weight matrix. Finally, an inter-level relation constraint based on the similarity between the parent and child classes is added to the feature selection model, which enhances the discriminative ability of the feature subset selected for each sub-classification task with the learned robust feature weight matrix. Compared with seven traditional and state-of-the-art hierarchical feature selection methods, the superior performance of HFSCN is verified on 16 real and synthetic datasets.
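To illustrate why capping limits the influence of outliers, the following is a minimal sketch of a capped ℓ2-norm loss. It assumes the commonly used form in which each sample's residual norm is clipped at a threshold, min(‖x_i W − y_i‖₂, ε); the abstract does not give the exact formulation, so the function name `capped_l2_loss`, the threshold `eps`, and the toy data are hypothetical and are not taken from HFSCN itself.

```python
import math

def capped_l2_loss(X, W, Y, eps):
    """Sum over samples of min(||x_i W - y_i||_2, eps).

    Assumed form of a capped l2-norm loss (not confirmed by the abstract).
    X: list of feature rows, W: weight matrix as list of rows,
    Y: list of target rows, eps: cap on each sample's residual norm.
    """
    total = 0.0
    for x, y in zip(X, Y):
        # Prediction for one sample: row vector x times weight matrix W.
        pred = [sum(x[j] * W[j][k] for j in range(len(x)))
                for k in range(len(y))]
        # l2 norm of the residual for this sample.
        residual = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, y)))
        # Cap the residual so a single outlier contributes at most eps.
        total += min(residual, eps)
    return total

# Toy data (hypothetical): two well-fitted samples and one gross outlier.
X = [[1.0, 0.0], [0.0, 1.0], [10.0, 10.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
Y = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

capped = capped_l2_loss(X, W, Y, eps=2.0)
uncapped = capped_l2_loss(X, W, Y, eps=float("inf"))
print(capped, uncapped)
```

With the cap, the outlier contributes at most `eps` to the total, whereas without it the outlier's full residual dominates the loss.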