Abstract

Classifying high-dimensional data remains a significant challenge for machine learning algorithms, and feature selection is an indispensable preprocessing step for such tasks. In the era of big data, a task may involve hundreds of class labels, and a hierarchical structure over the classes is often available. This structure can help both feature selection and classifier training, yet most current techniques ignore it. In this paper, we design a feature selection strategy for hierarchical classification based on fuzzy rough sets. First, we develop a fuzzy rough set model for hierarchical structures that computes the lower and upper approximations of classes organized in a class hierarchy; its use of the hierarchical class structure distinguishes it from existing techniques. We then define a hierarchical feature selection problem based on this model. The new model is more practical than existing feature selection approaches, as many real-world tasks are naturally cast in terms of hierarchical classification. We propose a feature selection algorithm based on sibling nodes and show that it is more efficient and more versatile than flat feature selection. Compared with the flat feature selection algorithm, the computational load of the proposed algorithm is reduced from 98.0% to 6.5%, while the classification performance is improved on the SAIAPR dataset. The related experiments also demonstrate the effectiveness of the hierarchical algorithm.
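
To make the sibling-node idea concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a Gaussian-kernel fuzzy similarity relation, a standard fuzzy rough lower approximation (the infimum of 1 - R(x, y) over negative samples y), and that "sibling-based" means the negative samples of x are restricted to samples whose class shares a parent with the class of x. The names `fuzzy_similarity`, `sibling_dependency`, `parent`, and `sigma` are hypothetical and chosen only for illustration.

```python
import numpy as np

def fuzzy_similarity(X, sigma=0.5):
    """Gaussian-kernel fuzzy similarity between all pairs of rows (assumed relation)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def sibling_dependency(X, y, parent, features, sigma=0.5):
    """Mean lower-approximation membership of each sample to its own class,
    using only samples from sibling classes as negatives (assumed strategy)."""
    R = fuzzy_similarity(X[:, features], sigma)
    lower = np.ones(len(y))  # a sample with no sibling negatives keeps membership 1
    for i, label in enumerate(y):
        # Negatives: samples of a different class that shares the same parent node.
        neg = np.array([c != label and parent[c] == parent[label] for c in y])
        if neg.any():
            lower[i] = (1.0 - R[i, neg]).min()
    return lower.mean()

# Toy data: feature 0 separates the two sibling classes, feature 1 is constant.
X = np.array([[0.0, 0.5], [0.1, 0.5], [1.0, 0.5], [0.9, 0.5]])
y = ["cat", "cat", "dog", "dog"]
parent = {"cat": "animal", "dog": "animal"}  # hypothetical two-leaf hierarchy

dep_informative = sibling_dependency(X, y, parent, [0])
dep_constant = sibling_dependency(X, y, parent, [1])
print(dep_informative > dep_constant)
```

Under these assumptions, a feature that separates sibling classes yields a higher dependency than one that does not, so a greedy selector could rank features by this score. Restricting the negatives to siblings also shrinks the set of sample pairs compared per class, which is one plausible source of the reduced computational load the abstract reports.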
