High dimensionality hinders the learning ability of machine learning algorithms. Feature selection is an important step in processing high-dimensional data: it reduces dimensionality by removing irrelevant and redundant information, which can improve learning models, reduce computation time, and improve learning accuracy. In this paper, a novel filter method for feature selection in mixed-attribute datasets is proposed. The independent attributes are mixed, or heterogeneous, in the sense that numerical and categorical attribute types may appear together in the same dataset. Based on the preordonnances theory, we introduce a new concept to quantify the relevance and redundancy of features even when the data are heterogeneous (mixed-type). The technique for order preference by similarity to the ideal solution (TOPSIS), a well-known multicriteria decision-making method, is used as a weighting and informative feature selection filter. To assess the effectiveness of the proposed method, several experiments on both simulated and real data are performed, including a comparison with other well-known filter methods. The experimental results show that, in most cases, the proposed method yields competitive results relative to these methods.
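To make the ranking step concrete, here is a minimal sketch of the generic technique for order preference by similarity to the ideal solution (TOPSIS) applied to features. It is not the paper's filter. The two criteria (a relevance score to maximize and a redundancy score to minimize), the toy values, the equal weights, and the function name `topsis_scores` are all assumptions made for illustration. The preordonnance-based measures the paper uses to produce such scores are not reproduced here.

```python
import math


def topsis_scores(matrix, weights, benefit):
    """Generic TOPSIS closeness scores (illustrative, not the paper's filter).

    matrix  : one row per feature, one column per criterion (hypothetical scores)
    weights : one non-negative weight per criterion
    benefit : True if the criterion should be maximized, False if minimized
    """
    n_criteria = len(weights)
    total = sum(weights)
    w = [x / total for x in weights]  # normalize weights to sum to 1

    # Vector-normalize each criterion column, then apply the weights.
    norms = []
    for j in range(n_criteria):
        norm = math.sqrt(sum(row[j] ** 2 for row in matrix))
        norms.append(norm if norm > 0 else 1.0)  # guard all-zero columns
    v = [[w[j] * row[j] / norms[j] for j in range(n_criteria)] for row in matrix]

    # Ideal and anti-ideal points, taken per criterion.
    ideal, anti = [], []
    for j in range(n_criteria):
        col = [row[j] for row in v]
        best, worst = (max(col), min(col)) if benefit[j] else (min(col), max(col))
        ideal.append(best)
        anti.append(worst)

    # Relative closeness: 1 at the ideal point, 0 at the anti-ideal point.
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        denom = d_pos + d_neg
        scores.append(d_neg / denom if denom > 0 else 0.5)
    return scores


# Toy input (assumed values): columns are [relevance, redundancy].
criteria = [
    [0.9, 0.1],  # feature 0: highly relevant, barely redundant
    [0.5, 0.5],  # feature 1: middling on both
    [0.1, 0.9],  # feature 2: weakly relevant, highly redundant
]
scores = topsis_scores(criteria, weights=[1.0, 1.0], benefit=[True, False])

# Rank features from most to least preferred.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In this toy setup, feature 0 coincides with the ideal point and feature 2 with the anti-ideal point, so they fall at the two ends of the ranking. A filter built this way would keep the top-ranked features up to some chosen cutoff.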