Methods: Feature selection is essential for building effective machine learning models in binary classification. Eliminating unnecessary features can reduce the risk of overfitting and improve classification performance. Moreover, the data we handle typically contains a stochastic component, making it important to develop robust models that are insensitive to data perturbations. Although there are numerous methods and tools for feature selection, relatively few studies address embedded feature selection within robust classification models using penalization techniques.
Objective: In this work, we introduce robust classifiers with integrated feature selection capabilities, utilizing probability machines based on different penalization techniques, such as the ℓ1-norm or the elastic-net, combined with a novel Direct Feature Elimination process to improve model resilience and efficiency.
Findings: Numerical experiments on standard datasets demonstrate the effectiveness and robustness of the proposed models in classification tasks even when using a reduced number of features. These experiments were evaluated using original performance indicators, highlighting the models’ ability to maintain high performance with fewer features.
Novelty: The study discusses the trade-offs involved in combining different penalties to select the most relevant features while minimizing empirical risk. In particular, the integration of elastic-net and ℓ1-norm penalties within a unified framework, combined with the original Direct Feature Elimination approach, presents a novel method for improving both model accuracy and robustness.
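As a generic illustration of why the ℓ1-norm and elastic-net penalties named in the abstract can act as embedded feature selectors, the sketch below applies one proximal (shrinkage) step to a weight vector. It is not the paper's probability machine or its Direct Feature Elimination process, neither of which is specified in the abstract. The penalty form `lam * (l1_ratio * ||w||_1 + (1 - l1_ratio) / 2 * ||w||_2^2)`, the function names, and the example weights are all assumptions chosen for illustration.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |x|: shrink v toward zero by t, clipping at zero."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def elastic_net_prox(w, step, lam, l1_ratio):
    """One proximal step for the assumed penalty
    lam * (l1_ratio * ||w||_1 + (1 - l1_ratio) / 2 * ||w||_2^2).
    The l1 part zeroes small weights; the l2 part rescales the survivors."""
    t = step * lam * l1_ratio
    scale = 1.0 + step * lam * (1.0 - l1_ratio)
    return [soft_threshold(wj, t) / scale for wj in w]


def selected_features(w, tol=0.0):
    """Indices of features whose weight is nonzero after shrinkage."""
    return [j for j, wj in enumerate(w) if abs(wj) > tol]


# Hypothetical weights: two strong features (0 and 3), two weak ones (1 and 2).
w = [0.8, -0.03, 0.02, -1.2]
w_l1 = elastic_net_prox(w, step=1.0, lam=0.1, l1_ratio=1.0)  # pure l1 penalty
w_en = elastic_net_prox(w, step=1.0, lam=0.1, l1_ratio=0.5)  # elastic-net mix

print(selected_features(w_l1))
print(selected_features(w_en))
```

In this toy case both penalties set the two weak weights exactly to zero and keep features 0 and 3. The elastic-net variant uses a smaller threshold and additionally rescales the surviving weights, which is one way to see the trade-off between sparsity and shrinkage that combining penalties involves.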