Abstract

Feature selection is an active challenge in multi-label classification, and many methods have been proposed in recent years. However, existing approaches assume that all features have the same cost. This assumption may be inappropriate when acquiring feature values is costly. For example, in medical diagnosis, each diagnostic value obtained from a clinical test has its own cost. In such cases, it may be preferable to choose a model with acceptable classification performance but a much lower cost. We propose a novel method that incorporates feature cost information into the learning process. The method, named Cost-Sensitive Classifier Chains, combines classifier chains and penalized logistic regression with a modified elastic-net penalty that takes the costs of the features into account. We prove the stability of our algorithm and provide a bound on its generalization error. We also propose an adaptive version in which the penalty factors change as the consecutive models in the chain are fitted. The methods are applied to real datasets, MIMIC-II and Hepatitis, for which the cost information is provided by experts. Moreover, we propose an experimental framework in which the features are observed with measurement errors and the costs depend on the quality of the features. The framework makes it possible to compare cost-sensitive methods on benchmark datasets for which cost information is not provided. The proposed method can be recommended when one wants to balance low costs against high prediction performance.
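
To make the idea of a cost-aware penalty inside a classifier chain concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not state the exact form of the modified penalty, so the sketch assumes one plausible choice: lam * sum_j c_j * (alpha * |b_j| + 0.5 * (1 - alpha) * b_j^2), where c_j is the cost of feature j. It also assumes that earlier labels enter later models in the chain at zero cost. All function names, parameter values, and the synthetic data are hypothetical, and the adaptive version is not covered.

```python
import numpy as np

def fit_cost_penalized_logreg(X, y, costs, lam=0.05, alpha=0.5, n_iter=500):
    """Proximal-gradient fit of logistic regression with a cost-weighted elastic net.

    Assumed penalty (not taken from the paper):
        lam * sum_j costs[j] * (alpha * |b_j| + 0.5 * (1 - alpha) * b_j**2)
    """
    n, p = X.shape
    beta, intercept = np.zeros(p), 0.0
    # Step size from a Lipschitz bound on the smooth part of the objective
    # (logistic loss plus the ridge part of the penalty).
    lipschitz = 0.25 * (np.linalg.norm(X, 2) ** 2 / n + 1.0)
    lipschitz += lam * (1.0 - alpha) * costs.max()
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        resid = 1.0 / (1.0 + np.exp(-(X @ beta + intercept))) - y
        grad = X.T @ resid / n + lam * (1.0 - alpha) * costs * beta
        z = beta - step * grad
        # Soft-thresholding: costly features get a larger threshold, so they
        # are more likely to be dropped from the model.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam * alpha * costs, 0.0)
        intercept -= step * resid.mean()
    return beta, intercept

def fit_chain(X, Y, costs, **kwargs):
    """Fit one model per label; earlier labels are appended as extra inputs."""
    models = []
    for k in range(Y.shape[1]):
        Z = np.hstack([X, Y[:, :k]])
        # Assumption: label inputs are free, so they carry zero cost.
        c = np.concatenate([costs, np.zeros(k)])
        models.append(fit_cost_penalized_logreg(Z, Y[:, k], c, **kwargs))
    return models

def predict_chain(models, X):
    """Predict labels in chain order, feeding earlier predictions forward."""
    preds = np.zeros((X.shape[0], len(models)))
    for k, (beta, intercept) in enumerate(models):
        Z = np.hstack([X, preds[:, :k]])
        preds[:, k] = (Z @ beta + intercept > 0).astype(float)
    return preds

# Synthetic illustration: feature 2 is uninformative and expensive.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = np.column_stack([X[:, 0] > 0, X[:, 0] + X[:, 1] > 0]).astype(float)
costs = np.array([1.0, 1.0, 50.0])  # hypothetical costs

models = fit_chain(X, Y, costs)
# A feature counts as "used" if any model in the chain gives it a nonzero weight.
used = np.any([np.abs(beta[:3]) > 0 for beta, _ in models], axis=0)
print("features used:", used, "total cost:", costs[used].sum())
print("training accuracy per label:", (predict_chain(models, X) == Y).mean(axis=0))
```

In this sketch the trade-off between cost and accuracy is controlled by `lam`: larger values push expensive features out of the model first, which mirrors the balance between low cost and prediction performance that the abstract describes.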
