Fisher Consistent Cost Sensitive Boosting Algorithm

Ying Cao, Qi-Guang Miao, Lin Gao, Jia-Chen Liu

doi:10.3724/sp.j.1001.2013.04485

Abstract

AdaBoost is a meta ensemble learning algorithm. Its most important theoretical property is "Boosting", which also plays an important role in cost sensitive learning. However, existing cost sensitive Boosting algorithms, such as AdaCost, AdaC1, AdaC2, AdaC3, CSB0, CSB1 and CSB2, are merely heuristic. They insert cost parameters into the voting weight formula or the sample weight update rule of AdaBoost, so that the algorithm is forced to focus on samples with higher misclassification costs. These heuristic modifications have no theoretical foundation. Worse, they break the "Boosting" property itself: whereas AdaBoost converges to the optimal Bayes decision rule, these cost sensitive algorithms do not converge to the cost sensitive Bayes decision rule. This paper studies the design of cost sensitive Boosting algorithms strictly within Boosting theory. First, two new loss functions are constructed by making the exponential loss and the logit loss cost sensitive. It can be proved …

Foundation items: National Natural Science Foundation of China (61072109, 61272280, 41271447, 61272195); Program for New Century Excellent Talents in University (NCET-12-0919); Xi'an Science and Technology Bureau Project (CXY1341(6)); Fundamental Research Funds for the Central Universities (K5051203020, K5051203001, K5051303016, K5051303018, K505131 00006)
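To make the consistency requirement concrete, the following is a minimal sketch and not the paper's own construction. It assumes one common way of making the exponential loss cost sensitive, namely weighting exp(-y f) by a per-class cost. The names c_pos and c_neg (costs of misclassifying a positive and a negative sample) and all function names are hypothetical. Under that assumption, the pointwise minimizer of the expected loss has a closed form, and its sign can be compared with the cost sensitive Bayes decision.

```python
import math


def cost_sensitive_exp_risk(f, p, c_pos, c_neg):
    """Expected cost-weighted exponential loss at a point where P(y=+1 | x) = p.

    Assumed loss (illustrative only): c_pos * exp(-f) for y = +1,
    c_neg * exp(f) for y = -1.
    """
    return p * c_pos * math.exp(-f) + (1.0 - p) * c_neg * math.exp(f)


def risk_minimizer(p, c_pos, c_neg):
    """Closed-form minimizer, obtained by setting the derivative in f to zero."""
    return 0.5 * math.log((p * c_pos) / ((1.0 - p) * c_neg))


def cost_sensitive_bayes_label(p, c_pos, c_neg):
    """Predict +1 when the expected cost of missing a positive exceeds
    the expected cost of a false alarm, otherwise -1."""
    return 1 if p * c_pos > (1.0 - p) * c_neg else -1


def check_consistency(c_pos=5.0, c_neg=1.0, steps=99):
    """Return True if sign(f*) matches the cost sensitive Bayes label on a grid of p."""
    for i in range(1, steps + 1):
        p = i / (steps + 1)
        f_star = risk_minimizer(p, c_pos, c_neg)
        if f_star == 0.0:
            continue  # exactly on the decision boundary; either label is optimal
        label = 1 if f_star > 0 else -1
        if label != cost_sensitive_bayes_label(p, c_pos, c_neg):
            return False
    return True


if __name__ == "__main__":
    print(check_consistency())
```

With equal costs the minimizer reduces to half the log-odds, which is the familiar AdaBoost population minimizer. With unequal costs the decision threshold on p moves from 1/2 to c_neg / (c_pos + c_neg).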
