Cost-sensitive learning is a popular paradigm for addressing the class-imbalance learning (CIL) problem. Traditional cost-sensitive approaches typically address CIL by assigning every minority instance the same constant training-error penalty, set higher than that of majority instances, but they ignore the significance of location information. Several recent studies have therefore turned to personalized cost assignment, i.e., designating different costs for different instances based on their location information. These emerging personalized cost-sensitive approaches generally perform better than the traditional ones; however, their estimates of location information may be inaccurate, because they are prone to being affected by variation in data density. To address this problem, we propose a novel location information estimation and cost assignment strategy called RUE. Unlike previous approaches, the proposed strategy explores location information indirectly: through the error rate fed back from a random undersampling ensemble. The strategy is robust to the data distribution and helps to accurately estimate the significance of each instance regardless of the complexity of the data distribution. In the context of the Fuzzy Support Vector Machine (FSVM) and the Weighted Extreme Learning Machine (WELM), the proposed cost assignment strategy is compared with several popular and state-of-the-art approaches, and the results show its effectiveness and superiority.
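
The core idea, estimating each instance's significance from the error rate fed back by a random undersampling ensemble, can be illustrated with a minimal sketch. This is not the authors' implementation. The nearest-centroid base learner, the 0/1 label convention, the function names `rue_error_rates` and `assign_costs`, and in particular the mapping from error rate to cost are all assumptions made for illustration; the abstract does not specify them.

```python
import random


def centroid(points):
    """Mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def rue_error_rates(X, y, n_rounds=50, seed=0):
    """Per-instance error rate fed back from a random undersampling ensemble.

    Assumptions (not from the source): label 0 is the majority class,
    label 1 is the minority class, and each ensemble member is a
    nearest-centroid classifier.
    """
    rng = random.Random(seed)
    majority = [x for x, label in zip(X, y) if label == 0]
    minority = [x for x, label in zip(X, y) if label == 1]
    errors = [0] * len(X)
    for _ in range(n_rounds):
        # Undersample the majority class down to the minority class size.
        sampled = rng.sample(majority, len(minority))
        c0, c1 = centroid(sampled), centroid(minority)
        # Score every training instance with this ensemble member.
        for i, (x, label) in enumerate(zip(X, y)):
            predicted = 1 if sq_dist(x, c1) < sq_dist(x, c0) else 0
            if predicted != label:
                errors[i] += 1
    return [e / n_rounds for e in errors]


def assign_costs(error_rates, y, imbalance_ratio):
    """Hypothetical cost mapping (an assumption, not the paper's formula):
    frequently misclassified instances are treated as likely noise and
    down-weighted, and minority instances are scaled by the imbalance ratio.
    """
    return [
        (1.0 - rate) * (imbalance_ratio if label == 1 else 1.0)
        for rate, label in zip(error_rates, y)
    ]


# Toy data: six clean majority points, one majority point lying inside the
# minority region (index 6), and three minority points.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0, 0.5), (3, 3),
     (3, 2.5), (2.5, 3), (3.5, 3)]
y = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

rates = rue_error_rates(X, y)
costs = assign_costs(rates, y, imbalance_ratio=7 / 3)
```

On this toy data the majority point that sits inside the minority region is misclassified by every ensemble member, so it receives the highest error rate and, under the assumed mapping, the lowest cost. The resulting per-instance costs could then be passed as instance weights to a cost-sensitive learner such as an FSVM or WELM.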