The Large Margin Distribution Machine (LDM) improves on the Support Vector Machine (SVM) by incorporating the margin distribution of the samples into the objective function, yielding excellent classification and generalization performance. However, most existing LDM-based models suffer from two inherent flaws: (1) they assume a uniform class distribution and therefore cannot effectively handle class imbalance; (2) they inherit the hinge loss from the SVM, which weakens their robustness to noise. To address these shortcomings, this paper constructs a novel K-means Triangular Synthesis (KTS) method for imbalanced data classification and introduces the Unified Pinball (UP) loss to ensure robust performance, resulting in a novel KTS large margin classifier with UP loss (KTS-UPLMC). Specifically, the KTS method generates samples by triangular synthesis according to cluster density, mitigating the noisy samples and strip-shaped sample distributions introduced by the traditional synthetic minority oversampling technique (SMOTE), and thereby alleviating the offset of the LDM's decision boundary. In addition, the UP loss accounts for quantile distance, improving the model's robustness to noise. Comparative experiments on artificial and benchmark datasets validate the effectiveness and noise resistance of the proposed KTS-UPLMC on imbalanced classification problems, and a parameter sensitivity analysis further substantiates its stability. In conclusion, the KTS method effectively addresses class imbalance, while the integration of the UP loss enhances noise resistance.