Abstract

Imbalanced data is a challenge for classification models because it reduces the overall performance of traditional learning algorithms. Moreover, the minority class of an imbalanced dataset is misclassified at a high rate, even though it is the class of primary interest in the classification task. In this paper, a new model called the Lasso-Logistic ensemble is proposed to deal with imbalanced data by using two popular techniques, random over-sampling and random under-sampling. The model was applied to two real imbalanced credit datasets. The results show that the Lasso-Logistic ensemble model offers better performance than single traditional methods such as random over-sampling, random under-sampling, the Synthetic Minority Oversampling Technique (SMOTE), and cost-sensitive learning.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium provided the original work is properly cited.
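To make the two resampling techniques named in the abstract concrete, the following is a minimal sketch of random over-sampling and random under-sampling for a binary problem. It is not the paper's implementation: the abstract does not describe how the Lasso-Logistic ensemble combines the two, so only the resampling step is illustrated. The function names, the toy data, and the choice to balance the classes exactly are assumptions made for illustration.

```python
import random
from collections import Counter


def split_indices(labels, minority):
    """Return (minority_indices, majority_indices) for a binary label list."""
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    return minority_idx, majority_idx


def random_over_sample(rows, labels, minority, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size.

    Assumption: the target is an exact 1:1 balance.
    """
    rng = random.Random(seed)
    minority_idx, majority_idx = split_indices(labels, minority)
    n_extra = len(majority_idx) - len(minority_idx)
    # Sampling with replacement from the minority class.
    extra = [rng.choice(minority_idx) for _ in range(n_extra)]
    keep = majority_idx + minority_idx + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


def random_under_sample(rows, labels, minority, seed=0):
    """Keep a random subset of majority rows equal in size to the minority class.

    Assumption: the minority class is no larger than the majority class.
    """
    rng = random.Random(seed)
    minority_idx, majority_idx = split_indices(labels, minority)
    # Sampling without replacement from the majority class.
    kept_majority = rng.sample(majority_idx, len(minority_idx))
    keep = kept_majority + minority_idx
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Hypothetical toy data: 8 majority rows (label 0) and 2 minority rows (label 1).
toy_rows = [[float(i)] for i in range(10)]
toy_labels = [0] * 8 + [1] * 2

over_rows, over_labels = random_over_sample(toy_rows, toy_labels, minority=1)
under_rows, under_labels = random_under_sample(toy_rows, toy_labels, minority=1)

over_counts = Counter(over_labels)
under_counts = Counter(under_labels)
```

On this toy data, over-sampling yields 8 rows per class and under-sampling yields 2 rows per class. Over-sampling keeps every original row at the cost of duplicated minority rows, while under-sampling discards majority rows, which is one reason the two are often considered together.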
