Abstract

Some real-world datasets are imbalanced: they contain many instances of the majority classes and only a few of the minority classes. Many applications, such as medical diagnosis and risk analysis, are interested in the under-represented class, but classifiers and prototype generation techniques are usually biased towards the majority classes. For this reason, classification with imbalanced datasets has become an important topic in Pattern Recognition. The Self-Generating Prototypes (SGP) algorithm has high reduction power and excellent performance on balanced datasets, but on imbalanced datasets the prototypes it generates do not represent the training dataset well: it generates many prototypes of the majority classes and only a few, or even none, of the minority classes. This paper proposes the Adaptive Self-Generating Prototypes (ASGP), an improvement of SGP2 (the second version of SGP) designed to handle imbalanced datasets. It also explains why SGP2 performs poorly on such datasets. Empirical results show that ASGP outperforms SGP2 on imbalanced datasets, especially in classification accuracy on the minority classes.
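
To illustrate why accuracy on the minority classes is reported separately, the following minimal sketch compares overall accuracy with per-class accuracy. It is not part of SGP, SGP2, or ASGP: the labels, the 95/5 class split, the always-predict-majority classifier, and the helper name `per_class_accuracy` are all assumptions made up for this illustration.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Return {class: fraction of that class's instances predicted correctly}."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Counter returns 0 for classes that were never predicted correctly.
    return {c: hits[c] / totals[c] for c in totals}


# Hypothetical imbalanced test set: 95 majority and 5 minority instances.
y_true = ["majority"] * 95 + ["minority"] * 5

# Hypothetical classifier fully biased towards the majority class.
y_pred = ["majority"] * 100

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
per_class = per_class_accuracy(y_true, y_pred)

print(f"overall accuracy: {overall:.2f}")
for label, acc in per_class.items():
    print(f"{label} accuracy: {acc:.2f}")
```

Under these assumptions the overall accuracy looks high even though no minority instance is classified correctly, which is why a single aggregate score can hide the failure the abstract describes.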

