Abstract

The goal of this paper is to develop a new algorithm for predicting whether a company will go bankrupt on the basis of imbalanced data. To this end, we treat classification as a multi-objective optimization problem and construct the prediction model as an ensemble that minimizes the false positive rate (FPR) and the false negative rate (FNR) simultaneously. To build the ensemble, the proposed Multi-Objective Classifier Selection (MOCS) algorithm selects only classifiers that belong to the Pareto-optimal set in FPR/FNR space, that is, classifiers none of which dominates another, and that satisfy some additional conditions. In the general case, MOCS is governed by three parameters: two threshold values that bound the false rates (FNR and FPR) and the crowding distance, which defines the uniqueness of a classifier's results. We tested the proposed algorithm on data collected from 2457 Russian companies, 456 of which went bankrupt, and 5910 Polish companies, 410 of which received bankruptcy status. The datasets contain features such as financial ratios and business environment factors. In testing, we used more than 70 combinations of sampling strategies (under-sampling, over-sampling, and no sampling) with static and dynamic classification models. The final ensembles include seven classifiers for the Russian dataset and four classifiers for the Polish dataset, combined by the soft voting rule. In both cases, the proposed algorithm significantly improves prediction results, both in terms of standard metrics (geometric mean, area under the ROC curve) and in the visual representation in FNR/FPR space, namely a shift away from the Pareto-optimal set of classifiers.
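The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `pareto_front` and `select_classifiers`, the threshold parameters `max_fpr` and `max_fnr`, and the example (FPR, FNR) values are all hypothetical, and the crowding-distance filter is omitted because the abstract does not specify how it is applied.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (FPR, FNR) of one candidate classifier


def pareto_front(points: List[Point]) -> List[int]:
    """Return indices of points that no other point dominates.

    Point j dominates point i if j is no worse on both FPR and FNR
    and strictly better on at least one of them (both are minimized).
    """
    front = []
    for i, (fpr_i, fnr_i) in enumerate(points):
        dominated = any(
            fpr_j <= fpr_i and fnr_j <= fnr_i and (fpr_j < fpr_i or fnr_j < fnr_i)
            for j, (fpr_j, fnr_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


def select_classifiers(points: List[Point], max_fpr: float, max_fnr: float) -> List[int]:
    """Keep Pareto-optimal classifiers whose FPR and FNR stay within the thresholds.

    Assumption: the thresholds are simple upper bounds; the crowding-distance
    condition mentioned in the abstract is not modeled here.
    """
    return [
        i for i in pareto_front(points)
        if points[i][0] <= max_fpr and points[i][1] <= max_fnr
    ]


# Hypothetical (FPR, FNR) pairs for five candidate classifiers.
candidates = [(0.10, 0.40), (0.20, 0.25), (0.30, 0.30), (0.05, 0.60), (0.40, 0.10)]

front = pareto_front(candidates)                                   # [0, 1, 3, 4]
selected = select_classifiers(candidates, max_fpr=0.35, max_fnr=0.50)  # [0, 1]
print(front, selected)
```

In this example the classifier at index 2 is dropped because index 1 is better on both rates, while indices 3 and 4 lie on the Pareto front but exceed one of the thresholds. The selected classifiers would then be combined, for instance by soft voting as in the abstract.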
