Abstract

Class imbalance is a common problem in many real-world machine learning applications and has been shown to significantly degrade classification performance. This is especially true in the context of big data, where large volumes of majority-class data dominate the training process and bias learning algorithms. Among the various methods for treating class imbalance, output thresholding improves classification performance by tuning the decision threshold used to assign class labels to class probabilities. While thresholding techniques have been successful, systematic studies within big and imbalanced data applications are limited. In this study, we compare four popular thresholding strategies on two big and imbalanced fraud classification data sets. We focus specifically on tree-based ensemble learners, employing four popular bagging and boosting ensembles that are well known for achieving state-of-the-art performance. Overall classification performance is measured using the Geometric Mean (G-Mean) and F-Measure metrics, and class-wise performance tradeoffs are compared using the true positive rate (TPR) and true negative rate (TNR). The average threshold values of each strategy are compared, and statistical tests illustrate the importance of careful threshold tuning. Results show that the G-Mean and F-Measure metrics can be misleading, and that careful validation of TPR and TNR is necessary for selecting optimal thresholds. Furthermore, we show how small changes to decision thresholds yield significant changes in classification performance. Our comparison of popular thresholding techniques on big and highly imbalanced data makes this a unique contribution in the area of output thresholding with ensemble learners and big data.
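
To illustrate the basic idea of output thresholding described above, the sketch below trains a tree-based ensemble, scores a validation set, and sweeps candidate decision thresholds to maximize the G-Mean (the geometric mean of TPR and TNR). This is a minimal illustration assuming scikit-learn's RandomForestClassifier and a synthetic imbalanced data set; the actual learners, data sets, and thresholding strategies evaluated in the study are not reproduced here.

```python
# Minimal sketch of output thresholding with a tree-based ensemble.
# Assumptions: scikit-learn is available; the data set, model choice, and
# threshold grid are illustrative placeholders, not the study's exact setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Imbalanced synthetic data standing in for a fraud classification set.
X, y = make_classification(n_samples=20_000, weights=[0.99, 0.01], random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, stratify=y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
probs = model.predict_proba(X_val)[:, 1]  # positive-class probabilities

def g_mean(y_true, y_pred):
    """Return (G-Mean, TPR, TNR) for binary predictions."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    tpr = tp / (tp + fn)  # true positive rate
    tnr = tn / (tn + fp)  # true negative rate
    return np.sqrt(tpr * tnr), tpr, tnr

# Sweep candidate decision thresholds and keep the one maximizing G-Mean
# on the validation set, instead of using the default threshold of 0.5.
best = max(
    ((t,) + g_mean(y_val, (probs >= t).astype(int)) for t in np.linspace(0.01, 0.99, 99)),
    key=lambda r: r[1],
)
print(f"threshold={best[0]:.2f}  G-Mean={best[1]:.3f}  TPR={best[2]:.3f}  TNR={best[3]:.3f}")
```

As the study emphasizes, the threshold selected by a single aggregate metric such as G-Mean or F-Measure should still be validated against the resulting TPR/TNR tradeoff, since small shifts in the threshold can change class-wise performance substantially.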
