Supervised Stratified Subsampling for Predictive Analytics

Ming-Chung Chang

doi:10.1080/10618600.2024.2304075

Journal: Journal of Computational and Graphical Statistics

Publication Date: Jan 9, 2024

#Supervised Learning Framework #Desirable Robustness

Abstract

Predictive analytics involves the use of statistical models to make predictions; however, the power of these techniques is hindered by ever-increasing quantities of data. The richness and sheer volume of big data can have a profound effect on computation time and/or numerical stability. In the current study, we develop a novel approach to subsampling with the aim of overcoming this issue when dealing with regression problems in a supervised learning framework. The proposed method integrates stratified sampling and is model-independent. We assess the theoretical underpinnings of the proposed subsampling scheme, and demonstrate its efficacy in yielding reliable predictions with desirable robustness when applied to different statistical models.
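The abstract does not spell out the subsampling scheme, so the following is only a generic illustration of what stratifying a subsample on the response can look like in a regression setting. It is not the author's algorithm. The function name `stratified_subsample`, the equal-count strata, and the fixed number of draws per stratum are all assumptions made for this sketch.

```python
import random


def stratified_subsample(y, n_strata, per_stratum, seed=0):
    """Return sorted indices of a subsample stratified on the response y.

    Hypothetical helper (not from the paper): strata are equal-count blocks
    of the response-sorted observations, and the same number of points is
    drawn uniformly without replacement from each block.
    """
    rng = random.Random(seed)
    # Indices of the observations, ordered by increasing response value.
    order = sorted(range(len(y)), key=lambda i: y[i])
    n = len(order)
    chosen = []
    for s in range(n_strata):
        # Stratum s is the s-th contiguous block of the sorted indices.
        lo = s * n // n_strata
        hi = (s + 1) * n // n_strata
        stratum = order[lo:hi]
        # Never request more points than the stratum holds.
        k = min(per_stratum, len(stratum))
        chosen.extend(rng.sample(stratum, k))
    return sorted(chosen)


# Toy data: 1000 responses from a skewed distribution (assumed for illustration).
data_rng = random.Random(1)
y = [data_rng.expovariate(1.0) for _ in range(1000)]

# Draw 5 observations from each of 10 response strata, 50 in total.
idx = stratified_subsample(y, n_strata=10, per_stratum=5)
```

Because the strata are defined by the response alone, the selected indices can be passed to any regression model, which is one plausible reading of "model-independent". Compared with simple random sampling, this forces the sparse upper tail of a skewed response to be represented in the subsample.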

Full Text