Abstract

Evaluating the degradation of predictive models over time has always been a difficult task, especially since new, unseen data may not fit the training distribution. This is a well-known problem in real-world use cases, where collecting a historical training set that covers all possible prediction labels may be very hard, too expensive, or entirely unfeasible. To address this issue, we present a new unsupervised approach to detect and evaluate the degradation of classification and prediction models, based on a scalable variant of the Silhouette index, named Descriptor Silhouette, specifically designed to advance current Big Data state-of-the-art solutions. The proposed strategy has been tested and validated on both synthetic and real-world industrial use cases. To this end, it has been integrated into a framework named SCALE and proved to be efficient and more effective in assessing the degradation of prediction performance than current state-of-the-art solutions.
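As a purely illustrative sketch of the general idea (an unsupervised degradation signal derived from cluster cohesion), the snippet below uses scikit-learn's standard silhouette score rather than the paper's Descriptor Silhouette, whose definition is not reproduced here. The function name, baseline comparison, and synthetic data are assumptions made for illustration and are not part of SCALE.

```python
# Minimal sketch (NOT the paper's Descriptor Silhouette): use a standard
# silhouette score on newly scored data as an unsupervised drift signal.
import numpy as np
from sklearn.metrics import silhouette_score


def silhouette_drop(new_features, predicted_labels, baseline_score):
    """Relative drop of the silhouette of a new batch (under the model's
    predicted labels) with respect to a training-time baseline; a large
    drop suggests the model may be degrading. Purely illustrative."""
    new_score = silhouette_score(new_features, predicted_labels)
    return (baseline_score - new_score) / max(abs(baseline_score), 1e-12)


# Illustrative usage with synthetic data.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 5))
train_labels = (train[:, 0] > 0).astype(int)          # stand-in training labels
baseline = silhouette_score(train, train_labels)       # training-time reference

new_batch = rng.normal(loc=0.5, size=(100, 5))         # shifted distribution
new_labels = (new_batch[:, 0] > 0).astype(int)         # stand-in model predictions
print("relative silhouette drop:", silhouette_drop(new_batch, new_labels, baseline))
```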
