BrightBox — A rough set based technology for diagnosing mistakes of machine learning models

Andrzej Janusz, Andżelika Zalewska, Łukasz Wawrowski, Piotr Biczyk, Jan Ludziejewski, Marek Sikora, Dominik Ślęzak

doi:10.1016/j.asoc.2023.110285

Abstract

The paper presents a novel approach to investigating the mistakes that machine learning models make in operation. This approach is the basis for BrightBox, a diagnostic technology for analyzing prediction models and identifying model- and data-related issues. The idea is to generate rough set-based surrogate models from data so that they approximate the decisions made by the monitored black-box models. These approximators are used to compute neighborhoods of the instances that undergo the diagnostic process; each neighborhood consists of historical instances that the rough set-based models processed in a similar way. The diagnostic process is then based on an analysis of the mistakes registered in such neighborhoods. Experiments performed on real-world data sets confirm that this analysis can provide efficient and valid insights into the reasons for the poor performance of machine learning models.
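The neighborhood idea described in the abstract can be illustrated with a small sketch. This is not the BrightBox implementation: it assumes, purely for illustration, that the surrogate is a collection of attribute subsets (standing in for rough set reducts) and that a historical instance belongs to the neighborhood when it is indiscernible from the diagnosed instance on a sufficient share of those subsets. The names `neighborhood`, `neighborhood_error_rate`, `min_share`, and the toy data are all hypothetical.

```python
from collections import Counter


def neighborhood(instance, history, attribute_subsets, min_share=0.5):
    """Indices of historical rows that match `instance` on at least
    `min_share` of the attribute subsets (assumed stand-ins for reducts)."""
    hits = Counter()
    for subset in attribute_subsets:
        key = tuple(instance[a] for a in subset)
        for i, row in enumerate(history):
            # Rows with identical values on `subset` are indiscernible
            # from the diagnosed instance with respect to that subset.
            if tuple(row[a] for a in subset) == key:
                hits[i] += 1
    needed = min_share * len(attribute_subsets)
    return sorted(i for i, n in hits.items() if n >= needed)


def neighborhood_error_rate(neighbors, y_true, y_pred):
    """Share of neighbors on which the monitored model was wrong."""
    if not neighbors:
        return None  # an empty neighborhood gives no diagnostic signal
    wrong = sum(1 for i in neighbors if y_true[i] != y_pred[i])
    return wrong / len(neighbors)


# Toy historical data with the black-box model's past predictions.
history = [
    {"color": "red", "size": "S", "shape": "round"},
    {"color": "red", "size": "S", "shape": "square"},
    {"color": "blue", "size": "L", "shape": "round"},
    {"color": "red", "size": "L", "shape": "round"},
]
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 1]
subsets = [("color", "size"), ("size", "shape")]
instance = {"color": "red", "size": "S", "shape": "round"}

neighbors = neighborhood(instance, history, subsets)
rate = neighborhood_error_rate(neighbors, y_true, y_pred)
```

A high error rate in the neighborhood would suggest that the monitored model has historically struggled on similar instances. Raising `min_share` makes the neighborhood stricter and smaller.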

Full Text