An algorithmic approach to identification of gray areas: Analysis of sleep scoring expert ensemble non agreement areas using a multinomial mixture model

Gabriel Jouan, Erna Sif Arnardottir, Anna Sigridur Islind, María Óskarsdóttir

doi:10.1016/j.ejor.2023.09.039

Abstract

Machine learning (ML) models have become a key component of modern services. In decision-making domains where human expertise is crucial, for example the manual scoring of biological signal data, human uncertainties undermine experts’ trust in the outcomes of these models. The field of sleep staging in particular, which requires experts to score complex biological signals, is notably affected by scoring uncertainties. Data scored by an ensemble of independent scorers are collected to estimate inter-scorer agreement and the uncertainty associated with manual scoring. However, scorers’ uncertainty lacks statistical modeling, which makes ML algorithms difficult to validate and leads to issues of reliability and explainability. Within the ensemble of scorers, uncertainty zones, called gray areas, are revealed by the samples on which scorers disagree. The objective of our work is to provide a framework for introducing and inferring gray areas. We present a flexible and easy-to-use multi-objective method based on multinomial mixture models that clusters the different levels of scorer agreement and summarizes the results into two sets of clusters, high-agreement and gray area, which are called supra-clusters. The threshold is selected by maximizing the distance between the two distributions of the scorer agreement measure. The method produced effective results when fitted on simulated data. Additionally, the method was applied to a real case of uncertainty analysis in the sleep staging domain, using a series of actual sleep stages scored by an ensemble of 10 independent scorers for a dataset of 50 participants.
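To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It fits a multinomial mixture by expectation-maximization to per-epoch vote counts and then groups the resulting clusters into a gray area set and a high-agreement set. Several elements are assumptions made for illustration only: the function names, the simulated vote probabilities, the choice of three clusters, the agreement measure (share of scorers voting for the most popular stage), and the use of a difference in mean agreement as a stand-in for the distance between the two agreement distributions.

```python
import math
import random


def simulate_votes(stage_probs, n_epochs, n_scorers, rng):
    """Hypothetical simulator: each scorer independently picks a stage per epoch."""
    stages = range(len(stage_probs))
    rows = []
    for _ in range(n_epochs):
        votes = rng.choices(stages, weights=stage_probs, k=n_scorers)
        rows.append([votes.count(s) for s in stages])
    return rows


def fit_multinomial_mixture(counts, n_clusters, n_iter=100, seed=0):
    """EM for a mixture of multinomials over per-epoch vote counts."""
    rng = random.Random(seed)
    n_stages = len(counts[0])
    weights = [1.0 / n_clusters] * n_clusters
    probs = []
    for _ in range(n_clusters):
        raw = [rng.uniform(0.5, 1.5) for _ in range(n_stages)]
        total = sum(raw)
        probs.append([v / total for v in raw])
    resp = []
    for _ in range(n_iter):
        # E-step: responsibilities (the multinomial coefficient cancels out).
        resp = []
        for row in counts:
            logp = [math.log(w) + sum(c * math.log(p) for c, p in zip(row, pk))
                    for w, pk in zip(weights, probs)]
            top = max(logp)
            unnorm = [math.exp(v - top) for v in logp]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: update mixing weights and per-cluster stage probabilities.
        for k in range(n_clusters):
            mass = sum(r[k] for r in resp)
            weights[k] = max(mass / len(counts), 1e-12)
            raw = [sum(r[k] * row[s] for r, row in zip(resp, counts)) + 1e-6
                   for s in range(n_stages)]
            total = sum(raw)
            probs[k] = [v / total for v in raw]
    labels = [max(range(n_clusters), key=lambda k: r[k]) for r in resp]
    return weights, probs, labels


def split_supra_clusters(agreement, labels):
    """Return the cluster ids assigned to the gray area supra-cluster.

    Assumption: the cut maximizes the gap in mean agreement between the two
    groups, a simple stand-in for a distance between distributions.
    """
    by_cluster = {}
    for a, lab in zip(agreement, labels):
        by_cluster.setdefault(lab, []).append(a)
    order = sorted(by_cluster, key=lambda c: sum(by_cluster[c]) / len(by_cluster[c]))
    best_gap, gray = -math.inf, set()
    for cut in range(1, len(order)):
        low = [a for c in order[:cut] for a in by_cluster[c]]
        high = [a for c in order[cut:] for a in by_cluster[c]]
        gap = sum(high) / len(high) - sum(low) / len(low)
        if gap > best_gap:
            best_gap, gray = gap, set(order[:cut])
    return gray


# Simulated data (illustrative values): 10 scorers, 5 stages, 400 epochs.
rng = random.Random(42)
counts = (simulate_votes([0.96, 0.01, 0.01, 0.01, 0.01], 300, 10, rng)
          + simulate_votes([0.48, 0.48, 0.02, 0.01, 0.01], 100, 10, rng))
weights, probs, labels = fit_multinomial_mixture(counts, n_clusters=3)
agreement = [max(row) / sum(row) for row in counts]
gray = split_supra_clusters(agreement, labels)
print("gray area clusters:", sorted(gray))
```

In this sketch, epochs whose cluster falls in `gray` would be flagged as gray areas, and the remaining epochs as high agreement. A distribution-level distance could replace the mean gap in `split_supra_clusters` without changing the rest of the code.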

Full Text