Measuring the prediction difficulty of individual cases in a dataset using machine learning

Hyunjin Kwon, Matthew Greenberg, Joon Lee, Colin Bruce Josephson

doi:10.1038/s41598-024-61284-z

Abstract

Prediction difficulty varies from case to case, and this variation is one of the key factors that researchers encounter when applying machine learning to data. Although previous studies have introduced various metrics for assessing the prediction difficulty of individual cases, these metrics require specific dataset preconditions. In this paper, we propose three novel metrics for measuring the prediction difficulty of individual cases using fully-connected feedforward neural networks. The first metric is based on the complexity of the neural network needed to make a correct prediction. The second metric employs a pair of neural networks: one makes a prediction for a given case, and the other predicts whether the prediction made by the first model is likely to be correct. The third metric assesses the variability of the neural network’s predictions. We investigated these metrics using a variety of datasets, visualized their values, and compared them to fifteen existing metrics from the literature. The results demonstrate that the proposed case difficulty metrics differentiated levels of difficulty better than most of the existing metrics and were consistently effective across diverse datasets. We expect our metrics to provide researchers with a new perspective on understanding their datasets and applying machine learning in various fields.
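To make the idea behind the third metric concrete, the following is a minimal sketch, not the authors' formula. It assumes that a network has already been run several times on the same cases (for example, with different random seeds) and that each run reports the probability assigned to the true class of each case. The function name `prediction_variability`, the input layout, and the choice of population standard deviation as the spread measure are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def prediction_variability(prob_runs):
    """Summarize how much repeated predictions disagree for each case.

    prob_runs[r][i] is the probability that run r assigns to the true
    class of case i (hypothetical input layout, assumed for this sketch).
    Returns one dict per case with the mean probability and its spread.
    """
    n_cases = len(prob_runs[0])
    scores = []
    for i in range(n_cases):
        probs = [run[i] for run in prob_runs]
        scores.append({"mean": mean(probs), "spread": pstdev(probs)})
    return scores


# Hypothetical outputs of three repeated runs on three cases.
runs = [
    [0.95, 0.40, 0.10],
    [0.97, 0.70, 0.12],
    [0.96, 0.55, 0.08],
]

scores = prediction_variability(runs)

# Index of the case whose predictions vary the most across runs.
hardest_by_spread = max(range(len(scores)), key=lambda i: scores[i]["spread"])
```

In this toy input, the second case has the largest spread, while the third case is predicted wrongly but consistently. That contrast suggests why a variability-based score captures something different from a plain error rate, although how the paper combines these quantities is not stated in the abstract.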

Journal: Scientific Reports
Publication Date: May 7, 2024
License type: CC BY 4.0

Similar Papers

Machine learning in pain research.
Jörn Lötsch ... Alfred Ultsch
Pain | VOL. 159
24 Nov 2017

Developing Feedforward Neural Networks as Benchmark for Load Forecasting: Methodology Presentation and Application to Hospital Heat Load Forecasting
Malte Stienecker ... Anne Hagemeier
Energies | VOL. 16
18 Feb 2023

HadoopCL2: Motivating the Design of a Distributed, Heterogeneous Programming System With Machine-Learning Applications
Max Grossman ... Vivek Sarkar
IEEE Transactions on Parallel and Distributed Systems | VOL. 27
01 Mar 2016

Tool Support for Improving Software Quality in Machine Learning Programs
Kwok Sun Cheng ... Pei-Chi Huang
Information | VOL. 14
16 Jan 2023
