Machine Learning Metrics Research Articles

Background: Methods to undertake diagnostic accuracy studies of administrative epilepsy data are challenged by lack of a way to reliably rank case-ascertainment algorithms in order of their accuracy. This is because it is difficult to know how to prioritise positive predictive value (PPV) and sensitivity (Sens). Large numbers of true negative (TN) instances frequently found in epilepsy studies make it difficult to discriminate algorithm accuracy on the basis of negative predictive value (NPV) and specificity (Spec) as these become inflated (usually >90%). This study demonstrates the complementary value of using weather forecasting or machine learning metrics critical success index (CSI) or F measure, respectively, as unitary metrics combining PPV and sensitivity. We reanalyse data published in a diagnostic accuracy study of administrative epilepsy mortality data in Scotland.

Method: CSI was calculated as 1/[(1/PPV) + (1/Sens) − 1]. F measure was calculated as (2 × PPV × Sens)/(PPV + Sens). CSI and F values range from 0 to 1, interpreted as 0 = inaccurate prediction and 1 = perfect accuracy. The published algorithms were reanalysed using these and their accuracy re-ranked according to CSI in order to allow comparison to the original rankings.

Results: CSI scores were conservative (range 0.02–0.826), always less than or equal to the lower of the corresponding PPV (range 39%–100%) and sensitivity (range 2%–93%). F values were less conservative (range 0.039–0.905), sometimes higher than either PPV or sensitivity, but were always higher than CSI. Low CSI and F values occurred when there was a large difference between PPV and sensitivity, e.g. CSI was 0.02 and F was 0.039 in an instance when PPV was 100% and sensitivity was 2%. Algorithms with both high PPV and sensitivity performed best in terms of CSI and F measure, e.g. CSI was 0.826 and F was 0.905 in an instance when PPV was 90% and sensitivity was 91%.

Conclusion: CSI or F measure can combine PPV and sensitivity values into a convenient single metric that is easier to interpret and rank in terms of diagnostic accuracy than trying to rank diagnostic accuracy according to the two measures themselves. CSI or F prioritise instances where both PPV and sensitivity are high over instances where there are large differences between PPV and sensitivity (even if one of these is very high), allowing diagnostic accuracy thresholds based on combined PPV and sensitivity to be determined. Therefore, CSI or F measures may be helpful complementary metrics to report alongside PPV and sensitivity in diagnostic accuracy studies of administrative epilepsy data.
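The two formulas in the abstract above can be checked with a short sketch. This is an illustration only, not the authors' code: the function names `csi_from_ppv_sens` and `f_measure` are made up here, PPV and sensitivity are assumed to be given as proportions between 0 and 1, and the zero-handling is an assumption added to avoid division by zero. The two input pairs are the examples quoted in the abstract.

```python
def csi_from_ppv_sens(ppv: float, sens: float) -> float:
    """Critical success index: 1 / (1/PPV + 1/Sens - 1).

    Returning 0.0 when either input is 0 is an assumption of this sketch.
    """
    if ppv <= 0.0 or sens <= 0.0:
        return 0.0
    return 1.0 / (1.0 / ppv + 1.0 / sens - 1.0)


def f_measure(ppv: float, sens: float) -> float:
    """F measure: 2 * PPV * Sens / (PPV + Sens)."""
    if ppv + sens == 0.0:
        return 0.0
    return 2.0 * ppv * sens / (ppv + sens)


# (PPV, Sens) pairs quoted in the abstract, expressed as proportions.
examples = [(1.00, 0.02), (0.90, 0.91)]

for ppv, sens in examples:
    csi = csi_from_ppv_sens(ppv, sens)
    f = f_measure(ppv, sens)
    print(f"PPV={ppv:.2f} Sens={sens:.2f} CSI={csi:.3f} F={f:.3f}")
```

With these inputs the sketch gives CSI and F values that round to the figures reported in the abstract (0.02 and 0.039 for the first pair, 0.826 and 0.905 for the second), and in both cases CSI does not exceed the lower of PPV and sensitivity.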

Equipment to assess muscle mass is not available in all health services. Yet we have limited understanding of whether applying the Global Leadership Initiative on Malnutrition (GLIM) criteria without an assessment of muscle mass affects the ability to predict adverse outcomes. This study used machine learning to determine which combinations of GLIM phenotypic and aetiologic criteria are most important for the prediction of 30-day mortality and unplanned admission using combinations including and excluding low muscle mass. In a cohort of 2801 participants from two cancer malnutrition point prevalence studies, we applied the GLIM criteria with and without muscle mass. Phenotypic criteria were assessed using ≥5% unintentional weight loss, body mass index, and subjective assessment of muscle stores from the PG-SGA. Aetiologic criteria included self-reported reduced food intake and inflammation (metastatic disease). Machine learning approaches were applied to predict 30-day mortality and unplanned admission using models with and without muscle mass. Participants with missing data were excluded, leaving 2494 for analysis [49.6% male, mean (SD) age: 62.3 (14.2) years]. Malnutrition prevalence was 19.5% and 17.5% when muscle mass was included and excluded, respectively. However, 48 (10%) of malnourished participants were missed if muscle mass was excluded. For the nine GLIM combinations that excluded low muscle mass, the most important combinations to predict mortality were (1) weight loss and inflammation and (2) weight loss and reduced food intake. Machine learning metrics were similar in models excluding or including muscle mass to predict mortality (average accuracy: 84% vs. 88%; average sensitivity: 41% vs. 38%; average specificity: 85% vs. 89%). Weight loss and reduced food intake was the most important combination to predict unplanned hospital admission.
Machine learning metrics were almost identical in models excluding or including muscle mass to predict unplanned hospital admission, with small differences observed only if reported to one decimal place (average accuracy: 77% vs. 77%; average sensitivity: 29% vs. 29%; average specificity: 84% vs. 84%). Our results indicate predictive ability is maintained, although the ability to identify all malnourished patients is compromised, when muscle mass is excluded from the GLIM diagnosis. This has important implications for assessment in health services where equipment to assess muscle mass is not available. Our findings support the robustness of the GLIM approach and an ability to apply some flexibility in excluding certain phenotypic or aetiologic components if necessary, although some cases will be missed.
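For readers unfamiliar with the three metrics reported above, the following sketch shows how accuracy, sensitivity and specificity are derived from a binary confusion matrix. The counts are hypothetical, chosen only for illustration, and are not taken from the study; the function name `binary_metrics` is likewise made up. The point it illustrates is that accuracy can stay high while sensitivity is low when the outcome is uncommon.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + fn + tn
    positives = tp + fn  # cases that truly had the outcome
    negatives = tn + fp  # cases that truly did not
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / positives if positives else 0.0,
        "specificity": tn / negatives if negatives else 0.0,
    }


# Hypothetical counts (NOT study data): 100 true events among 1100 cases.
metrics = binary_metrics(tp=40, fp=150, fn=60, tn=850)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

In this made-up example the model misses 60 of the 100 true events, yet accuracy remains around 81% because the 850 correctly classified negatives dominate the total.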

Articles published on Machine Learning Metrics

A review of model evaluation metrics for machine learning in genetics and genomics.

Birds, bats and beyond: evaluating generalization in bioacoustics models

Deciphering the environmental chemical basis of muscle quality decline by interpretable machine learning models

Whose work matters? A tool for identifying and developing more inclusive physics textbooks

Opportunities to Reduce the Risk of Cardiovascular Death by Improving Machine Learning Methods

Seismically Informed Reference Models Enhance AI‐Based Earthquake Prediction Systems

Robust mortality prediction on a recirculating aquaculture system.

A Novel Brain-Machine Safety System Based on Drowsiness Detection using the PNN and MLP Algorithms

Critical success index or F measure to validate the accuracy of administrative healthcare data identifying epilepsy in deceased adults in Scotland

Machine Learning for Credit Risk Prediction: A Systematic Literature Review

Automated Image Quality and Protocol Adherence Assessment of Examinations in Teledermatology: First Results.

Towards practical object detection for weed spraying in precision agriculture.

Active Object Learning for intelligent social robots

Generating Synthetic Dataset for ML-Based IDS Using CTGAN and Feature Selection to Protect Smart IoT Environments

Improving the quality of hospital sterilization process using failure modes and effects analysis, fuzzy logic, and machine learning: experience in tertiary dental centre.

A deep learning based sensor fusion method to diagnose tightening errors

An advanced ensemble modeling approach for predicting carbonate reservoir porosity from seismic attributes

Machine learning models to predict outcomes at 30-days using Global Leadership Initiative on Malnutrition combinations with and without muscle mass in people with cancer.

Network Intrusion Prediction using Machine Learning

Is attention all you need for intraday Forex trading?
