Machine learning prediction of susceptibility to visceral fat associated diseases

M Aldraimli, D Soria, J Parkinson, M V Dwek, E L Thomas, J D Bell, T J Chaussalet

doi:10.1007/s12553-020-00446-1

Abstract

Classifying subjects into risk categories is a common challenge in medical research. Machine Learning (ML) methods are widely used in the areas of risk prediction and classification. The primary objective of such algorithms is to use several features to predict dichotomous responses (e.g., healthy/at risk). Similar to statistical inference modelling, ML modelling is subject to the problem of class imbalance and is affected by the majority class, increasing the false-negative rate. In this study, we built and evaluated thirty-six ML models to classify approximately 4300 female and 4100 male participants from the UK Biobank into three categorical risk statuses based on discretised visceral adipose tissue (VAT) measurements from magnetic resonance imaging. We also examined the effect of sampling techniques on the models when dealing with class imbalance. The sampling techniques used had a significant impact on the classification and resulted in an improvement in risk status prediction by facilitating an increase in the information contained within each variable. Based on domain expert criteria, the best three classification models for visceral fat prediction in the female and male cohorts were identified. The Area Under the Receiver Operating Characteristic curve of the models tested (with external data) was 0.78 to 0.89 for females and 0.75 to 0.86 for males. These encouraging results will be used to guide further development of models to enable prediction of VAT value. This will be useful to identify individuals with excess VAT volume who are at risk of developing metabolic disease, ensuring that relevant lifestyle interventions can be appropriately targeted.

Highlights

Real-world data are often imbalanced and lack uniform distribution across classes
Across all the methods computed, resampling resulted in an improvement in the Correctly Classified Instances ratio (CCI) compared with the original Targeted dataset (TD)
When the Logistic Regression (LR), Artificial Neural Network (ANN), C4.5 and Random Forest (RF) models for the female cohort were evaluated, performance on the Random Under Sampling (RUS) dataset was poorer than on the TD dataset (Fig. 9)
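As a minimal illustration of the metric named in the highlights, the sketch below assumes that CCI denotes the fraction of correctly classified instances (overall accuracy); the source gives only the abbreviation. The function names and the toy labels are hypothetical and are not taken from the study. The example shows why CCI alone can look acceptable on imbalanced data while a minority class is missed entirely.

```python
from collections import Counter


def cci_ratio(y_true, y_pred):
    """Fraction of instances whose predicted class equals the true class.

    Assumption: CCI is read here as "correctly classified instances".
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one instance is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall_per_class(y_true, y_pred):
    """Per-class recall: correct predictions divided by true class size."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical toy data: a classifier that always predicts the majority class.
y_true = ["low"] * 8 + ["medium", "high"]
y_pred = ["low"] * 10

print(cci_ratio(y_true, y_pred))         # 0.8
print(recall_per_class(y_true, y_pred))  # {'low': 1.0, 'medium': 0.0, 'high': 0.0}
```

Here the CCI is 0.8 even though no "medium" or "high" subject is detected, which is the false-negative problem that motivates resampling.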

Summary

Introduction

Real-world data are often imbalanced and lack uniform distribution across classes. Classification of imbalanced datasets is a significant challenge across both industrial and research domains [1]. When resampling methods are applied, questions over their suitability are often raised [9]. For example: is the new resampled dataset representative of the population in relation to the response variable? Is it acceptable to artificially generate synthetic subjects for a class when training Machine Learning (ML) classification models? It has been argued that by using sampling methods, the original class ratio is lost during the training process and that this affects the accuracy metrics [10]. Training ML models with synthetic data may compromise accuracy measures by deceiving the cross-validation sampling process [11].
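One common way to address the cross-validation concern raised above is to resample only the training portion of each fold, so that every test fold keeps the original class ratio. The sketch below illustrates that idea with random under-sampling in plain Python. It is not the authors' pipeline: the function names, the toy class sizes, and the use of unstratified folds are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def random_under_sample(labels, rng):
    """Return positions that keep an equal number of items per class.

    Every class is reduced to the size of the smallest class present,
    sampling without replacement.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, n_min))
    return sorted(kept)


def k_fold_indices(n_items, k, rng):
    """Shuffle 0..n_items-1 and split into k nearly equal folds."""
    order = list(range(n_items))
    rng.shuffle(order)
    return [order[i::k] for i in range(k)]


def resampled_cv_splits(labels, k=5, seed=0):
    """Yield (train_idx, test_idx); only the training part is under-sampled."""
    rng = random.Random(seed)
    folds = k_fold_indices(len(labels), k, rng)
    for i, test_idx in enumerate(folds):
        # Training pool: every fold except the held-out one.
        train_pool = [j for f, fold in enumerate(folds) if f != i for j in fold]
        pool_labels = [labels[j] for j in train_pool]
        kept = random_under_sample(pool_labels, rng)
        train_idx = [train_pool[p] for p in kept]
        yield train_idx, sorted(test_idx)


# Hypothetical three-class toy labels with a strong imbalance.
labels = ["low"] * 80 + ["medium"] * 30 + ["high"] * 10
for train_idx, test_idx in resampled_cv_splits(labels, k=5, seed=0):
    train_counts = Counter(labels[i] for i in train_idx)
    test_counts = Counter(labels[i] for i in test_idx)
    print(dict(train_counts), dict(test_counts))
```

Each training set is balanced across the classes it contains, while each test fold is left untouched, so metrics computed on it reflect the original class distribution. The same structure would apply to oversampling or synthetic-data methods, with the resampling step swapped out.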

Objectives

Methods

Results

Discussion

Conclusion

Journal: Health and Technology
Publication Date: Jul 1, 2020
Citations: 9
License type: open-access
