Comparison of machine learning techniques to predict all-cause mortality using fitness data: the Henry Ford exercIse testing (FIT) project

Sherif Sakr, Steven J Keteyian, Radwa Elshawi, Amjad M Ahmed, Clinton A Brawner, Michael J Blaha, Waqas T Qureshi, Mouaz H Al-Mallah

doi:10.1186/s12911-017-0566-6

Abstract

Background: Prior studies have demonstrated that cardiorespiratory fitness (CRF) is a strong marker of cardiovascular health. Machine learning (ML) can enhance the prediction of outcomes through classification techniques that assign the data to predetermined categories. The aim of this study is to evaluate and compare how machine learning techniques can be applied to medical records of cardiorespiratory fitness and how the various techniques differ in their ability to predict medical outcomes (e.g. mortality).

Methods: We use data from 34,212 patients free of known coronary artery disease or heart failure who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 10-year follow-up. Seven machine learning classification techniques were evaluated: Decision Tree (DT), Support Vector Machine (SVM), Artificial Neural Networks (ANN), Naïve Bayesian Classifier (BC), Bayesian Network (BN), K-Nearest Neighbor (KNN) and Random Forest (RF). To handle the imbalanced dataset, the Synthetic Minority Over-Sampling Technique (SMOTE) was used.

Results: Two sets of experiments were conducted, with and without the SMOTE sampling technique. Averaged over the different evaluation metrics, the SVM classifier showed the lowest performance, while other models such as BN, BC and DT performed better. The RF classifier showed the best performance (AUC = 0.97) among all models trained using SMOTE sampling.

Conclusions: The results show that ML techniques can vary significantly in their performance across the different evaluation metrics. A more complex ML model does not necessarily achieve higher prediction accuracy. The prediction performance of all models trained with SMOTE is much better than that of models trained without SMOTE. The study shows the potential of machine learning methods for predicting all-cause mortality using cardiorespiratory fitness data.
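The abstract names SMOTE as the way the class imbalance was handled but does not describe it. The following is a minimal sketch of the core idea, under stated assumptions: synthetic minority-class samples are created by interpolating between a minority sample and one of its k nearest minority-class neighbours. The function name `smote`, its parameters, and the toy feature values are hypothetical and chosen for illustration; this is not the authors' implementation, and it picks base samples at random rather than cycling through every minority sample.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic new points, each placed on the line segment between
    a minority-class point and one of its k nearest minority-class neighbours.

    Illustrative sketch only; names and defaults are assumptions.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    # For each minority point, store the indices of its k nearest
    # minority neighbours (Euclidean distance, excluding the point itself).
    neighbours = []
    for i, x in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))   # random base sample
        j = rng.choice(neighbours[i])      # one of its nearest neighbours
        gap = rng.random()                 # position along the segment, in [0, 1)
        base, nb = minority[i], minority[j]
        synthetic.append([a + gap * (b - a) for a, b in zip(base, nb)])
    return synthetic


# Made-up two-feature minority-class records, for illustration only.
minority = [[61.0, 6.5], [70.0, 5.0], [66.0, 7.2], [74.0, 4.1]]
new_points = smote(minority, n_synthetic=6, k=2)
print(len(new_points))
```

Oversampling of this kind should be applied to the training portion only; synthetic points in a held-out test set would make the reported metrics hard to interpret.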

Highlights

Prior studies have demonstrated that cardiorespiratory fitness (CRF) is a strong marker of cardiovascular health
The study shows the potential of machine learning methods for predicting all-cause mortality using cardiorespiratory fitness data
Machine learning classification techniques: In our experiments, we studied the following seven popular ML classification techniques: Decision Tree (DT), Support Vector Machine (SVM), Artificial Neural Networks (ANN), Naïve Bayesian Classifier (BC), Bayesian Network

Summary

Introduction

Prior studies have demonstrated that cardiorespiratory fitness (CRF) is a strong marker of cardiovascular health. Machine learning (ML) can enhance the prediction of outcomes through classification techniques that assign the data to predetermined categories. The aim of this study is to evaluate and compare how machine learning techniques can be applied to medical records of cardiorespiratory fitness and how the various techniques differ in their ability to predict medical outcomes (e.g. mortality). ML algorithms automatically scan and analyze all predictor variables, which prevents potentially important predictors from being overlooked even if they were unexpected. ML has been acknowledged as a powerful tool that is dramatically changing the mode and accessibility of science, research and practice in all domains [4]. Medicine and healthcare are no different [5,6,7].

Journal: BMC Medical Informatics and Decision Making	Publication Date: Dec 1, 2017
Citations: 67	License type: open-access

Similar Papers

Predicting diabetes mellitus using SMOTE and ensemble machine learning approach: The Henry Ford ExercIse Testing (FIT) project.
Manal Alghamdi ... Mouaz Al-Mallah
PLOS ONE | VOL. 12
24 Jul 2017

A Comparative Analysis of Machine Learning Methods for Class Imbalance in a Smoking Cessation Intervention
Khishigsuren Davagdorj ... Van Huy Pham
Applied Sciences | VOL. 10
09 May 2020

Synthetic oversampling based decision support framework to solve class imbalance problem in smoking cessation program
...
International Journal of Applied Science and Engineering | VOL. 17
01 Sep 2020

Comparative Multinomial Text Classification Analysis of Naïve Bayes and XGBoost with SMOTE on Imbalanced Dataset
Ashish Chaturvedi ... Santosh Yadav
05 Sep 2021
