Abstract
Background
Researchers are devoting significant effort to using machine learning algorithms, a subset of the wider field of artificial intelligence, to detect disease in a single patient. There is extensive research on the application of machine learning methods in health care and, more specifically, in cardiovascular disease. We focus this initial investigation on cardiac disease so that we can examine the methods in as much detail as possible.

Methods
In this paper we explore the uncertainty that arises when applying machine learning methods, namely Support Vector Machines (SVM), Multi-Layer Perceptron (MLP) neural networks and ensemble methods, to the classification of cardiovascular disease. Our work uses two public datasets with significantly different characteristics in order to assess the potential differences in the uncertainty of the methods. The cardiac arrhythmia dataset from the University of California, Irvine (UCI) Machine Learning Repository has almost three hundred physiological data points per patient, gathered from analysis of electrocardiogram (ECG) signals on several hundred patients, although the distribution of cases is severely imbalanced. By contrast, the cardiovascular disease dataset from the Kaggle collection contains nearly seventy thousand patient records. However, this Kaggle dataset reports only a small number of parameters per patient record, such as serum cholesterol level, diastolic and systolic blood pressure, relative blood glucose level and the presence or absence of angina.

Results
Models built for the UCI dataset have an order of magnitude more dimensions (or, for neural network models, far more input nodes) than the models developed for the Kaggle dataset. On the other hand, the Kaggle dataset has an order of magnitude more records for training and validation than the UCI dataset.
Our results compare and contrast the uncertainty in models built using support vector machines, multilayer perceptron neural networks and decision trees for these two datasets. The work suggests that it would be instructive to extend our analysis to datasets covering other pathophysiologies.
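The effect of record count on uncertainty can be illustrated with a small worked example. The sketch below is not the method used in this work, whose uncertainty measure is not specified in the abstract. It only shows one common sampling-based notion of uncertainty, the Wilson score interval for a classifier's test accuracy. The test-set sizes (100 and 14,000) and the 75% accuracy are hypothetical values chosen to differ by roughly two orders of magnitude, not figures from either dataset.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    With z = 1.96 this is an approximate 95% interval for the true
    accuracy, given `correct` right answers out of `total` test cases.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p_hat = correct / total
    denom = 1.0 + z * z / total
    centre = (p_hat + z * z / (2 * total)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / total + z * z / (4 * total * total)
    )
    return centre - half_width, centre + half_width


# Hypothetical held-out test sets: same observed accuracy (75%),
# very different numbers of records.
scenarios = [
    ("small test set", 75, 100),
    ("large test set", 10500, 14000),
]

for name, correct, total in scenarios:
    low, high = wilson_interval(correct, total)
    print(f"{name}: n={total}, accuracy={correct / total:.3f}, "
          f"95% interval=({low:.3f}, {high:.3f}), width={high - low:.3f}")
```

Under these assumptions the interval for the small test set is much wider than for the large one, even though the observed accuracy is identical. This interval captures sampling uncertainty only. It does not account for class imbalance or for the number of input dimensions, both of which also differ between the two datasets.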
Highlights
In this paper we explore the uncertainty that arises when applying machine learning methods, namely Support Vector Machines (SVM), Multi-Layer Perceptron (MLP) neural networks and ensemble methods, to the classification of cardiovascular disease
There is a general trend toward digital health care, driven in part by the Internet of Things (IoT) and enhanced sensing technologies, both coupled with the use of artificial intelligence and machine learning methods
In the first part of this section we discuss the results obtained with the SVM kernel, and then report the results obtained from the multi-layer perceptron neural network models