Abstract
Coronary artery disease (CAD) is one of the leading causes of death globally. Angiography is one of the benchmark diagnostic tests for detecting CAD; however, it is costly, invasive, and requires a high level of technical expertise. This paper discusses a data mining technique that uses noninvasive clinical data to identify CAD cases. The clinical data of 335 subjects were collected at the cardiology department, Indira Gandhi Medical College, Shimla, India, over the period 2012–2013. Only 48.9% of subjects showed coronary stenosis on coronary angiography and were confirmed cases of CAD. A large number of cases (171 out of 335) were found to be normal after invasive diagnosis. Hence, there was a need for a noninvasive technique that could identify CAD cases without resorting to invasive diagnosis. We applied data mining classification techniques to noninvasive clinical data. The data set is analyzed using a novel hybrid k-means cluster centroid-based method for missing value imputation, and C4.5, NB Tree, and multilayer perceptron (MLP) for modeling to predict CAD patients. The proposed hybrid method increases the accuracy achieved by the basic classification techniques. This framework is a promising tool for screening CAD and its severity with high probability and low cost.
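The abstract does not spell out how the k-means cluster centroid-based imputation works, so the following is only a minimal sketch of one plausible reading: cluster the complete records, then fill each missing value from the centroid nearest to the record on its observed features. The function names, the farthest-point initialisation, and the use of `None` for a missing value are assumptions made for illustration and are not taken from the paper.

```python
def sq_dist(a, b, dims):
    """Squared Euclidean distance restricted to the given feature indices."""
    return sum((a[i] - b[i]) ** 2 for i in dims)


def kmeans_centroids(rows, k, iters=100):
    """Lloyd's k-means on complete numeric rows; returns k centroids.

    Assumption: seeds are picked deterministically (farthest-point), only to
    keep this sketch reproducible.
    """
    dims = range(len(rows[0]))
    centroids = [list(rows[0])]
    while len(centroids) < k:
        # Next seed: the row farthest from its closest existing centroid.
        far = max(rows, key=lambda r: min(sq_dist(r, c, dims) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        # Assignment step: each row joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for r in rows:
            nearest = min(range(k), key=lambda j: sq_dist(r, centroids[j], dims))
            clusters[nearest].append(r)
        # Update step: each centroid becomes the mean of its members.
        updated = [
            [sum(col) / len(members) for col in zip(*members)] if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def impute_with_centroids(rows, k):
    """Fill None entries with the matching value of the nearest centroid."""
    complete = [r for r in rows if None not in r]
    centroids = kmeans_centroids(complete, k)
    filled = []
    for r in rows:
        # Compare against centroids using only the features that are present.
        observed = [i for i, v in enumerate(r) if v is not None]
        nearest = min(centroids, key=lambda c: sq_dist(r, c, observed))
        filled.append([nearest[i] if v is None else v for i, v in enumerate(r)])
    return filled
```

In this sketch the centroids are learned from complete records only, so an incomplete record never influences the value used to fill it. Real clinical attributes would also need scaling and handling of categorical fields, which are omitted here.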
Highlights
Cardiovascular diseases (CVD) are due to disorders of the heart and blood vessels [1]
Various epidemiological studies have been carried out in the past, including the Framingham Heart Study [10,11], the Nippon–Honolulu–San Francisco study [12,13], Monitoring Trends and Determinants in Cardiovascular Disease [14,15], and the INTERHEART study [16,17], to understand the patterns, causes, and risk factors of the disease
We propose an intelligent machine learning framework for coronary artery disease (CAD) prediction (Fig. 1)
Summary
Cardiovascular diseases (CVD) are due to disorders of the heart and blood vessels [1]. Decision trees [22,23,24,25,26,27,28], support vector machines (SVM) [24,25,27], artificial neural networks (ANN) [24,25,27,28], Naïve Bayes [28], and Bayesian networks [25] have been used for CVD diagnosis, but as black boxes, and the models generated were not clinically interpretable. Models were constructed using supervised learning algorithms (C4.5, NB Tree, and MLP) for diagnosis of CAD and its severity. The models are trained and validated using the k-fold cross-validation method, in which all the samples are eventually used for both training and testing. In this method, the data set is divided into k equal-size subsets, where k = 10; k − 1 subsets are used to train the model and the remaining subset is used to test it. Accuracy is the percentage of objects correctly classified by the classification method.
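The validation procedure described above can be sketched in a few lines. This is an illustrative sketch, not the authors' code: the fold splitter, the `fit`/`predict` callables, and the one-feature threshold classifier are hypothetical stand-ins for C4.5, NB Tree, and MLP, and accuracy is computed as the number of correct predictions divided by the number of test samples, times 100.

```python
def k_fold_indices(n_samples, k=10):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def accuracy(y_true, y_pred):
    """Percent of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def cross_validate(X, y, fit, predict, k=10):
    """Train on k-1 folds, test on the held-out fold, and average the accuracy."""
    scores = []
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = [predict(model, X[i]) for i in test_idx]
        scores.append(accuracy([y[i] for i in test_idx], preds))
    return sum(scores) / len(scores)


def fit_threshold(X, y):
    """Stand-in classifier (assumption): midpoint of the two class means of feature 0."""
    cad = [x[0] for x, label in zip(X, y) if label == "CAD"]
    normal = [x[0] for x, label in zip(X, y) if label == "normal"]
    return (sum(cad) / len(cad) + sum(normal) / len(normal)) / 2


def predict_threshold(threshold, x):
    """Label a sample as CAD when feature 0 exceeds the learned threshold."""
    return "CAD" if x[0] > threshold else "normal"
```

With 335 samples and k = 10, this splitter yields five folds of 34 samples and five of 33, so every sample is tested exactly once. The folds here are contiguous for simplicity; shuffling or stratifying by class before splitting is a common refinement that the summary does not mention.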