Abstract

In this paper, we propose a novel deep learning-based decision support framework for cardiovascular disease (CVD) prediction. The proposed framework, based on an innovative cascade architecture of stacked dense neural layers and a convolutional neural network, addresses the significant class imbalance in the CVD event detection task. The model was evaluated experimentally on the NHANES super-dataset, obtained by fusing different subsets of publicly available NHANES (National Health and Nutrition Examination Survey) data. Many machine learning and deep learning models have been proposed in the literature for CVD event detection. However, they assume a balanced class distribution between the positive and negative disease classes. In clinical settings, there is significant class imbalance, with few positive-class samples compared to abundant samples from the normal or control class. Hence, most traditional machine learning and deep learning models are vulnerable to class imbalance, even when class-specific weight adjustment (a well-established method for handling class imbalance) is used, and can perform poorly on minority-class detection. The proposed model, based on the stacked-Dense-CNN cascade architecture, is robust and resilient to class imbalance and has better overall detection accuracy. The first stage of the stacked-Dense-CNN cascade is an optimal feature learning stage, comprising a LASSO (least absolute shrinkage and selection operator) step and a majority voting step, for extraction of significant and homogenized features. The second stage uses a novel stacked-Dense-CNN cascade model and a novel model development protocol involving a unique train-test dataset partitioning strategy.
Also, by using a specific per-epoch training routine, similar to the simulated annealing approach, it was possible to achieve enhanced detection performance, particularly for the minority class, and robustness to class imbalance. The experimental evaluation of the novel stacked-Dense-CNN cascade model on a super-dataset obtained by fusing multiple subsets of publicly available NHANES data resulted in an accuracy of 81.8% for negative CVD cases (majority class) and 85% for positive CVD cases (minority class), an improvement over previously proposed approaches for imbalanced clinical data settings.
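
As an illustration of two ideas the abstract relies on, the following minimal Python sketch computes class-specific weights with a common inverse-frequency heuristic and reports accuracy separately for each class. It is not the authors' implementation: the function names `balanced_class_weights` and `per_class_accuracy` are hypothetical, the weighting formula is one widely used convention assumed here for illustration, and the toy labels are made up and unrelated to NHANES.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: w_c = n_samples / (n_classes * n_c).

    Assumed heuristic for illustration; the paper may weight classes differently.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n_c) for c, n_c in counts.items()}


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly predicted samples within each true class."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: correct[c] / totals[c] for c in totals}


# Toy, invented labels: 8 negatives (0, majority) and 2 positives (1, minority).
y_true = [0] * 8 + [1] * 2
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

weights = balanced_class_weights(y_true)
accs = per_class_accuracy(y_true, y_pred)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print("class weights:", weights)
print("per-class accuracy:", accs)
print("overall accuracy:", overall)
```

On this toy input the overall accuracy (0.7) hides the fact that only half of the minority-class samples are detected, which is why per-class figures such as those quoted above are more informative than a single accuracy number under class imbalance.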
