Abstract
A framework for clinical diagnosis that uses bioinspired algorithms for feature selection and a gradient descent backpropagation neural network for classification has been designed and implemented. The clinical data are subjected to data preprocessing, feature selection, and classification. Hot deck imputation is used to handle missing values, and min-max normalization is used for data transformation. Feature selection follows a wrapper approach that employs three bioinspired algorithms, namely, Differential Evolution, Lion Optimization, and Glowworm Swarm Optimization, with the accuracy of an AdaBoostSVM classifier as the fitness function. Each bioinspired algorithm selects a subset of features, yielding three feature subsets. Correlation-based ensemble feature selection is then performed to select the optimal features from the three feature subsets. The optimal features selected in this way are used to train a gradient descent backpropagation neural network. Ten-fold cross-validation is used to train the classifier and test its performance. The Hepatitis dataset and the Wisconsin Diagnostic Breast Cancer (WDBC) dataset from the University of California Irvine (UCI) Machine Learning Repository are used to evaluate the classification accuracy. An accuracy of 98.47% is obtained for the Wisconsin Diagnostic Breast Cancer dataset, and 95.51% is obtained for the Hepatitis dataset. The proposed framework can be tailored to develop clinical decision-making systems for other health disorders to assist physicians in clinical diagnosis.
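The two preprocessing steps named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the donor record is chosen for hot deck imputation, so the nearest-complete-record rule below is an assumption, as are the function names and the toy records; the normalization is the standard min-max formula (v - min) / (max - min) applied per column.

```python
import math


def hot_deck_impute(rows):
    """Fill each missing value (None) from the nearest complete record.

    Assumption: the donor is the complete record closest in Euclidean
    distance over the features the incomplete record does have.
    """
    donors = [r for r in rows if None not in r]
    if not donors:
        raise ValueError("hot deck imputation needs at least one complete record")
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        observed = [i for i, v in enumerate(row) if v is not None]

        def distance(donor):
            return math.sqrt(sum((row[i] - donor[i]) ** 2 for i in observed))

        donor = min(donors, key=distance)
        filled.append([donor[i] if v is None else v for i, v in enumerate(row)])
    return filled


def min_max_normalize(rows):
    """Scale every column of a complete numeric table to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0
            for v, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


# Hypothetical toy records: three features, one missing value.
records = [
    [30.0, 1.0, 85.0],
    [50.0, 2.0, 120.0],
    [31.0, None, 90.0],
]
imputed = hot_deck_impute(records)
normalized = min_max_normalize(imputed)
```

Imputing before normalizing keeps the column minima and maxima free of missing values; a distance computed on raw values is dominated by the widest-ranged feature, so a real pipeline might standardize the observed features before picking a donor.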
Highlights
Knowledge discovery plays a vital role in extracting knowledge from clinical databases
For classification, ANN, PS classifier, and GA classifier were used in this study. The idea was tested using the Wisconsin Breast Cancer (WBC) dataset, the Wisconsin Diagnostic Breast Cancer (WDBC) dataset, and the Wisconsin Prognosis Breast Cancer (WPBC) dataset. The results from the experiments show that the proposed feature selection algorithm improves the accuracy of the classifier. The results were compared on the WBC, WDBC, and WPBC datasets; the accuracy for these datasets was 96.6%, 96.6%, and 78.1%, respectively, using the GA classifier
The proposed work selects relevant attributes using the wrapper approach based on three bioinspired algorithms, namely, Differential Evolution, Lion Optimization, and Glowworm Swarm Optimization, keeping the accuracy of the AdaBoostSVM classifier as the fitness function. The wrapper approach selects features that are tied to a learning algorithm and depends on the performance of the classifier. The selected features do not depend on the values of a statistical class separability measure. The features selected using Differential Evolution, Glowworm Swarm Optimization, Lion Optimization, and the Correlation-based feature selector for both datasets are shown in Tables 7 and 8
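A wrapper fitness function of the kind described above can be sketched as follows: a candidate feature subset is encoded as a binary mask, and its fitness is the cross-validated accuracy of a classifier trained only on the selected columns. This is a sketch under stated assumptions, not the authors' implementation. A nearest-centroid classifier stands in for AdaBoostSVM to keep the example dependency-free, folds are formed by simple striding, and all names and the toy data are hypothetical. A search algorithm such as Differential Evolution would call `wrapper_fitness` on each candidate mask.

```python
def k_fold_indices(n, k):
    """Split range(n) into k folds by striding (fold i holds i, i+k, ...)."""
    return [list(range(i, n, k)) for i in range(k)]


def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (NOT AdaBoostSVM): label of the closest class mean."""
    centroids = {}
    for label in set(train_y):
        members = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


def wrapper_fitness(mask, X, y, k=10):
    """Cross-validated accuracy using only the features where mask is 1."""
    selected = [i for i, keep in enumerate(mask) if keep]
    if not selected:
        return 0.0  # an empty subset cannot classify anything
    Xs = [[row[i] for i in selected] for row in X]
    correct = 0
    for fold in k_fold_indices(len(Xs), k):
        held_out = set(fold)
        train_X = [r for i, r in enumerate(Xs) if i not in held_out]
        train_y = [lab for i, lab in enumerate(y) if i not in held_out]
        for i in fold:
            if nearest_centroid_predict(train_X, train_y, Xs[i]) == y[i]:
                correct += 1
    return correct / len(Xs)


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [
    [0.10, 0.7], [0.20, 0.2], [0.15, 0.9], [0.05, 0.4], [0.12, 0.6],
    [0.90, 0.3], [0.80, 0.8], [0.85, 0.1], [0.95, 0.5], [0.88, 0.6],
]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

fitness_informative = wrapper_fitness([1, 0], X, y, k=5)
fitness_noise = wrapper_fitness([0, 1], X, y, k=5)
fitness_empty = wrapper_fitness([0, 0], X, y, k=5)
```

Because the fitness is a classifier's accuracy rather than a statistical separability score, the subset it favors depends on the classifier used; swapping the stand-in for a different learner can change which mask wins.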
Summary
Knowledge discovery plays a vital role in extracting knowledge from clinical databases. The goal of predictive data mining (PDM) in healthcare is to build models from electronic health records that use patient-specific data to predict the outcome of interest and support clinicians in decision-making. A framework for knowledge mining from clinical datasets using rough sets for feature selection and a backpropagation neural network for classification has been proposed in [3]. The embedded method first incorporates statistical criteria, as the filter model does, to select several candidate feature subsets with a given cardinality. It then chooses the subset with the highest classification accuracy [19]. A framework for clinical diagnosis that uses bioinspired algorithms for feature selection and a gradient descent backpropagation neural network for classification has been designed and implemented.