Hybrid approach using fuzzy sets and extreme learning machine for classifying clinical datasets

Kindie Biredagn Nahato, A Kannan, Khanna H Nehemiah

doi:10.1016/j.imu.2016.01.001

Open Access

Journal: Informatics in medicine unlocked	Publication Date: Jan 1, 2016
Citations: 36	License type: cc-by-nc-nd

Affiliation: Anna University, Chennai

Abstract

Data mining techniques play a major role in developing computer-aided diagnosis systems and expert systems that aid physicians in clinical decision making. This work proposes a classifier for clinical datasets, FELM, that combines the relative merits of fuzzy sets and the extreme learning machine. The FELM framework has three major subsystems: a preprocessing subsystem, a fuzzification subsystem and a classification subsystem. The preprocessing subsystem handles missing value imputation and outlier elimination. The fuzzification subsystem maps each feature to a fuzzy set, and the classification subsystem uses an extreme learning machine for classification. The Cleveland heart disease (CHD), Statlog heart disease (SHD) and Pima Indian diabetes (PID) datasets from the University of California Irvine (UCI) machine learning repository were used for experimentation. The CHD and SHD datasets were evaluated with two class labels, one indicating the absence and the other the presence of heart disease. The CHD dataset was also evaluated with five class labels: one indicating the absence of heart disease and four indicating its severity, namely low risk, medium risk, high risk and serious. The PID dataset was evaluated with two class labels, one indicating the absence and the other the presence of gestational diabetes. The classifier achieved an accuracy of 93.55% on the CHD dataset with two class labels, 73.77% on the CHD dataset with five class labels, 94.44% on the SHD dataset and 92.54% on the PID dataset.
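The fuzzification and classification stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the membership functions, the number of hidden neurons or the activation function, so the triangular low/medium/high sets, the sigmoid activation, the hidden-layer size and all names below (`triangular_membership`, `fuzzify`, `ELM`) are assumptions chosen for illustration. The preprocessing subsystem (imputation and outlier elimination) is omitted, and the data are a made-up toy example rather than the UCI datasets.

```python
import numpy as np


def triangular_membership(x, a, b, c):
    """Membership degree in a triangular fuzzy set with feet a, c and peak b."""
    x = np.asarray(x, dtype=float)
    left = (x - a) / (b - a)
    right = (c - x) / (c - b)
    return np.clip(np.minimum(left, right), 0.0, 1.0)


def fuzzify(X):
    """Map every feature to three membership degrees (low, medium, high).

    Assumption: breakpoints come from each column's min, midpoint and max.
    """
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        col = X[:, j]
        lo, hi = col.min(), col.max()
        if hi == lo:  # constant column: avoid division by zero
            hi = lo + 1.0
        mid = (lo + hi) / 2.0
        half = (hi - lo) / 2.0
        out.append(triangular_membership(col, lo - half, lo, mid))   # low
        out.append(triangular_membership(col, lo, mid, hi))          # medium
        out.append(triangular_membership(col, mid, hi, hi + half))   # high
    return np.column_stack(out)


class ELM:
    """Single-hidden-layer extreme learning machine.

    Input weights are random and fixed; only the output weights are
    solved for, using the Moore-Penrose pseudoinverse.
    """

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the random hidden layer (an assumed choice).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot target matrix, one column per class label.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        scores = self._hidden(np.asarray(X, dtype=float)) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]


# Toy two-class data (hypothetical values, not a clinical dataset).
X_toy = np.array([[1, 10], [2, 9], [3, 8], [8, 3], [9, 2], [10, 1]], dtype=float)
y_toy = np.array([0, 0, 0, 1, 1, 1])

model = ELM(n_hidden=20, seed=0).fit(fuzzify(X_toy), y_toy)
print(model.predict(fuzzify(X_toy)))
```

In this sketch the breakpoints are recomputed from whatever array is passed to `fuzzify`; a realistic pipeline would estimate them on the training split only and reuse them for test data.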

Full Text