Abstract

Multidimensional data sets are becoming more common in almost all research domains, and extracting insight from them requires sophisticated data analysis approaches. The growing dimensionality of data has made data mining and machine learning more difficult. In healthcare, early disease prediction helps physicians make effective decisions that improve patients' chances of survival. Dimensionality reduction approaches address this problem by removing unnecessary, irrelevant, and noisy data, which makes the learning process both faster in computation time and more accurate. For medical data, this paper proposes a Feature Extraction based Ensemble Data Clustering (FEEDC) approach. The approach combines advanced dimensionality reduction techniques and feature extraction methods with MapReduce. During clustering, centroid selection is performed using a support vector machine classifier, and ensemble members are selected using the firefly algorithm. Finally, a clustering solution based on the Normalized cut (Ncut) algorithm is used to diagnose conditions such as heart disease and breast cancer at an early stage. Results obtained on two UCI datasets show that the approach achieves more accurate results.

Keywords: Dimensionality reduction; High dimensional data; Feature extraction; Healthcare; Evolutionary algorithms
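
For readers unfamiliar with the Normalized cut criterion mentioned above, the following is a minimal sketch of the standard two-way Ncut objective, Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V). It is not the FEEDC implementation: the function name `ncut_value`, the toy affinity matrix, and the label vectors are assumptions made purely for illustration.

```python
def ncut_value(weights, labels):
    """Two-way normalized cut of a partition (illustrative sketch).

    weights: symmetric affinity matrix as a list of lists (hypothetical input).
    labels:  list of 0/1 cluster assignments, one per node.
    """
    n = len(weights)
    cut = 0.0
    assoc = [0.0, 0.0]  # total affinity from each cluster to all nodes
    for i in range(n):
        for j in range(n):
            w = weights[i][j]
            assoc[labels[i]] += w
            if labels[i] != labels[j]:
                cut += w
    cut /= 2.0  # every crossing pair was visited as (i, j) and (j, i)
    if assoc[0] == 0 or assoc[1] == 0:
        raise ValueError("each cluster needs nonzero total affinity")
    return cut / assoc[0] + cut / assoc[1]


# Toy graph (assumed data): nodes 0-1 and 2-3 are strongly connected pairs.
W = [
    [0.0, 1.0, 0.1, 0.0],
    [1.0, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 1.0],
    [0.0, 0.1, 1.0, 0.0],
]
good = ncut_value(W, [0, 0, 1, 1])  # separates the two tight pairs
bad = ncut_value(W, [0, 1, 0, 1])   # splits both tight pairs
```

A lower value indicates a better partition, so `good` is smaller than `bad` here. Practical Ncut solvers minimize this objective approximately through a spectral relaxation rather than by scoring candidate partitions directly.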
