Abstract
Fast diagnosis is vital in the effort to cure cancer. Over the last decade, precision medicine has evolved by detecting different types of cancer from deoxyribonucleic acid (DNA) microarrays (MA) processed by machine learning (ML) algorithms. Personalized diagnosis, followed by personalized treatment, should imply personalized hyperparameters of the ML model. The goal of this paper is to propose a novel adaptive ML method that embeds knowledge into the architecture of the algorithm and also filters the features to reduce their number, which increases computational speed and lowers computational cost. fLogSLFN is a novel, two-fold, theoretically effective ML method for two-class decision problems that embeds logistic regression in such a manner that the hidden nodes of a single-hidden layer feedforward neural network (SLFN) are problem dependent. A filtering module based on the significance of each attribute is embedded to avoid the 'curse of dimensionality'. The proposed model has been tested on three publicly available high-dimensional cancer datasets that contain gene expressions obtained from complementary DNA (cDNA) arrays and DNA microarrays. The proposed method, filtered logistic SLFN (fLogSLFN), has also been compared and statistically benchmarked against five ML algorithms: extreme learning machine (ELM), radial basis function network (RBF), single-hidden layer feedforward neural network trained by the backpropagation algorithm (BPNN), logistic regression with the LASSO penalty, and the adaptive single-hidden layer feedforward network (aSLFN). The experimental results showed that fLogSLFN is competitive with the other state-of-the-art models, obtaining accuracies between 64.70% and 98.66% depending on the dataset.
In contrast to other state-of-the-art ML algorithms, fLogSLFN is capable of embedding the knowledge extracted from the data into its architecture, making it problem dependent. The filtering module increases computational speed while decreasing computational cost and time. The statistical analysis showed that filtering the features preserves performance, making the algorithm more efficient.
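The general idea of filtering attributes by significance before fitting a logistic model can be illustrated with a minimal sketch. The abstract does not state which significance measure fLogSLFN uses or how the logistic regression determines the hidden nodes, so the code below makes its own assumptions: a Welch t-statistic as the per-feature score, a fixed number of retained features, and a plain logistic regression classifier in place of the SLFN. All function names (`t_score`, `filter_features`, `fit_logistic`) and the synthetic data are hypothetical and are not taken from the paper.

```python
import math
import random

def t_score(a, b):
    """Absolute Welch t-statistic for one feature split by class (assumed score)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((v - mean_a) ** 2 for v in a) / (len(a) - 1)
    var_b = sum((v - mean_b) ** 2 for v in b) / (len(b) - 1)
    return abs(mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b) + 1e-12)

def filter_features(X, y, keep):
    """Indices of the `keep` highest-scoring features, in ascending order."""
    scored = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scored.append((t_score(pos, neg), j))
    return sorted(j for _, j in sorted(scored, reverse=True)[:keep])

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=200):
    """Logistic regression fitted by batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, label in zip(X, y):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, row))) - label
            grad_b += err
            for j, xi in enumerate(row):
                grad_w[j] += err * xi
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Synthetic two-class data (not from the paper): 40 samples, 20 features,
# of which only features 0 and 1 carry class information.
random.seed(0)
y = [i % 2 for i in range(40)]
X = [[random.gauss(1.5 if label else -1.5, 1.0) if j < 2 else random.gauss(0.0, 1.0)
      for j in range(20)] for label in y]

selected = filter_features(X, y, keep=2)              # filtering step
X_small = [[row[j] for j in selected] for row in X]   # reduced feature matrix
w, b = fit_logistic(X_small, y)                       # model sees 2 features, not 20
predictions = [1 if sigmoid(b + sum(wi * xi for wi, xi in zip(w, row))) >= 0.5 else 0
               for row in X_small]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(selected, accuracy)
```

The printed accuracy is measured on the training data, so it only shows that the reduced feature set still separates the two synthetic classes. It says nothing about generalization or about the results reported for fLogSLFN.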