Abstract

Over the past two decades, lung cancer has been one of the most prevalent malignant cancers. Around 80% of lung cancers are non-small cell lung carcinoma (NSCLC), and NSCLC is the leading cause of death from malignant disease worldwide. As a result, there is urgent interest in developing innovative noninvasive diagnostic technologies that may improve early diagnosis of the disease. One of the most promising techniques for early detection of cancerous cells is machine-learning-based molecular cancer classification using gene expression profiling data. Recent technological breakthroughs in gene expression profiling, specifically DNA and oligonucleotide microarrays, permit the simultaneous analysis of the expression of thousands of genes and enable disease progression and patient survival to be predicted and monitored at the molecular level. For this reason, we propose a machine-learning-based strategy, composite hypercubes on iterated random projections (CHIRP), to address the detection of early NSCLC from DNA biochip gene expression data. Furthermore, we use an unsupervised dimensionality reduction approach, t-distributed stochastic neighbor embedding (t-SNE), to reduce computational complexity and increase the efficiency of the developed system. The average accuracy obtained by the proposed system for detection and diagnosis of early non-small cell lung cancer was 97.22%. The empirical results indicate that combining dimensionality reduction models with machine-learning algorithms can be effective for early detection of specific NSCLC tumor types.
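To make the idea of classifying high-dimensional expression profiles through random projections more concrete, the following is a minimal sketch and not the CHIRP algorithm or the pipeline evaluated here. It assumes synthetic two-class data in place of microarray measurements, replaces CHIRP's hyper-rectangle covers with a much simpler nearest-centroid vote on one-dimensional random projections, and omits the t-SNE step entirely. All function names and parameter values are hypothetical.

```python
import random


def make_synthetic_expression(n_per_class, n_genes, n_informative, shift, rng):
    """Toy stand-in for microarray data: class 1 is shifted in a few 'genes'."""
    samples, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for g in range(n_informative):
                    row[g] += shift
            samples.append(row)
            labels.append(label)
    return samples, labels


def project(row, direction):
    """Project one expression profile onto a single random direction."""
    return sum(x * w for x, w in zip(row, direction))


def fit_projection_ensemble(samples, labels, n_projections, rng):
    """For each random direction, store the projected mean of each class."""
    n_genes = len(samples[0])
    classes = sorted(set(labels))
    ensemble = []
    for _ in range(n_projections):
        direction = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        centroids = {}
        for c in classes:
            values = [project(r, direction)
                      for r, l in zip(samples, labels) if l == c]
            centroids[c] = sum(values) / len(values)
        ensemble.append((direction, centroids))
    return ensemble


def predict(ensemble, row):
    """Each projection votes for its nearest class centroid; majority wins."""
    votes = {}
    for direction, centroids in ensemble:
        z = project(row, direction)
        nearest = min(centroids, key=lambda c: abs(z - centroids[c]))
        votes[nearest] = votes.get(nearest, 0) + 1
    return max(votes, key=votes.get)


def accuracy(ensemble, samples, labels):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(predict(ensemble, r) == l for r, l in zip(samples, labels))
    return correct / len(labels)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    train_x, train_y = make_synthetic_expression(30, 100, 20, 2.0, rng)
    test_x, test_y = make_synthetic_expression(30, 100, 20, 2.0, rng)
    model = fit_projection_ensemble(train_x, train_y, 201, rng)  # odd count avoids ties
    print(f"held-out accuracy: {accuracy(model, test_x, test_y):.3f}")
```

Each single projection is only a weak classifier; the vote over many projections is what recovers the class signal. The accuracy printed by this toy has no bearing on the figure reported above, which comes from real gene expression data and a different classifier.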
