Abstract

Lung cancer usually arises in the cells lining the air passages of the lungs. It is the leading cause of cancer-related death in both men and women. Lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) are two distinct subtypes of non-small cell lung cancer (NSCLC). Identifying differentially expressed genes (DEGs) is necessary to distinguish healthy from diseased states at the molecular level. The biomedical literature and microarray studies are two common sources of information on gene expression variation. A gene is said to be differentially expressed if there is a statistically significant difference in read counts or expression level indices between two experimental conditions. A good gene expression strategy leads to a good gene silencing strategy, which in turn can slow or stop cancer progression. In this paper, we present a robust analysis of the LUAD mRNA data to find the most differentially expressed genes. We then employ several machine learning techniques to classify the samples into the five stages of lung cancer. We investigated four models: linear regression, logistic regression, RANSAC regression, and decision trees. Using 12 differentially expressed genes, linear regression, logistic regression, RANSAC regression, and decision trees achieved accuracies of 100%, 91%, 98%, and 100%, respectively. Using 5665 differentially expressed genes, logistic regression and decision trees achieved accuracies of 100% and 98%, respectively. The proposed models achieved the highest accuracies in comparison with state-of-the-art models.

Keywords: Gene expression, DEGs, LUAD, Lung cancer, Classification, Machine learning, Decision tree, RANSAC, Logistic regression, Linear regression
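
The notion of differential expression described above, a statistically meaningful difference in expression level between two conditions, can be illustrated with a small sketch. The abstract does not say which statistical test the authors used, so Welch's t-statistic is an assumption here. The gene names and expression values are made up for illustration, and the helper names `welch_t` and `rank_genes` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_genes(expr_a, expr_b):
    """Rank genes by |t| between two conditions, largest difference first.

    expr_a, expr_b: dict mapping gene name -> list of expression values.
    """
    scores = {g: welch_t(expr_a[g], expr_b[g]) for g in expr_a}
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)


# Toy log-expression values; gene names and numbers are invented.
normal = {
    "GENE_A": [5.1, 4.9, 5.0, 5.2],
    "GENE_B": [7.0, 7.4, 6.8, 7.1],
    "GENE_C": [3.0, 3.1, 2.9, 3.2],
}
tumor = {
    "GENE_A": [8.0, 8.3, 7.9, 8.1],
    "GENE_B": [7.2, 6.9, 7.3, 7.0],
    "GENE_C": [3.1, 2.8, 3.3, 3.0],
}

ranking = rank_genes(tumor, normal)
print(ranking)
```

In this toy data only GENE_A shifts clearly between conditions, so it ranks first. A real analysis would also convert the statistic to a p-value and correct for multiple testing across thousands of genes before selecting a subset to feed the classifiers.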
