Abstract

Cancer is one of the most feared and aggressive diseases and is responsible for more than 9 million deaths worldwide. Staging cancer early increases the chances of recovery. One staging technique is RNA sequence analysis. Recent advances in the efficiency and accuracy of artificial intelligence techniques and optimization algorithms have facilitated the analysis of human genomics. This paper introduces a novel optimized deep learning approach based on binary particle swarm optimization with decision tree (BPSO-DT) and a convolutional neural network (CNN) to classify different types of cancer based on tumor RNA sequence (RNA-Seq) gene expression data. The cancer types investigated in this research are kidney renal clear cell carcinoma (KIRC), breast invasive carcinoma (BRCA), lung squamous cell carcinoma (LUSC), lung adenocarcinoma (LUAD) and uterine corpus endometrial carcinoma (UCEC). The proposed approach consists of three phases. The first phase is preprocessing, which first optimizes the high-dimensional RNA-Seq data by selecting only the optimal features using BPSO-DT and then converts the optimized RNA-Seq data to 2D images. The second phase is augmentation, which increases the original dataset of 2086 samples to 5 times its original size. The augmentation techniques were selected to have the least impact on the features of the images. This phase helps to overcome the overfitting problem and trains the model to achieve better accuracy. The third phase is the deep CNN architecture. In this phase, an architecture of two main convolutional layers for feature extraction and two fully connected layers is introduced to classify the 5 different types of cancer according to the availability of images in the dataset. The results and performance metrics such as recall, precision and F1 score show that the proposed approach achieved an overall testing accuracy of 96.90%.
Comparative results are presented, and the proposed method outperforms those in related works in terms of testing accuracy for 5 classes of cancer. Moreover, the proposed approach is less complex and consumes less memory.
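
The conversion of a selected feature vector into a 2D image, mentioned in the preprocessing phase, can be illustrated with a minimal sketch. The abstract does not say how the conversion is performed, so everything here is an assumption made for illustration: the function name `vector_to_image`, the zero-padding to the next square size, and the min-max scaling to the 0-255 range.

```python
import math


def vector_to_image(values):
    """Reshape a 1D feature vector into a square 2D grid of 0-255 pixels.

    Assumed scheme (not taken from the paper): min-max scale the values
    to 0-255, then pad with zeros up to the next perfect square.
    """
    if not values:
        raise ValueError("values must be non-empty")

    # Smallest side length such that side * side >= len(values).
    side = math.isqrt(len(values) - 1) + 1

    lo, hi = min(values), max(values)
    span = hi - lo
    # A constant vector has no contrast, so map it to all zeros.
    pixels = [0 if span == 0 else round(255 * (v - lo) / span) for v in values]

    # Zero-pad so the vector fills the square exactly.
    pixels += [0] * (side * side - len(pixels))
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Hypothetical expression values for five selected genes.
image = vector_to_image([0.0, 1.0, 2.0, 3.0, 4.0])
```

Five features do not fill a 2x2 grid, so the sketch produces a 3x3 grid whose last four cells are padding. A real pipeline would fix one image size for all samples so that every sample has the same shape when fed to the CNN.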

Highlights

  • Cancer is a general term used to describe a group of diseases associated with abnormal cell growth with metastatic and invasive characteristics [1]

  • This paper proposes an optimized deep learning approach based on binary particle swarm optimization with decision tree (BPSO-DT) and a convolutional neural network (CNN) to classify normal and tumor conditions based on high-dimensional RNA sequence (RNA-Seq) gene expression data

  • To measure the testing accuracy of the proposed architecture for tumor gene expression using deep convolutional neural networks, 5 different trials were performed, and the median accuracy was calculated for different training and testing splits
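
The binary particle swarm optimization step named in the highlights can be sketched as follows. This is a minimal sketch of the generic binary PSO update (velocity update followed by a sigmoid transfer function), not the authors' implementation. The function names, hyperparameter defaults, and the toy fitness function are all assumptions. In BPSO-DT the fitness would presumably be the accuracy of a decision tree trained on the selected features; here a hypothetical stand-in rewards two "informative" feature indices and penalizes large subsets.

```python
import math
import random


def bpso_select(fitness, n_features, n_particles=10, n_iters=30,
                w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Binary PSO: return (best_mask, best_score) maximizing fitness(mask)."""
    rng = random.Random(seed)

    # Random initial bit masks (1 = feature selected) and velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
           for _ in range(n_particles)]

    # Personal bests and the global best seen so far.
    pbest = [p[:] for p in pos]
    pbest_score = [fitness(p) for p in pos]
    g = max(range(n_particles), key=pbest_score.__getitem__)
    gbest, gbest_score = pbest[g][:], pbest_score[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the velocity so the sigmoid does not saturate.
                v = max(-v_max, min(v_max, v))
                vel[i][d] = v
                # Sigmoid transfer: probability that bit d is set to 1.
                prob = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob else 0

            score = fitness(pos[i])
            if score > pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], score
                if score > gbest_score:
                    gbest, gbest_score = pos[i][:], score

    return gbest, gbest_score


def toy_fitness(mask):
    """Hypothetical stand-in for decision tree accuracy on a feature subset."""
    informative = (1, 4)  # assumed "useful" feature indices
    return sum(mask[i] for i in informative) - 0.1 * sum(mask)


best_mask, best_score = bpso_select(toy_fitness, n_features=8)
```

Replacing `toy_fitness` with a function that trains and scores a classifier on the masked columns turns the sketch into a wrapper-style feature selector. Each fitness call then costs one model fit, so the particle and iteration counts directly set the runtime.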


Summary

Introduction

Cancer is a general term used to describe a group of diseases associated with abnormal cell growth with metastatic and invasive characteristics [1]. In 2018, cancer was responsible for more than 9 million deaths worldwide. 17% of females and 20% of males will have cancer at some point in their lives, and 10% of females and 13% of males will die from it [2]. Based on statistics from the WHO, every year, more than 8 million people die from cancer, accounting for approximately 13% of deaths worldwide, indicating that cancer is one of the most threatening diseases in the world [1]. In 2018, lung cancer (1.76 million deaths) and colorectal cancer (860,000) were recorded as the most common cancers. Stomach cancer (780,000), liver cancer (780,000), and breast cancer (620,000) ranked second, third and fourth among the most common cancers [2].
