Abstract

RNA-Seq has made significant contributions to various fields, particularly cancer research. Recent studies on differential gene expression analysis and the discovery of novel cancer biomarkers have used RNA-Seq data extensively. Identifying new biomarkers is essential for moving cancer research forward, and early cancer diagnosis improves patients' chances of recovery and increases life expectancy. Both areas are urgent and leave room for improvement. In this paper, we developed an autoencoder-based biomarker identification method by reversing the learning mechanism of the trained encoders. We devised an explainable post hoc methodology for identifying influential genes with a high likelihood of becoming biomarkers. We applied recursive feature elimination to shorten the list further and present 17 potential biomarkers that, used with a support vector machine, identify cancer types with 99.93% accuracy on the UCI gene expression cancer RNA-Seq dataset, which comprises five cancerous tumor types. Our methodology outperforms all of the state-of-the-art methods, confirming the potential of the newly identified biomarkers as well as the efficacy of the biomarker identification procedure. Moreover, we evaluated our methodology on six independent RNA-Seq gene expression datasets for several tasks, namely classifying tumors versus non-tumors, detecting the origin of circulating tumor cells (CTCs), and predicting whether metastasis occurs. Our methodology achieved encouraging results for these tasks as well. The source code of this project is available at https://github.com/fuad021/biomarker-identification.
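
The recursive feature elimination step mentioned above can be illustrated with a small sketch. It is not the authors' pipeline: to stay dependency-free it swaps the support vector machine for a plain logistic-regression model trained by gradient descent, and it runs on synthetic data rather than RNA-Seq expression values. The names `fit_linear` and `rfe`, the learning rate, and the data layout are all hypothetical choices made for illustration.

```python
import math
import random


def fit_linear(X, y, epochs=200, lr=0.1):
    """Fit logistic-regression weights (no bias term) by batch gradient descent.

    Assumption: this stands in for the SVM named in the abstract.
    Labels in y must be +1 or -1.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # Derivative of log(1 + exp(-margin)) with respect to the margin.
            coeff = -1.0 / (1.0 + math.exp(margin))
            for j in range(d):
                grad[j] += coeff * yi * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def rfe(X, y, n_keep):
    """Refit the model and drop the smallest-|weight| feature until n_keep remain."""
    active = list(range(len(X[0])))  # original column indices still in play
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w = fit_linear(X_sub, y)
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[weakest]
    return active


# Synthetic stand-in data: 60 samples, 6 "genes", two classes.
# Columns 0 and 3 carry the class signal; the rest are pure noise.
rng = random.Random(0)
INFORMATIVE = (0, 3)
y = [1 if i % 2 == 0 else -1 for i in range(60)]
X = [
    [
        2.0 * yi + rng.gauss(0, 0.3) if j in INFORMATIVE else rng.gauss(0, 1.0)
        for j in range(6)
    ]
    for yi in y
]

selected = rfe(X, y, n_keep=2)
print("surviving feature indices:", selected)
```

The model is refit after every removal, which is what distinguishes recursive elimination from ranking the features once. In a realistic setting the candidate list would be the influential genes produced by the autoencoder step, and the stopping size would be chosen by validation accuracy rather than fixed in advance.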
