Abstract

Background: Appropriate reference genes are critical to accurately quantifying relative gene expression in research and clinical applications. Numerous efforts have been made to select the most stable reference gene(s), but a consensus has yet to be achieved. In this report, we propose an in silico reference gene validation method, iRGvalid, that can be used as a universal tool to validate the reference genes recommended from different resources so as to identify the best ones without a need for any wet lab validation tests.

Methods: iRGvalid takes advantage of high throughput gene expression data and is built on a double-normalization strategy. First, the expression level of each individual gene is normalized against the total gene expression level of each sample, followed by a target gene normalization to the candidate reference gene(s). Linear regression analysis is then performed between the pre- and post-normalized target gene across the whole sample set to evaluate the stability of the reference gene(s), which is positively associated with the Pearson correlation coefficient, Rt. The higher the Rt value, the more stable the reference gene. We applied iRGvalid to 14 candidate reference genes to validate and identify the most stable reference genes in four cancer types: lung adenocarcinoma, breast cancer, colon adenocarcinoma, and nasopharyngeal cancer. The stability of the reference gene is evaluated both individually and in groups of all possible combinations.

Results: Highly stable reference genes resulted in high Rt values regardless of the target gene used. The highest stability was achieved with a specific combination of 3 to 6 reference genes. A few genes were among the best reference genes across the cancer types studied here.

Conclusion: iRGvalid provides an easy and robust method to validate and identify the most stable reference gene or genes from a pool of candidate reference genes. The inclusivity of large expression data sets as well as the direct comparison of candidate reference genes makes it possible to identify reference genes with universal quality. This method can be used in any other gene expression studies when large cohorts of expression data are available.
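
The double-normalization procedure described in the Methods can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation. Several details are assumptions that the abstract does not specify: expression is scaled to parts per million and log2-transformed with a pseudocount, multiple reference genes are combined by averaging their log values (a geometric mean), and the function names `rt_value` and `rank_reference_sets` are hypothetical.

```python
from itertools import combinations

import numpy as np


def rt_value(counts, genes, target, refs, pseudo=1.0):
    """Pearson correlation (Rt) between the pre- and post-normalized target gene.

    counts: array of shape (n_genes, n_samples) with raw expression values.
    genes:  gene names, in the same order as the rows of `counts`.
    refs:   one or more candidate reference gene names.
    """
    counts = np.asarray(counts, dtype=float)
    row = {g: i for i, g in enumerate(genes)}

    # Step 1: normalize every gene to the total expression of its sample
    # (assumed here: parts per million, then log2 with a pseudocount).
    ppm = counts / counts.sum(axis=0, keepdims=True) * 1e6
    log_expr = np.log2(ppm + pseudo)
    pre = log_expr[row[target]]

    # Step 2: normalize the target gene to the candidate reference gene(s).
    # Averaging logs corresponds to a geometric mean (an assumption).
    ref_level = log_expr[[row[r] for r in refs]].mean(axis=0)
    post = pre - ref_level

    # Step 3: correlate pre- and post-normalized target values across samples.
    return float(np.corrcoef(pre, post)[0, 1])


def rank_reference_sets(counts, genes, target, candidates, max_size=3):
    """Score every combination of up to `max_size` candidates, best first."""
    scores = {}
    for k in range(1, max_size + 1):
        for combo in combinations(candidates, k):
            scores[combo] = rt_value(counts, genes, target, combo)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

With a genes-by-samples expression matrix, `rank_reference_sets(counts, genes, "TARGET", candidate_list)` returns the candidate sets ordered by Rt. A perfectly stable reference subtracts a constant from the target in log space, so Rt approaches 1; a variable reference distorts the target and lowers Rt.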

Highlights

  • As an important biomarker source, gene expression has been one of the major focuses of cancer genome studies

  • In 2019, two groups independently published their work on the selection of pan-cancer reference genes using RNA-Seq data from hundreds of cancerous and matched normal tissue samples across all cancer types, primarily in the Cancer Genome Atlas (TCGA) database (Jo et al, 2019; Krasnov et al, 2019)

  • By taking advantage of RNA-Seq data from the TCGA database, we developed an easy and robust in silico reference gene validation method, iRGvalid, and used this method to validate the reference genes recommended by two studies mentioned above (Jo et al, 2019; Krasnov et al, 2019) as well as those selected in-house from the TCGA database


Summary

Introduction

As an important biomarker source, gene expression has been one of the major focuses of cancer genome studies. Appropriate reference genes are critical to accurately quantifying relative expression levels. Numerous studies have been performed to identify the most stable reference genes in different tissues or cells (Vandesompele et al, 2002; Andersen et al, 2004; Pfaffl et al, 2004), but a consensus has yet to be achieved (Wang et al, 2012; Jacob et al, 2013). Popovici et al (2009) used microarray data from 10 cohorts of breast cancer studies and identified the 50 most stably expressed genes. Tilli et al (2016) later obtained 10 novel reference genes from 6 breast cancer cell lines using both transcriptome and microarray data from several databases. We propose an in silico reference gene validation method, iRGvalid, that can be used as a universal tool to validate the reference genes recommended from different resources so as to identify the best ones without a need for any wet lab validation tests.
