Abstract

Accurate gene expression quantification requires normalizing expression data against reliable reference genes. The expression levels of commonly used reference genes are known to vary considerably under different experimental conditions, which limits their use for data normalization. In this study, we performed an unbiased identification of reference genes in Caenorhabditis elegans based on 145 microarray datasets (2296 gene array samples) covering different developmental stages, different tissues, drug treatments, lifestyle, and various stresses. Thirteen housekeeping genes (rps-23, rps-26, rps-27, rps-16, rps-2, rps-4, rps-17, rpl-24.1, rpl-27, rpl-33, rpl-36, rpl-35, and rpl-15) with enhanced stability were identified using six popular normalization algorithms and the RankAggreg method. Functional enrichment analysis revealed that these genes were significantly overrepresented in GO terms or KEGG pathways related to ribosomes. Validation using recently published datasets showed that the expression of the newly identified candidate reference genes was more stable than that of the commonly used reference genes. Based on these results, we recommend rpl-33 and rps-26 as the optimal reference genes for microarray data validation and rps-2 and rps-4 for RNA-sequencing data validation. More importantly, rps-23, the most stable gene, is a promising reference gene for both data types. This study presents, for the first time, a large-scale, microarray-data-driven, genome-wide identification of stable reference genes for normalizing gene expression data, and it provides a potential guideline for selecting universal internal reference genes in C. elegans for quantitative gene expression analysis.

Highlights

  • Genome-wide expression analysis has always played a crucial role in functional genomics

  • Thirteen housekeeping gene (HKG) candidates were identified and are listed in Table 1. Of these, rps-23 was shared by SGLR, SGLM, SGLL, and SGLP, while rps-27, rps-16, rps-26, rps-4, rps-2, rps-17, rpl-24.1, rpl-15, rpl-35, rpl-36, rpl-27, and rpl-33 were shared by SGLR, SGLL, and SGLP (Figure 2). These results show that 32 genes were shared by two second-round ranked gene lists (SGLs)

  • 13 HKG candidates potentially used as reference genes for gene expression analysis in C. elegans were identified by large-scale data integration and systematic analysis
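
The idea of ranking genes by expression stability and then looking for genes shared across several ranked lists can be sketched in a few lines. This is only an illustration under stated assumptions: the coefficient of variation and the Borda-style mean rank used here are simple stand-ins, not the six normalization algorithms or the RankAggreg procedure used in the study, and the gene names and expression values are made up.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (lower = more stable)."""
    return pstdev(values) / mean(values)


def rank_by_stability(expression):
    """Order gene names from most to least stable by coefficient of variation."""
    return sorted(expression, key=lambda g: coefficient_of_variation(expression[g]))


def borda_aggregate(ranked_lists):
    """Merge ranked lists by mean position (a simple Borda-style rule).

    Assumes every list contains the same genes. Ties are broken by gene name.
    """
    genes = set(ranked_lists[0])
    mean_position = {g: mean(lst.index(g) for lst in ranked_lists) for g in genes}
    return sorted(genes, key=lambda g: (mean_position[g], g))


def shared_top(ranked_lists, k):
    """Genes that appear among the top k entries of every ranked list."""
    return set.intersection(*(set(lst[:k]) for lst in ranked_lists))


# Hypothetical expression values across four samples (not real data).
expression = {
    "gene_a": [10.0, 10.2, 9.8, 10.1],
    "gene_b": [5.0, 9.0, 2.0, 7.0],
    "gene_c": [8.0, 8.5, 7.5, 8.2],
}

# Hypothetical ranked lists, e.g. one per stability method.
ranked_lists = [
    rank_by_stability(expression),
    ["gene_c", "gene_a", "gene_b"],
    ["gene_a", "gene_b", "gene_c"],
]

consensus = borda_aggregate(ranked_lists)
shared = shared_top(ranked_lists, k=2)
```

Here `consensus` holds the aggregated ordering and `shared` holds the genes that every list places in its top two, which mirrors the notion of candidates shared by several ranked gene lists.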


Introduction

Genome-wide expression analysis has always played a crucial role in functional genomics. Quantitative real-time PCR (qPCR) has been widely used for validating gene expression data due to its high sensitivity, specificity, and rapid execution [2,3]. Across different spatial-temporal conditions, reference genes are widely used as internal controls to minimize the misinterpretation of expression data. Some reports indicate that the transcription levels of these conserved reference genes may change under different conditions, such as developmental stages, drug treatments, and hypoxia [22,23]. Selecting such biased reference genes can lead to misinterpretation of qPCR results and yield misleading expression data.

Cells 2020, 9, 786

