Abstract

Background: Differentially expressed genes (DEGs) have been identified by generating and mining cDNA expressed sequence tags (ESTs) for more than a decade. Although public databases make comprehensive mining of DEGs among ESTs from multiple tissue types possible, existing studies have usually employed statistics suitable for only two categories. Multi-class tests have been developed to find tissue-specific genes, but the subsequent search for cancer genes involves a separate two-category test on only the ESTs of the tissue of interest, which restricts the amount of data used. On the other hand, simple pooling of cancer and normal genes from multiple tissue types runs the risk of Simpson's paradox. Here we present a different approach, which searches for multi-cancer DEG candidates by analyzing all pertinent ESTs in all categories and then narrows down the cancer biomarker candidates through integrative analysis with microarray data, selection of secretory and membrane protein genes, and network analysis. Finally, the differential expression patterns of three selected cancer biomarker candidates were confirmed by real-time qPCR analysis.

Results: Seven hundred and twenty-three primary DEG candidates (p-value < 0.05 and lower bound of the confidence interval of the odds ratio ≥ 1.65) were selected from a curated EST database by applying the Cochran-Mantel-Haenszel (CMH) statistic. GeneGO analysis indicated that this set is neoplasm enriched. Cross-examination with microarray data further narrowed the list to 235 genes, of which 96 had membrane or secretory annotations. After examining the candidates in a protein interaction network, public tissue expression databases, and the literature, we selected three genes for further evaluation by real-time qPCR with eight major normal and cancer tissues. The higher-than-normal expression of COL3A1, DLG3, and RNF43 in some of the cancer tissues agrees with our in silico predictions.

Conclusions: Searching the digitized transcriptome with CMH enabled us to identify multi-cancer differentially expressed gene candidates. Our methodology demonstrates simultaneous analysis of EST data for cancer biomarkers across multiple tissue types. With the revived interest in digitizing transcriptomes by NGS, cancer biomarkers could be detected more precisely from the ESTs. The three candidates identified in this study, COL3A1, DLG3, and RNF43, are valuable targets for further evaluation with a larger sample of normal and cancer tissue or serum samples.
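
To make the selection criteria quoted in the abstract concrete, the following is a minimal sketch of a CMH test applied to one gene across several tissues, where each tissue contributes one 2x2 table of EST counts. It is not the authors' pipeline. The EST counts are invented, and the 95% confidence level, the Robins-Breslow-Greenland variance estimator, and the omission of a continuity correction are assumptions, since the abstract does not specify them. The function names are hypothetical.

```python
import math

def cmh_test(tables, z=1.96):
    """CMH test for one gene across tissues (strata).

    Each table is (a, b, c, d):
      a = gene ESTs in cancer,  b = other ESTs in cancer,
      c = gene ESTs in normal,  d = other ESTs in normal.
    Returns (chi2, p_value, or_mh, ci_low, ci_high).
    Assumes 95% CI (z=1.96) and no continuity correction.
    """
    sum_a = sum_e = sum_var = 0.0
    R = S = PR = PS_QR = QS = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        sum_a += a
        sum_e += (a + b) * (a + c) / n                      # expected a under H0
        sum_var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        r, s = a * d / n, b * c / n                         # Mantel-Haenszel terms
        p, q = (a + d) / n, (b + c) / n
        R += r
        S += s
        PR += p * r
        PS_QR += p * s + q * r
        QS += q * s
    chi2 = (sum_a - sum_e) ** 2 / sum_var
    p_value = math.erfc(math.sqrt(chi2 / 2))                # chi-square tail, 1 df
    or_mh = R / S                                           # common odds ratio
    # Robins-Breslow-Greenland standard error of log(or_mh)
    se = math.sqrt(PR / (2 * R * R) + PS_QR / (2 * R * S) + QS / (2 * S * S))
    ci_low = math.exp(math.log(or_mh) - z * se)
    ci_high = math.exp(math.log(or_mh) + z * se)
    return chi2, p_value, or_mh, ci_low, ci_high

def pooled_odds_ratio(tables):
    """Odds ratio after naively summing counts over all tissues."""
    a, b, c, d = (sum(t[i] for t in tables) for i in range(4))
    return (a * d) / (b * c)

def is_deg_candidate(tables, p_cut=0.05, or_lower_cut=1.65):
    """Thresholds from the abstract: p < 0.05 and CI lower bound >= 1.65."""
    _, p_value, _, ci_low, _ = cmh_test(tables)
    return p_value < p_cut and ci_low >= or_lower_cut

# Invented counts: a gene consistently over-represented in cancer libraries.
up_in_cancer = [(40, 9960, 5, 9995), (30, 7970, 4, 8996), (25, 5975, 6, 6994)]

# Invented counts with unbalanced library sizes: the gene is slightly
# under-represented in cancer within each tissue, yet pooling reverses this.
unbalanced = [(90, 9910, 10, 990), (1, 999, 12, 9988)]

if __name__ == "__main__":
    print(is_deg_candidate(up_in_cancer))
    print(pooled_odds_ratio(unbalanced), cmh_test(unbalanced)[2])
```

With the `unbalanced` counts, the pooled odds ratio is well above 1 while the stratified Mantel-Haenszel estimate is below 1, which is the Simpson's paradox risk the Background describes. The sketch also assumes every gene has at least one EST in the cells needed to keep the Mantel-Haenszel denominator nonzero; real EST data would need explicit handling of empty cells.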

Highlights

  • Differentially expressed genes (DEGs) have been identified by generating and mining cDNA expressed sequence tags (ESTs) for more than a decade

  • Because meta-analysis of traditional digital expression data, such as ESTs derived from cDNAs [5,6,7,8], is more feasible, this study may serve as a precursor to more complicated experiments

  • We report on the application of a computational and integrative approach to analyze cancer differentially expressed genes (DEGs)

Summary

Introduction

Differentially expressed genes (DEGs) have been identified by generating and mining cDNA expressed sequence tags (ESTs) for more than a decade. Identifying genes that are differentially expressed between normal and cancer cells or tissues helps in designing diagnostic and therapeutic procedures and in understanding cancer biology as a whole. In this regard, DNA microarrays have been the dominant platform for high-throughput study of cancer transcriptomes since their emergence in the mid-1990s [1,2]. Microarrays have several drawbacks, however: high background signal resulting from cross-hybridization [3,4]; differences in hybridization properties due to different probe sequences; limited dynamic range due to background and saturation; and difficulty in detecting splicing isoforms and unknown genes. For these reasons, with the advancement of next-generation sequencers, the high-throughput transcriptome mapping and quantification method known as RNA-Seq is beginning to supersede microarrays in expression profiling. Because meta-analysis of traditional digital expression data, such as ESTs derived from cDNAs [5,6,7,8], is more feasible, this study may serve as a precursor to more complicated experiments.

