Abstract
Identifying differentially expressed genes (DEGs) between tumor and normal samples is critical for studying tumorigenesis and is routinely used to find diagnostic, prognostic, and therapeutic biomarkers for many cancers. It is well known that solid tumor tissue samples obtained in clinical settings are mixtures of cancer and normal cells. However, tumor purity is largely ignored in traditional differential expression analyses, which can reduce the power to detect differential genes or even bias the results. In this paper, we develop a novel differential gene calling method, DECtp, which integrates tumor purity information into a generalized least squares procedure followed by a Wald test. We compared DECtp with popular methods such as the t-test and limma on nine simulated datasets with different sample sizes and noise levels. DECtp achieved the highest area under the curve (AUC) in all comparisons, suggesting that tumor purity information is critical for DEG calling between tumor and normal samples. In addition, we applied DECtp to cancer and normal samples of 14 tumor types collected from The Cancer Genome Atlas (TCGA) and compared the resulting DEGs with those called by limma. DECtp produced more sensitive, consistent, and biologically meaningful results and identified a few novel DEGs for further experimental validation.
Highlights
RNA sequencing (RNA-Seq) has become a routine method for measuring RNA expression levels (Mortazavi et al., 2008; Wang et al., 2009)
We compared DECtp with canonical differentially expressed gene (DEG) calling algorithms such as limma on simulated datasets and on cancer datasets downloaded from The Cancer Genome Atlas (TCGA)
Through extensive analyses of the TCGA data, we found that the expression levels of many genes correlate strongly with tumor purity, and that this correlation increases with the difference in gene expression between cancer and normal samples
Summary
RNA sequencing (RNA-Seq) has become a routine method for measuring RNA expression levels (Mortazavi et al., 2008; Wang et al., 2009). We have developed a novel method, Differential Expression Caller by combining tumor purity information (DECtp), to identify differentially expressed genes (DEGs) between tumor and normal samples. DECtp models the expression profile of each tumor sample as a mixture of Gaussian distributions in which the mixing proportion is the tumor purity.
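To make the idea concrete, the following is a minimal sketch of a purity-weighted generalized least squares fit followed by a Wald test for a single gene. It is not the authors' implementation. It assumes that the expected expression of a sample with purity p is (1 - p) * mu_normal + p * mu_tumor, that the sample variance is (1 - p)^2 * var_normal + p^2 * var_tumor, and that both variances are known; normal samples are given purity 0. The function name `purity_gls_wald` and its parameters are hypothetical.

```python
import math


def purity_gls_wald(y, purity, var_normal=1.0, var_tumor=1.0):
    """Purity-weighted GLS fit and Wald test for one gene (illustrative sketch).

    y          : expression values, one per sample
    purity     : tumor purity per sample in [0, 1]; use 0 for normal samples
    var_normal : assumed variance of the normal component (assumption: known)
    var_tumor  : assumed variance of the tumor component (assumption: known)

    Returns (mu_normal, mu_tumor, z, p_value), where z tests mu_tumor == mu_normal.
    """
    # Accumulate the weighted normal equations X'WX and X'Wy for the
    # two-column design X = [1 - p, p] with diagonal weights W = 1 / variance.
    a = b = d = g0 = g1 = 0.0
    for yi, p in zip(y, purity):
        x0, x1 = 1.0 - p, p
        w = 1.0 / (x0 * x0 * var_normal + x1 * x1 * var_tumor)
        a += w * x0 * x0
        b += w * x0 * x1
        d += w * x1 * x1
        g0 += w * x0 * yi
        g1 += w * x1 * yi

    # Invert the symmetric 2x2 matrix [[a, b], [b, d]]; with known variances
    # this inverse is the covariance of the GLS estimates.
    det = a * d - b * b
    c00, c01, c11 = d / det, -b / det, a / det

    mu_normal = c00 * g0 + c01 * g1
    mu_tumor = c01 * g0 + c11 * g1

    # Wald statistic for the contrast mu_tumor - mu_normal.
    se = math.sqrt(c00 + c11 - 2.0 * c01)
    z = (mu_tumor - mu_normal) / se

    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return mu_normal, mu_tumor, z, p_value


if __name__ == "__main__":
    # Toy data: three normal samples followed by three tumors of varying purity.
    purity = [0.0, 0.0, 0.0, 0.5, 0.8, 1.0]
    y = [2.1, 1.9, 2.0, 3.4, 4.5, 5.1]
    print(purity_gls_wald(y, purity))
```

In practice the two variances would have to be estimated from the data, for example in a feasible GLS iteration, and the p-values would be corrected for multiple testing across genes; both steps are omitted here.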