Abstract

Genes involved in cancer are under constant evolutionary pressure, potentially resulting in diverse molecular properties. In this study, we explore 23 omic features from publicly available databases to define the molecular profile of different classes of cancer genes. Cancer genes were grouped according to mutational landscape (germline and somatically mutated genes), role in cancer initiation (cancer driver genes) or cancer survival (survival genes), and implication by genome-wide association studies (GWAS genes). For each gene, we also computed feature scores based on all omic features, effectively summarizing how closely a gene resembles cancer genes of the respective class. In general, cancer genes are longer, have a lower GC content, have more isoforms with shorter exons, are expressed in more tissues and have more transcription factor binding sites than non-cancer genes. We found that germline genes more closely resemble single-tissue GWAS genes, while somatic genes are more similar to pleiotropic cancer GWAS genes. As a proof of principle, we used aggregated feature scores to prioritize genes in breast cancer GWAS loci and found that top-ranking genes were enriched in cancer-related pathways. In conclusion, we have identified multiple omic features associated with different classes of cancer genes, which can assist prioritization of genes in cancer gene discovery.

Highlights

  • Multiple algorithms have been developed which implicate genes according to mutational load, molecular function, involvement in specific pathways or expression [15,16,17,18,19,20]

  • We included 43 genes that often harbour rare cancer predisposition mutations, 457 genes frequently mutated in tumours, 106 cancer driver genes identified by Bailey et al. (2018) [2], 1268 genes whose expression levels are associated with cancer mortality, and 2373 genes located in cancer GWAS loci (GWAS cancer genes; 901 pleiotropic and 1472 non-pleiotropic)

  • We evaluated the association of 23 omic features (Fig. 1) with cancer gene status and found multiple statistically significant correlations (Fig. 2 and Supplementary Table 1)


Summary

Introduction

Multiple algorithms have been developed that implicate genes according to mutational load, molecular function, involvement in specific pathways or expression [15,16,17,18,19,20]. Understanding the molecular characteristics of typical cancer genes promises to allow the prioritization of genes within GWAS regions by implicating the genes that most closely resemble other typical cancer genes. To further characterize the molecular properties of cancer genes, we systematically investigate multiple omic features of different classes of cancer genes. We aggregate the effects of those features to rank genes within breast cancer GWAS regions and perform pathway enrichment on the ranked genes to illustrate the utility of our findings.
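
The idea of aggregating omic features into a single score per gene and then ranking genes can be illustrated with a minimal sketch. This is not the authors' scoring method, which is not described in this excerpt. It assumes a simple weighted sum of standardized (z-scored) features. The gene names, feature values and weights below are hypothetical; only the signs of the weights follow the directions reported in the abstract (cancer genes tend to be longer, have lower GC content and be expressed in more tissues).

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores for a list of numbers (all zeros if there is no variance)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def aggregate_feature_scores(genes, weights):
    """Sum weighted z-scores per gene.

    genes   : dict mapping gene name -> dict of feature name -> value
    weights : dict mapping feature name -> weight (the sign encodes the
              assumed direction of association with cancer gene status)
    """
    names = list(genes)
    scores = {name: 0.0 for name in names}
    for feature, weight in weights.items():
        z_scores = standardize([genes[name][feature] for name in names])
        for name, z in zip(names, z_scores):
            scores[name] += weight * z
    return scores


def rank_genes(scores):
    """Order gene names from highest to lowest aggregated score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical genes in one locus, with made-up feature values.
locus_genes = {
    "GENE_A": {"gene_length": 120_000, "gc_content": 0.38, "n_tissues": 30},
    "GENE_B": {"gene_length": 45_000, "gc_content": 0.47, "n_tissues": 18},
    "GENE_C": {"gene_length": 8_000, "gc_content": 0.58, "n_tissues": 4},
}

# Hypothetical unit weights; only the signs reflect the abstract.
feature_weights = {"gene_length": 1.0, "gc_content": -1.0, "n_tissues": 1.0}

scores = aggregate_feature_scores(locus_genes, feature_weights)
ranking = rank_genes(scores)
print(ranking)
```

In practice the weights would more plausibly be estimated from data, for example as coefficients of a model fitted on known cancer genes versus other genes, rather than set to unit values as in this sketch.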

