Abstract

Background: The molecular characteristics of human diseases are often represented by a list of genes termed “signature genes”. A significant challenge facing this approach is reproducibility: signatures developed on one set of patients may fail to perform well on a different set of patients. Because diseases result from perturbed cellular functions, irrespective of the particular genes that contribute to each function, it may be more appropriate to characterize diseases based on these perturbed cellular functions.

Methods: We propose a profile-based approach that characterizes a disease using a binary vector whose elements indicate whether a given function is perturbed, based on enrichment analysis of expression data between normal and tumor tissues. Using breast cancer and its four primary clinically relevant subtypes as examples, we evaluate this approach on the reproducibility, accuracy and resolution of the resulting pathway profiles.

Results: Pathway profiles for breast cancer and its subtypes are constructed from microarray and RNA-Seq data sets provided by The Cancer Genome Atlas (TCGA), and from an additional microarray data set provided by The European Genome-phenome Archive (EGA). An average reproducibility of 68% is achieved between different data sets (TCGA microarray vs. EGA microarray data), and an average reproducibility of 67% is achieved between different technologies (TCGA microarray vs. TCGA RNA-Seq data). Among the enriched pathways, 74% are known to be associated with breast cancer or other cancers. About 40% of the identified pathways are enriched in all four subtypes, with 4, 2, 4, and 7 pathways enriched only in the luminal A, luminal B, triple-negative, and HER2+ subtypes, respectively. Comparison of profiles between subtypes, as well as with other diseases, shows that the luminal A and luminal B subtypes are more similar to the HER2+ subtype than to the triple-negative subtype, and that subtypes of breast cancer are more likely to be closer to each other than to other diseases.

Conclusions: Our results demonstrate that pathway profiles can successfully characterize both common and distinct functional characteristics of four subtypes of breast cancer and other related diseases, with acceptable reproducibility, high accuracy and reasonable resolution.
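The idea of a binary pathway profile can be illustrated with a minimal sketch. The abstract does not specify the enrichment test, the significance threshold, or the exact definition of reproducibility, so everything below is an assumption made for illustration: a one-sided hypergeometric test without multiple-testing correction, a cutoff of 0.05, and a Jaccard overlap between the sets of perturbed pathways. The function names, gene identifiers, and pathway names are hypothetical.

```python
from math import comb


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the pathway."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def pathway_profile(de_genes, pathways, background, alpha=0.05):
    """Binary profile: 1 if a pathway is enriched among the differentially
    expressed (DE) genes, else 0. Assumed test: hypergeometric, uncorrected."""
    background = set(background)
    de = set(de_genes) & background
    profile = {}
    for name, members in pathways.items():
        members = set(members) & background
        k = len(de & members)  # DE genes that fall in this pathway
        p = hypergeom_tail(k, len(members), len(de), len(background))
        profile[name] = 1 if p < alpha else 0
    return profile


def overlap(profile_a, profile_b):
    """Jaccard overlap of the pathways marked as perturbed in two profiles
    (an assumed stand-in for the reproducibility measure)."""
    a = {name for name, flag in profile_a.items() if flag}
    b = {name for name, flag in profile_b.items() if flag}
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy data: 20 background genes and two hypothetical pathways.
background = [f"g{i}" for i in range(20)]
pathways = {
    "P1": ["g0", "g1", "g2", "g3", "g4"],
    "P2": ["g10", "g11", "g12", "g13", "g14"],
}
de_genes = ["g0", "g1", "g2", "g3", "g15"]

profile = pathway_profile(de_genes, pathways, background)
print(profile)
```

In this toy case four of the five DE genes fall in P1, so P1 is flagged as perturbed while P2 is not. Two profiles built this way from different cohorts or technologies can then be compared with `overlap`, and profiles of different subtypes or diseases can be compared in the same manner.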

Highlights

  • The molecular characteristics of human diseases are often represented by a list of genes termed “signature genes”

  • Using breast cancer and its four clinically relevant subtypes as examples, we examine the new approach from three perspectives: (1) whether the pathway profile can be reproduced from data generated by different technologies (microarray vs. RNA-Seq) and from separate cohorts (The Cancer Genome Atlas (TCGA) vs. The European Genome-phenome Archive (EGA)); (2) whether the resulting pathways are associated with the functional perturbations that result from breast cancer and its subtypes; and (3) whether the pathway profile can distinguish different subtypes of breast cancer from each other and breast cancer from other diseases

  • Our results indicate that the new approach achieves 68% average reproducibility between different data sets (TCGA microarray vs. EGA microarray data) and 67% average reproducibility between different technologies (TCGA microarray vs. TCGA RNA-Seq data)

Summary

Introduction

The molecular characteristics of human diseases are often represented by a list of genes termed “signature genes”. It is reasonable to assume that any gene whose change in expression leads to a perturbed molecular function may be a potential signature gene. This assumption is partially supported by the fact that the gene signatures developed in [18] and [19] both capture biological processes and pathways related to cell proliferation [20], and that dysregulation of functionally related genes results in similar clinical phenotypes [21]. This assumption may also explain why sophisticated methods can rarely find much better gene signatures than simple methods [22]. It may therefore be more appropriate to characterize diseases at the functional level.
