Abstract

Background: Genomic copy number changes and regional alterations in epigenetic states have been linked to grade in breast cancer. However, the relative contribution of specific alterations to the pathology of different breast cancer subtypes remains unclear. The heterogeneity and interplay of genomic and epigenetic variations means that large datasets and statistical data mining methods are required to uncover recurrent patterns that are likely to be important in cancer progression.

Results: We employed ridge regression to model the relationship between regional changes in gene expression and proliferation. Regional features were extracted from tumour gene expression data using a novel clustering method, called genomic distance entrained agglomerative (GDEC) clustering. Using gene expression data in this way provides a simple means of integrating the phenotypic effects of both copy number aberrations and alterations in chromatin state. We show that regional metagenes derived from GDEC clustering are representative of recurrent regions of epigenetic regulation or copy number aberrations in breast cancer. Furthermore, detected patterns of genomic alterations are conserved across independent oestrogen receptor positive breast cancer datasets. Sequential competitive metagene selection was used to reveal the relative importance of genomic regions in predicting proliferation rate. The predictive model suggested additive interactions between the most informative regions such as 8p22-12 and 8q13-22.

Conclusions: Data-mining of large-scale microarray gene expression datasets can reveal regional clusters of co-ordinate gene expression, independent of cause. By correlating these clusters with tumour proliferation we have identified a number of genomic regions that act together to promote proliferation in ER+ breast cancer. Identification of such regions should enable prioritisation of genomic regions for combinatorial functional studies to pinpoint the key genes and interactions contributing to tumourigenicity.
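
The ridge regression step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy metagene values, the penalty values and the function names `ridge_fit` and `solve` are assumptions made purely for illustration. It fits proliferation as a penalised linear (additive) function of two hypothetical regional metagenes, using the closed-form ridge solution on centred data.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def ridge_fit(X, y, lam):
    """Closed-form ridge: solve (Xc^T Xc + lam * I) w = Xc^T yc on centred data."""
    n, p = len(X), len(X[0])
    # Centre the columns and the response so the intercept is not penalised.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    A = [[sum(Xc[i][a] * Xc[i][b] for i in range(n)) + (lam if a == b else 0.0)
          for b in range(p)] for a in range(p)]
    rhs = [sum(Xc[i][a] * yc[i] for i in range(n)) for a in range(p)]
    w = solve(A, rhs)
    intercept = y_mean - sum(w[j] * x_mean[j] for j in range(p))
    return w, intercept


# Hypothetical data: rows are tumours, columns are two regional metagenes.
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 3.0], [4.0, 2.0]]
# Hypothetical proliferation scores, generated as 1 + 2*m1 - 1*m2.
y = [0.0, 3.0, 4.0, 4.0, 7.0]

w_unpenalised, b_unpenalised = ridge_fit(X, y, lam=0.0)
w_ridge, b_ridge = ridge_fit(X, y, lam=10.0)
print(w_unpenalised, b_unpenalised)
print(w_ridge, b_ridge)
```

With `lam=0.0` the fit reduces to ordinary least squares and recovers the generating coefficients up to floating-point error; a positive `lam` shrinks the coefficients towards zero, which is the property that makes ridge regression usable when many correlated metagenes compete to explain proliferation.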

Highlights

  • Genomic copy number changes and regional alterations in epigenetic states have been linked to grade in breast cancer

  • We have developed a novel clustering method, called Genomic Distance Entrained Clustering (GDEC), to identify genomic regions where gene expression is coordinately altered

  • To establish the effect of GDEC clustering we compared the chromosomal composition of clusters derived from GDEC and standard flexible beta clustering, using ER+ samples from three published breast cancer gene expression datasets [11,16,17,18,19]
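
The text here does not give the GDEC formula, so the following is only a hypothetical sketch of the general idea of letting genomic distance influence an expression-based dissimilarity. The weighting scheme, the `scale_bp` and `weight` parameters, the function names and the toy genes are all assumptions, not the published method. Genes that are co-expressed and close together on the same chromosome score as more similar than equally co-expressed genes on different chromosomes, so any clustering run on this dissimilarity is biased towards regional clusters.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def entrained_dissimilarity(g1, g2, scale_bp=5_000_000, weight=0.5):
    """Blend expression dissimilarity with a genomic-distance penalty.

    Hypothetical scheme: both terms lie in [0, 1]; `weight` sets how
    strongly genomic position pulls neighbouring genes together.
    """
    expr_term = (1.0 - pearson(g1["expr"], g2["expr"])) / 2.0
    if g1["chrom"] != g2["chrom"]:
        geno_term = 1.0  # maximum penalty across chromosomes
    else:
        dist = abs(g1["pos"] - g2["pos"])
        geno_term = 1.0 - math.exp(-dist / scale_bp)
    return (1.0 - weight) * expr_term + weight * geno_term


# Hypothetical genes: chromosome, position in base pairs, expression per tumour.
gene_a = {"chrom": "8", "pos": 1_000_000, "expr": [1.0, 2.0, 3.0, 4.0]}
gene_b = {"chrom": "8", "pos": 2_000_000, "expr": [2.0, 4.0, 6.0, 8.5]}
gene_c = {"chrom": "17", "pos": 1_000_000, "expr": [1.0, 2.0, 3.0, 4.0]}

d_ab = entrained_dissimilarity(gene_a, gene_b)
d_ac = entrained_dissimilarity(gene_a, gene_c)
print(d_ab, d_ac)
```

In this toy case `gene_c` is perfectly correlated with `gene_a` but lies on another chromosome, so its dissimilarity to `gene_a` is larger than that of the neighbouring `gene_b`. Setting `weight=0.0` removes the positional term and leaves a purely expression-based dissimilarity, a rough stand-in for the standard clustering that GDEC is compared against.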


Summary

Introduction

Genomic copy number changes and regional alterations in epigenetic states have been linked to grade in breast cancer. Regional epigenetic changes involving DNA methylation and chromatin structure, which lead to or stabilize altered gene expression, have been shown to be involved in breast cancer [6]. The interplay of alterations in DNA copy number and epigenetic states is complex, and to understand the full picture, data from multiple sources need to be integrated. Since both copy number and epigenetic alterations result in changes in gene expression patterns, analysis of microarray gene expression data in the context of specific genomic regions is an efficient means of integrating the effects of genomic changes in cancer.
