Abstract

Background: Metastatic breast cancer is a leading cause of cancer-related deaths in women worldwide. DNA microarrays have become an important tool for identifying biomarker genes to improve the prognosis of breast cancer. Recently, it was shown that pathway-level relationships between genes can be incorporated to build more robust classification models and to obtain more useful biological insight from such models. Because complete pathways are not available, protein-protein interaction (PPI) networks are becoming more popular among researchers and open a new way to investigate the developmental process of breast cancer.

Methods: In this study, a network-based method is proposed that combines microarray gene expression profiles with a PPI network for biomarker discovery in breast cancer metastasis. The key idea in our approach is to identify a small number of genes that connect the differentially expressed genes into a single component in a PPI network; these intermediate genes contain important information about the pathways involved in metastasis and have a high probability of being biomarkers.

Results: We applied this approach to two breast cancer microarray datasets, and in both cases we identified significant numbers of well-known biomarker genes for breast cancer metastasis. The selected genes are significantly enriched in biological processes and pathways related to the carcinogenic process and, importantly, have much higher stability across different datasets than those in previous studies. Furthermore, our selected genes significantly increased the cross-dataset classification accuracy of breast cancer metastasis.

Conclusions: The randomized Steiner tree-based approach described in this study is a new way to discover biomarker genes for breast cancer, and it improves the prediction accuracy of metastasis. Though the analysis here is limited to breast cancer, it can be easily applied to other diseases.

Highlights

  • Metastatic breast cancer is a leading cause of cancer-related deaths in women worldwide

  • Applying our approach to three breast cancer datasets, we found that the candidate markers selected by our method are highly enriched in pathways that are well known to be dysregulated in breast cancer metastasis, and cover a significant number of known breast cancer susceptibility genes

  • For this, we find the genes common to the Steiner tree-based markers (STM) for the two datasets with two protein-protein interaction (PPI) networks and compare them with previous studies


Summary

Introduction

Metastatic breast cancer is a leading cause of cancer-related deaths in women worldwide. Many studies have used gene expression data for marker identification in breast cancer and other diseases [1,2]. A problem with the pathway-based approach is that the majority of human genes are not assigned to a specific pathway [10], so a true marker may be left out of consideration simply because it has no pathway assignment. To circumvent this problem, [10] proposed incorporating protein-protein interaction (PPI) networks to discover small subnetworks, which may represent novel pathways, as potential markers. They found that such subnetwork-based markers can both improve classification accuracy and increase cross-dataset stability. In related work, [13] found that inter-modular hubs are more strongly associated with breast cancer than intra-modular hubs; [14] used pairwise shortest paths between differentially expressed genes to identify candidate markers; and [15] used a probabilistic activity inference method to identify diagnostic subnetworks.
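The key idea stated in the abstract, connecting the differentially expressed genes into a single component of the PPI network through a small number of intermediate genes, is an instance of the Steiner tree problem. The sketch below illustrates that idea only. It uses a simple greedy shortest-path heuristic on an unweighted graph, not the randomized Steiner tree algorithm of the study, and the function names, gene labels (`DEG1`, `HUB`, `N1`, ...) and toy edges are all hypothetical.

```python
from collections import deque


def nearest_terminal_path(adj, tree_nodes, remaining):
    """Multi-source BFS from the current tree to the closest unconnected terminal.

    Returns the path as a list (terminal first, tree node last), or None if
    no remaining terminal is reachable.
    """
    parent = {node: None for node in tree_nodes}
    queue = deque(sorted(tree_nodes))
    while queue:
        node = queue.popleft()
        if node in remaining:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path
        for neighbour in sorted(adj.get(node, ())):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


def steiner_intermediates(edges, terminals):
    """Greedy heuristic: grow a tree that links all terminals (here, the
    differentially expressed genes) and return the non-terminal genes used."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    terminals = set(terminals)
    if not terminals:
        return set()

    tree = {min(terminals)}          # deterministic starting terminal
    remaining = terminals - tree
    while remaining:
        path = nearest_terminal_path(adj, tree, remaining)
        if path is None:
            raise ValueError("terminals do not lie in one connected component")
        tree.update(path)
        remaining -= tree
    return tree - terminals


# Hypothetical toy PPI network: HUB links all three terminals directly,
# while N1-N2 offers a longer alternative route between DEG1 and DEG2.
toy_edges = [
    ("DEG1", "HUB"), ("DEG2", "HUB"), ("DEG3", "HUB"),
    ("DEG1", "N1"), ("N1", "N2"), ("N2", "DEG2"),
]
print(sorted(steiner_intermediates(toy_edges, ["DEG1", "DEG2", "DEG3"])))
# prints ['HUB']
```

In this toy network the single intermediate gene `HUB` is enough to connect all three terminals, so it is the only candidate marker returned. A heuristic of this kind gives no optimality guarantee, and its result can depend on the starting terminal and on tie-breaking between equally short paths.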

