MicroRNAs (miRNAs) play a significant role in the development of breast cancer, and their expression profiles differ between breast cancer subtypes. Our goal is therefore to integrate high-throughput miRNA expression data generated by next-generation sequencing with clinical data, and to perform survival analysis in order to identify breast cancer subtype-specific miRNAs and analyze their associated genes and transcription factors. We select the top 100 miRNAs for each of the four subtypes based on hazard ratio and p-value, and then identify 44 miRNAs that are related to all four subtypes, which we call four-star miRNAs. In addition, 12, 14, 9, and 15 subtype-specific miRNAs, which we call one-star miRNAs, are identified. The resulting miRNAs are validated using machine learning methods that differentiate tumor cases from controls (for four-star miRNAs) and subtypes from one another (for one-star miRNAs). The four-star miRNAs yield 95% average accuracy, while the one-star miRNAs achieve 81% accuracy for the HER2-Enriched subtype. Differences in miRNA expression between cancer stages are also analyzed, and a subset of eight miRNAs is found whose expression is increased in stage II relative to stage I, including hsa-miR-10b-5p, which contributes to breast cancer metastasis. Subsequently, we construct regulatory networks to identify the interactions among the miRNAs, their target genes, and the transcription factors (TFs) that target those miRNAs. In this way, key regulatory circuits are identified in which genes such as TP53, ESR1, BRCA1, and MYC, which are known to be important genetic factors in breast cancer, produce transcription factors that target the same genes and interact with the selected miRNAs. To provide further biological validation, protein-protein interaction (PPI) networks are constructed, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and gene ontology (GO) enrichment analyses are performed. Many of the enriched pathways are breast cancer-related, such as the PI3K-Akt and p53 signaling pathways, and contain proteins such as TP53 that are also present in the regulatory networks. Moreover, we find that the genes are enriched in GO terms associated with breast cancer. Our results provide a detailed analysis of the selected miRNAs and their regulatory networks.
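The four-star and one-star grouping described above can be pictured as simple set operations over the per-subtype top lists. The following is only a minimal sketch under stated assumptions, not the authors' implementation: it assumes a one-star miRNA is one that appears in exactly one subtype's list, it omits the ranking by hazard ratio and p-value, and the function name `classify_mirnas`, the subtype keys, and the miRNA identifiers are hypothetical placeholders.

```python
from collections import Counter


def classify_mirnas(top_by_subtype):
    """Group miRNAs by how many per-subtype top lists they appear in.

    top_by_subtype: dict mapping a subtype name to its list of top-ranked
    miRNAs (hypothetical input; the ranking itself is not shown here).
    Returns (four_star, one_star): the set of miRNAs present in every
    list, and a dict of miRNAs found in exactly one subtype's list.
    """
    # Count, for each miRNA, the number of subtype lists containing it.
    counts = Counter(
        mirna for mirnas in top_by_subtype.values() for mirna in set(mirnas)
    )
    n_subtypes = len(top_by_subtype)

    # Present in all lists: "four-star" when there are four subtypes.
    four_star = {m for m, c in counts.items() if c == n_subtypes}

    # Assumption: "one-star" means present in exactly one subtype's list.
    one_star = {
        subtype: {m for m in set(mirnas) if counts[m] == 1}
        for subtype, mirnas in top_by_subtype.items()
    }
    return four_star, one_star


# Toy input with placeholder names (not real results from the study).
toy_lists = {
    "subtype_A": ["miR-x1", "miR-x2", "miR-x3"],
    "subtype_B": ["miR-x1", "miR-x2", "miR-x4"],
    "subtype_C": ["miR-x1", "miR-x5", "miR-x3"],
    "subtype_D": ["miR-x1", "miR-x6", "miR-x4"],
}
four_star, one_star = classify_mirnas(toy_lists)
```

In this toy input only `miR-x1` occurs in all four lists, while `miR-x5` and `miR-x6` are each unique to a single subtype; miRNAs shared by two or three lists fall into neither group.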