Background
The homeobox (HOX) family consists of 39 genes that act as master regulators of embryonic development. Each of the genes is also known to play key roles in the progression of breast cancer, including epithelial-to-mesenchymal transition, tumor angiogenesis and endocrine therapy resistance. Although there are numerous reports on individual HOX genes and cancer, none has comprehensively analyzed the whole gene family. Because HOX genes are tightly coordinated within the family during the embryonic period, we considered analysis of the whole HOX family to be indispensable in breast cancer as well.

Methods
We collected data for 702 breast cancer samples from four publicly available array datasets (GSE11121, GSE7390, GSE3494, GSE2990) and performed unsupervised hierarchical clustering into two clusters based on the expression of HOX genes. We constructed model formulas for cluster prediction by dividing the samples into learning and validation groups. We used three machine learning methods: support-vector machine (SVM), neural network and Bayes. The model formulas were validated on the validation samples. We also used data for 512 TCGA breast cancer samples to calculate the covariation of the genes in breast cancer.

Results
In the clustering of the four array datasets, disease-free survival (DFS) differed significantly between the two clusters in PAM50-classified luminal B patients (p = 0.016), and gene ontology analysis revealed that the Wnt pathway was activated in the poor-prognosis cluster. All cluster prediction models for luminal B samples achieved accuracies of over 90%. In the TCGA breast cancer data, we found that HOX genes covary most strongly with other HOX genes, especially within chromosomally proximal groups.

Conclusions
Comprehensive analysis of the whole HOX family led to the prediction of luminal B breast cancer prognosis.
Considering that Wnt signaling controls HOX genes during the embryonic stage, we propose that luminal B breast cancer contains a Wnt pathway-activated, poor-prognosis subgroup that can be identified by the expression of HOX genes. The machine learning cluster prediction models performed well enough to be considered for future adaptation to clinical settings. We also showed that HOX genes covary strongly within the gene family in cancer, not only during the embryonic stage.

Legal entity responsible for the study
The authors.

Funding
Has not received any funding.

Disclosure
All authors have declared no conflicts of interest.
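
The learning/validation step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the software, the split ratio or which Bayes method was used, so the sketch assumes a Gaussian naive Bayes classifier, a 50/50 split, and synthetic expression values for 39 genes in which the two cluster labels are assigned directly rather than obtained by hierarchical clustering. All function and variable names (`fit_gaussian_nb`, `predict`, `accuracy`, `make_sample`) are hypothetical.

```python
import math
import random


def fit_gaussian_nb(X, y):
    """Estimate class priors and per-gene mean/variance from the learning group."""
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        n_genes = len(rows[0])
        means = [sum(r[j] for r in rows) / len(rows) for j in range(n_genes)]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((r[j] - means[j]) ** 2 for r in rows) / len(rows) + 1e-9
            for j in range(n_genes)
        ]
        model[label] = (len(rows) / len(X), means, variances)
    return model


def predict(model, x):
    """Return the cluster label with the highest log posterior for one sample."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for value, mu, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def accuracy(model, X, y):
    """Fraction of samples whose predicted cluster matches the given label."""
    hits = sum(predict(model, x) == lab for x, lab in zip(X, y))
    return hits / len(y)


# Synthetic stand-in data (assumption): 200 samples, 39 genes, two clusters
# whose mean expression differs by a fixed shift.
rng = random.Random(0)
N_GENES = 39


def make_sample(cluster):
    shift = 1.5 if cluster == 1 else 0.0
    return [rng.gauss(shift, 1.0) for _ in range(N_GENES)]


labels = [i % 2 for i in range(200)]
samples = [make_sample(lab) for lab in labels]

# Divide the samples into learning and validation groups (50/50 is an assumption).
order = list(range(len(samples)))
rng.shuffle(order)
split = len(order) // 2
learn_idx, valid_idx = order[:split], order[split:]

model = fit_gaussian_nb(
    [samples[i] for i in learn_idx], [labels[i] for i in learn_idx]
)
validation_accuracy = accuracy(
    model, [samples[i] for i in valid_idx], [labels[i] for i in valid_idx]
)
print(f"validation accuracy: {validation_accuracy:.2f}")
```

The point of the sketch is that the model is fitted on the learning group only and its accuracy is reported on samples it has not seen. With real array data, the cluster labels would come from the clustering step, and an SVM or neural network could be swapped in for the classifier.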