Abstract

Background: Acute myeloid leukaemia (AML) is a highly heterogeneous disease with great diversity in clinical features and in patient response to treatment. Despite recent improvements in disease understanding, treatments have remained unchanged for 30 years. There is therefore a need to characterise the disease, and the molecular subtypes underpinning it, more fully. The re-use of published and validated prognostic and predictive gene signatures, e.g. Cancer Hallmarks, presents an invaluable in silico opportunity to uncover the biological mechanisms underpinning treatment response in AML.

Aims: To develop an automated statistical analysis pipeline, combined with machine learning techniques, to derive robust patient clusters representative of novel molecular AML subtypes that are significantly associated with clinical variables and survival outcomes.

Methods: Analysis was carried out using a primary dataset (AML-OHSU: 451 AML patient samples) and a validation dataset (TCGA-LAML: 200 AML patient samples). Both datasets had been processed using Almac's claraT platform, a software-driven solution which provides a comprehensive overview of tumour profiles using gene expression signatures. An automated analytical pipeline was developed using Consensus Clustering (CC), a method that determines the number and membership of potential clusters within a dataset, using different combinations of 8 distances and 7 linkages. Robustness was tested via bootstrapping. This pipeline was used to stratify patients via a dimension reduction approach whereby clustering was performed on 210 gene signatures categorised by 10 different hallmarks of cancer. Further analysis was performed to identify clinical associations within robust clusters that were linked to differences in survival.

Results: The automated clustering pipeline analysed a total of 1,314 stable clusters across 10 cancer hallmarks in the AML-OHSU dataset. Stable clusters were subsequently processed via log rank analysis (overall survival (OS) right-censored at 60 months), identifying 134 stable clusters with significant differences (p-value <0.05) in survival outcome. Stable clusters with significant survival differences were tested against 32 clinical categorical variables present in the AML-OHSU dataset. The results were filtered using a significance threshold (chi square p-value <0.05 and Benjamini-Hochberg (BH) p-value <0.2). Gene signatures representative of the Energetics hallmark, incorporating 22 signatures, were among the most frequently clustered throughout our results, ranking highest where K=3. A significant difference in overall survival probability (log rank p-value: 0.033) was found between clusters (Energetics, K=3). Patients in the poorest survival cluster were characterised by a refractory induction response to treatment, the lowest number of fusions, a low frequency of NPM1 mutations and a high proportion of patients above the age of 65. To validate the Energetics results from the AML-OHSU dataset, CC was again performed using gene signatures from the TCGA-LAML dataset that were representative of the Energetics hallmark. A significant difference in overall survival probability (log rank p-value: 0.019) was again found between stable clusters of the Energetics hallmark (K=3).

Summary/Conclusion: We have demonstrated that the novel analytical pipeline developed here to analyse Hallmark-related gene signatures can aid in the discovery of new molecular subtypes in AML associated with prognosis. We have subsequently validated these in an independent dataset. The Energetics hallmark has not, to our knowledge, been linked with AML prognosis before, and may suggest novel biology linked to treatment response.
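The clinical-association filter described in the Results (chi square p-value <0.05 and BH p-value <0.2) can be illustrated with a minimal sketch. This is not the authors' pipeline code: the function names `bh_adjust` and `filter_associations`, the variable labels and the example p-values are all hypothetical, and the sketch assumes the chi square p-values have already been computed for each clinical variable.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def filter_associations(results, raw_alpha=0.05, bh_alpha=0.2):
    """Keep (label, raw_p, bh_p) triples that pass both thresholds.

    `results` is a list of (label, raw_p) pairs; thresholds default to
    the values quoted in the abstract.
    """
    raw = [p for _, p in results]
    adj = bh_adjust(raw)
    return [
        (label, p, q)
        for (label, p), q in zip(results, adj)
        if p < raw_alpha and q < bh_alpha
    ]


# Hypothetical chi square p-values for four placeholder clinical variables.
example = [("var_a", 0.01), ("var_b", 0.04), ("var_c", 0.03), ("var_d", 0.5)]
kept = filter_associations(example)
```

Here `kept` holds only the placeholder variables that satisfy both thresholds. Applying the raw cut-off and the BH cut-off together mirrors the two-part filter quoted above. Whether the original analysis adjusted across all 32 variables at once or within each cluster solution is not stated, so the grouping of p-values passed to `bh_adjust` is an assumption.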
