Acute myeloid leukemia (AML) is an aggressive hematopoietic malignancy with a poor overall survival rate. AML is a highly heterogeneous disease driven by combinations of genomic mutations, epigenetic alterations, and biochemical signaling. This heterogeneity calls for novel approaches to individualized clinical assessment and treatment selection. However, approaches that rely on differential expression or correlation analysis lack an underlying theoretical model of the disease to inform the interpretation of ongoing processes in each individual, which is crucial for personalized diagnostics and treatment selection. We have previously shown that state-transition (ST) theory, applied in a transcriptome state-space, could predict AML disease evolution using mRNA data from a conditional Cbfb-MYH11 (CM) knock-in mouse model. The ST model represents AML evolution from health to disease as a state transition of the transcriptome, described as a particle undergoing Brownian motion in a double-well quasi-potential, where critical points define the states of health (c1), transition to AML (c2), and overt AML (c3). We integrated two approaches, the ST model and surprisal analysis (SA) from thermodynamics, to create a patient-specific characterization that can be used to identify the individualized state of the disease. Publicly available human datasets, including normal samples and AML samples from bone marrow (BM) and peripheral blood (PB), were used. A total of 858 samples from three publicly available RNA-seq datasets were analyzed. The Therapeutically Applicable Research to Generate Effective Treatments (TARGET) dataset is a pediatric study, from which we analyzed 84 normal samples and 126 AML inv(16) samples, consisting of 105 BM and 21 PB samples, with ages from 0.3 to 28 years. Two additional datasets, from the BEATAML study and The Cancer Genome Atlas (TCGA), were downloaded from the Genomic Data Commons (GDC) portal using GDCRNATools.
The BEATAML dataset included 21 normal and 476 AML samples, consisting of 226 PB and 243 BM samples, with ages from 2 to 87 years, and the TCGA dataset contained 151 AML PB samples, with ages from 21 to 88 years. We used a pseudotime approach based on KIT and CD33 expression to infer disease dynamics for the ST model. We found via t-test analysis that KIT expression differed significantly between normal and AML samples in both TARGET (p<0.01) and BEATAML (p<0.01). CD33 expression differed significantly between normal and AML samples in BEATAML (p=0.012), but not in TARGET (p=0.30). We combined the ST model with SA to analyze the free energy changes (FEC) that occurred at the ST critical points, which predicted AML progression as normal hematopoiesis (c1), transition to AML (c2), and overt AML (c3). Significant differences in FEC between critical points were observed by t-test for c1 vs. c2 (p<0.001) and c1 vs. c3 (p<0.001) in the TARGET dataset, and for c2 vs. c3 (p<0.01) in the BEATAML dataset. Our results showed that the transition state, c2, had higher FEC than the normal state, c1, suggesting that samples in transition were less stable from a thermodynamic perspective. Certain samples at the more advanced state, c3, exhibited a trend toward decreasing FEC, suggesting state stabilization at more advanced stages of the disease. SA and Gene Ontology analysis suggested that interleukin pathways such as IL1, IL2, IL8, and IL10, as well as MAPK, NFkB, and migration-related pathways, which are known to be involved in AML progression, were present in multiple unbalanced processes characterizing AML across all three datasets. The IL1 pathway was found in three unbalanced processes characterizing the TARGET dataset and in four processes characterizing TCGA and BEATAML. IL2-related categories were found among induced transcripts in four unbalanced processes characterizing the TARGET dataset, in six characterizing TCGA, and in four characterizing BEATAML.
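As an illustration of the group comparisons reported above, the following minimal sketch computes a two-sample t statistic for FEC values at two critical points. It assumes Welch's unequal-variance form, which the abstract does not specify; the function name `welch_t` and the FEC values are hypothetical and are not taken from the study data.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Assumption: unequal-variance (Welch) t-test; the study may have
    used a different variant.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Hypothetical FEC values (arbitrary units) for samples near c1 and c2.
fec_c1 = [0.8, 1.1, 0.9, 1.0, 1.2]
fec_c2 = [1.9, 2.3, 2.1, 1.8, 2.4]

t_stat, dof = welch_t(fec_c1, fec_c2)
print(f"t = {t_stat:.3f}, dof = {dof:.2f}")
```

Converting the statistic to a p-value needs the t-distribution CDF, which the Python standard library does not provide; a statistics package would typically be used for that step.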
We observed that AML samples with comparable FEC levels or clinical characteristics could be defined by different sets of unbalanced processes, implying that this information might be used to tailor therapies within subgroups of individuals with similar clinical or thermodynamic characteristics. We showed that mapping FEC into an AML state-space defined by the ST critical points could provide a high-resolution, patient-specific disease characterization, and that these methods could be used in personalized diagnostics and individualized treatment procedures.
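The double-well picture behind the ST critical points can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: it assumes a one-dimensional quartic quasi-potential whose gradient vanishes at c1, c2, and c3, and it integrates overdamped Brownian motion with the Euler-Maruyama scheme. All function names, parameter values, and the positions chosen for the critical points are hypothetical.

```python
import math
import random


def potential_gradient(x, c1, c2, c3, alpha=1.0):
    """dU/dx for a quartic double-well U with stationary points c1 < c2 < c3."""
    return alpha * (x - c1) * (x - c2) * (x - c3)


def potential_curvature(x, c1, c2, c3, alpha=1.0):
    """d2U/dx2; positive at the wells (c1, c3), negative at the barrier (c2)."""
    return alpha * (
        (x - c2) * (x - c3) + (x - c1) * (x - c3) + (x - c1) * (x - c2)
    )


def simulate_trajectory(x0, c1, c2, c3, diffusion=0.05, dt=0.01,
                        steps=5000, seed=0):
    """Euler-Maruyama integration of dx = -U'(x) dt + sqrt(2 D) dW."""
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(steps):
        drift = -potential_gradient(x, c1, c2, c3) * dt
        noise = math.sqrt(2.0 * diffusion * dt) * rng.gauss(0.0, 1.0)
        x += drift + noise
        path.append(x)
    return path


def side_of_barrier(x, c2):
    """Hypothetical labeling rule: which side of the unstable point c2."""
    return "pre-transition" if x < c2 else "post-transition"


# Hypothetical critical points in an arbitrary 1-D state-space coordinate.
C1, C2, C3 = -1.0, 0.0, 1.0
path = simulate_trajectory(x0=C1, c1=C1, c2=C2, c3=C3)
print(side_of_barrier(path[-1], C2))
```

In this toy, c1 and c3 are stable wells and c2 is the unstable barrier between them, which mirrors the roles of healthy, transition, and overt AML states in the text. Mapping a real sample into such a space would require the study's state-space construction, which is not reproduced here.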