Background: An in-depth understanding of the key molecules and mechanisms involved in the carcinogenesis, proliferation, and relapse of acute myeloid leukemia (AML) is critical, because it provides a basis for disease screening, early diagnosis, prognosis, and the development of effective treatment strategies.

Methods: We downloaded AML transcriptome data sets from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. Differentially expressed genes (DEGs) were screened with the limma package in R. Gene Ontology (GO) functional enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were performed on the DEGs using public databases. Within the DEG set, a random forest algorithm was used to identify characteristic genes of AML. The receiver operating characteristic (ROC) curve was used to evaluate the diagnostic efficacy of the selected characteristic genes, providing clues for the discovery of early diagnostic markers. The Estimate Score was calculated using the Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data (ESTIMATE) algorithm. Spearman's correlation test was used to explore the correlation between the characteristic genes and the Estimate Score, providing clues to the potential pathogenic mechanism of the key genes.

Results: A total of 1,494 DEGs were identified between AML samples and normal samples, of which 1,181 were upregulated and 313 were downregulated in AML. Two genes, CDC20 and ESM1, had a mean decrease Gini >2. The ROC curve showed that the area under the curve (AUC) of CDC20 was 0.966 (95% confidence interval [CI]: 0.939 to 0.987; P<0.001) and that the AUC of ESM1 was 0.905 (95% CI: 0.849 to 0.953; P<0.001). Correlation analysis showed that CDC20 expression was negatively correlated with the Estimate Score in AML (R=−0.21, P=0.0036), as was ESM1 expression (R=−0.57, P<0.001).

Conclusions: CDC20 and ESM1 were identified as AML characteristic genes by the random forest algorithm. Both have good diagnostic efficacy for AML and are potential biological markers. They may play a carcinogenic role by promoting tumor cell proliferation and inhibiting immune cell chemotaxis.
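
The ROC and Spearman steps described in the Methods can be illustrated with a minimal, standard-library Python sketch. It is not the authors' pipeline (the study used R), and it does not cover the limma, random forest, or ESTIMATE steps. The toy values, the function names, and the percentile bootstrap used for the confidence interval are all assumptions made for illustration; the abstract does not state how the 95% CI was computed.

```python
import random
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def roc_auc(labels, scores):
    """AUC from the Mann-Whitney U statistic (labels: 1 = AML, 0 = normal)."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int(alpha / 2 * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data, NOT taken from the study.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]                  # 0 = normal, 1 = AML
toy_expr = [1.0, 2.0, 3.0, 5.0, 4.0, 6.0, 7.0, 8.0]    # expression of one gene
toy_auc = roc_auc(toy_labels, toy_expr)                # 15 of 16 pairs ordered correctly
toy_ci = bootstrap_auc_ci(toy_labels, toy_expr, n_boot=200)

toy_gene = [1.0, 2.0, 3.0, 4.0, 5.0]                   # expression in tumor samples
toy_estimate = [5.0, 4.0, 3.0, 2.0, 1.0]               # made-up Estimate Scores
toy_rho = spearman(toy_gene, toy_estimate)             # perfectly inverse ranking
```

In this sketch an AUC near 1 means the gene's expression separates AML from normal samples almost perfectly, and a negative rho corresponds to the inverse relationship reported between gene expression and the Estimate Score.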