Abstract Background The gut microbiome and its metabolites are key to understanding the pathogenesis of gastrointestinal diseases (GID) such as gastric cancer (GC), inflammatory bowel disease (IBD), and colon cancer (CC), and can help identify shared and disease-specific biomarkers for personalised interventions. This study used machine learning (ML) models, metabolic modelling, and network analysis to assess the risk of developing one GID following another, given that IBD patients are at higher risk of CC and GC patients can develop CC. Methods Microbiome and metabolome datasets from Erawijantari et al. (GC), Franzosa et al. (IBD), and Yachida et al. (CC) were analysed with three ML models: eXtreme Gradient Boosting (XGBoost), random forest (RF), and least absolute shrinkage and selection operator (LASSO). The models were refined using Recursive Feature Elimination with Cross-Validation (RFECV) and LASSO. Trained models were applied in within-disease and cross-disease analyses using the microbes and metabolites common to the datasets. Microbial abundance analysis was conducted to identify compositional differences between diseased and healthy individuals. Microbial Community Modelling (MICOM) simulated microbial interactions to evaluate their contributions to metabolite production. Using Weighted Gene Co-expression Network Analysis (WGCNA), a cluster dendrogram was generated to identify modules of highly correlated metabolic features based on co-expression patterns. Results For GC, LASSO achieved the highest AUC (0.96 [0.85-1.00]). For IBD, RF performed best (AUC 0.93 [0.86-0.98]). For CC, XGBoost was the top performer (AUC 0.85 [0.75-0.94]). Biomarkers from the GC model were used to predict CC and IBD outcomes, and biomarkers from the IBD and CC models were similarly tested across diseases, with strong performance scores (Figure 2).
Abundance analysis showed lower levels of Bacteroidaceae in GC than in healthy individuals, and higher levels of Ruminococcaceae and Lachnospiraceae in IBD and CC. MICOM identified six metabolites in GC (adenosine, alanine, glycerophosphate, methionine, methionine sulfoxide, nicotinate), two in IBD (cholate, cytosine), and seven in CC (1-methyladenosine, alanine, glutamate, histidine, lactate, nicotinamide, tryptophan), all differentially produced by the selected microbes. WGCNA revealed four modules for GC (β = 18, R² = 0.91), eight modules for IBD (β = 16, R² = 0.56), and two modules for CC (β = 19, R² = 0.27). Conclusion Despite study limitations, the results suggest that the gut microbiome and its metabolites influence disease progression and may help optimise therapeutic strategies such as faecal microbiota transplantation (FMT), particularly in IBD and CC, potentially addressing both disease prevention and progression.
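The model-refinement step described in the Methods (feature elimination with cross-validation, followed by AUC reporting with an interval) can be illustrated with a minimal sketch. It is not the study's pipeline: the synthetic data, the choice of RF as both selector and classifier, the fold counts, the hypothetical function name `rfecv_auc`, and the bootstrap percentile interval are all assumptions made for illustration, and the abstract does not state how its intervals were computed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import Pipeline


def rfecv_auc(X, y, seed=0, n_boot=500):
    """Cross-validated AUC for an RFECV + random forest pipeline.

    Returns (auc, (lower, upper)), where the interval is a bootstrap
    percentile interval (an assumption, not the study's stated method).
    """
    y = np.asarray(y)

    # Feature selection lives inside the pipeline, so it is refitted on
    # each outer training fold and never sees the held-out samples.
    selector = RFECV(
        estimator=RandomForestClassifier(n_estimators=50, random_state=seed),
        step=5,                      # drop 5 features per elimination round
        cv=3,                        # inner folds used to pick feature count
        scoring="roc_auc",
        min_features_to_select=5,
    )
    model = Pipeline([
        ("select", selector),
        ("clf", RandomForestClassifier(n_estimators=100, random_state=seed)),
    ])

    # Out-of-fold predicted probabilities for the positive class.
    outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    proba = cross_val_predict(
        model, X, y, cv=outer_cv, method="predict_proba"
    )[:, 1]
    auc = roc_auc_score(y, proba)

    # Bootstrap the out-of-fold predictions to get a percentile interval.
    rng = np.random.default_rng(seed)
    n = len(y)
    boots = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        if len(np.unique(y[idx])) < 2:
            continue  # AUC is undefined when only one class is drawn
        boots.append(roc_auc_score(y[idx], proba[idx]))
    lower, upper = np.percentile(boots, [2.5, 97.5])
    return auc, (lower, upper)


if __name__ == "__main__":
    # Synthetic stand-in for a samples-by-features abundance table.
    X, y = make_classification(
        n_samples=120, n_features=30, n_informative=5,
        class_sep=2.0, random_state=0,
    )
    auc, (lower, upper) = rfecv_auc(X, y)
    print(f"AUC {auc:.2f} [{lower:.2f}-{upper:.2f}]")
```

Keeping the selector inside the pipeline is a deliberate choice in this sketch: selecting features on the full dataset before cross-validation would leak information from the held-out folds and inflate the AUC.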