Abstract

Background
Inflammatory bowel disease (IBD) has been shown to be associated with alterations in the intestinal microbiome. However, the precise nature of these microbial changes remains unclear. With billions of microbes present within the gut, novel and powerful computational techniques are required to identify the relevant shifts in microbiota contributing to the disease. Machine learning (ML) allows a data-driven approach to identify these discrete dynamic changes, while the findings of the ML algorithms can be interpreted using systems biology (SB) techniques. By combining ML and SB approaches, we aim to characterise key microbial factors in IBD pathogenesis and distinct patterns of variability in a diverse patient cohort, and to provide a method for patient stratification.

Methods
The causal relationship between changes in the gut microbiome and IBD is difficult to establish. Data from cross-sectional studies are plagued by confounding factors and inconsistencies between cohorts. To overcome this, we used rich longitudinal datasets and integrated metagenomic, multi-omic and clinical patient data. This workflow has been validated using large longitudinal IBD databases, including data from IBDMDB. We assessed the performance of the ML models using well-documented performance metrics to ensure that the outcomes were robust.

Results
As a baseline, we used multiple ML models to predict disease type (ulcerative colitis (UC), Crohn's disease (CD) and non-IBD) from integrated multi-omics profiles. We analysed multiple ML techniques, including linear models (e.g. linear mixed model), non-linear models (e.g. Random Forest), time-series models (e.g. Rotation Forest) and deep learning models (e.g. long short-term memory network model). We identified the models that offer the flexibility to analyse the dynamic nature of the microbiome and to integrate the microbiome data with clinical patient data. The trade-off for this greater flexibility was reduced model performance in identifying specific metagenomic features that could be used as biomarkers. However, we were able to identify connections between microbial and host proteins relevant to IBD and to stratify these by the patient's metagenomic data.

Conclusion
We have developed an integrated ML-based microbiome analysis pipeline to identify biomarkers for IBD from longitudinal metagenomic data. Furthermore, using a variety of SB approaches, we were able to interpret the predicted key microbial features and communities by inferring connections between microbial and host proteins. This pipeline will enable us to analyse vast amounts of patient microbiome data in the context of clinical and metagenomic data, allowing the identification of biomarkers for disease subtypes.
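
The abstract does not say which performance metrics were used or how the longitudinal samples were split for evaluation. As an illustration only, the sketch below shows one common way to evaluate a three-class disease-type prediction (UC, CD, non-IBD) on repeated-measures data: all samples from a patient are kept in the same fold, and predictions are scored with a macro-averaged F1. The helper names `grouped_folds` and `macro_f1`, the fold-assignment rule and the choice of metric are assumptions, not details taken from the study.

```python
CLASSES = ("UC", "CD", "non-IBD")  # disease types named in the abstract


def grouped_folds(patient_ids, n_folds=3):
    """Return one fold index per sample, with all samples of a patient in one fold.

    Hypothetical helper: patients are sorted and dealt round-robin into folds,
    so repeated samples from one patient never sit on both the training and
    the test side of the same split.
    """
    unique_patients = sorted(set(patient_ids))
    fold_of_patient = {p: i % n_folds for i, p in enumerate(unique_patients)}
    return [fold_of_patient[p] for p in patient_ids]


def macro_f1(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of per-class F1 scores (an assumed choice of metric)."""
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true and no predicted samples contributes 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Macro-averaging weights the three disease types equally, which may matter when non-IBD controls are a minority of the cohort. Whether that weighting suits a given dataset is a judgment call.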
