Abstract Disclosure: K. Stefanakis: None. E. Axarloglou: None. C.S. Mantzoros: None. Metabolic dysfunction-associated steatotic liver disease (MASLD) is characterized by intrahepatic lipid accumulation, may precipitate steatohepatitis (MASH) and liver fibrosis, and is a significant cardiometabolic risk factor. Liver biopsy, the diagnostic gold standard, is invasive and costly, creating an unmet clinical need for accurate and reliable non-invasive methods to diagnose MASLD, particularly the combination of MASH and significant fibrosis, which is classified as “at-risk MASH” and is known to markedly increase morbidity and mortality. We aimed to leverage novel machine-learning (ML) algorithms to create lightweight but highly precise models that may accurately diagnose at-risk MASH and related histological outcomes through a top-down approach. This is a study of 443 participants from two Gastroenterology-Hepatology Clinics (Greece, Australia) and one Metabolic Surgery Clinic (Italy), including healthy controls and MASLD patients across the entire spectrum of the disease. All participants underwent liver biopsy, anthropometric, standard biochemical, and additional hormonal measurements, and global serum metabolomics using ultra-high-performance liquid chromatography-tandem mass spectrometry, followed by robust batch correction and quality control, which identified 687 known and 152 unknown metabolites. The dataset was randomly divided into a discovery cohort (n=353) and a validation cohort (n=90), ensuring a balanced distribution of at-risk MASH and clinic. Our main ML pipeline implemented Categorical Gradient Boosting Machines using the CatBoost package for R. We pre-specified domain knowledge variables and implemented recursive feature elimination, among additional techniques, to build models through an iterative, algorithm-based process.
The models were tested in the validation and entire cohort using k-fold and leave-one-out cross-validation with random resampling, and further assessed on relevant subgroups (MASLD status, obesity, diabetes, clinic, country). The main ML model consists of three routine clinical and biochemical variables, supplemented by two amino acid metabolites (full model to be shown), and can detect the presence of at-risk MASH with a mean AUROC of 93% and mean accuracy of 92% in the validation cohort, and AUCs >93% with very high sensitivities and specificities per J-statistic in all subgroups. We likewise present ML models for the detection of MASLD, MASH, and fibrosis grades, with AUCs of over 90% in the validation cohort and all subgroups. Our models significantly outperform all applicable biomarker-based indices per DeLong’s tests. Further enhancement with lipidomic and proteomic variables will fully delineate the circulating biochemical snapshot of MASLD and likely lead to even more accurate, similarly structured models. Presentation: 6/1/2024
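The two evaluation quantities named above, AUROC and the J-statistic cutoff, can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which used CatBoost for R. The labels and scores are made-up toy values, not study data, and the function names `auroc` and `youden_cutoff` are hypothetical helpers introduced only for this illustration. The sketch assumes the J-statistic is Youden's J, defined as sensitivity + specificity - 1.

```python
from typing import List, Tuple


def auroc(labels: List[int], scores: List[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def youden_cutoff(labels: List[int], scores: List[float]) -> Tuple[float, float, float, float]:
    """Scan every observed score as a threshold (predict positive when
    score >= threshold) and return (threshold, J, sensitivity, specificity)
    for the first threshold that maximizes J = sensitivity + specificity - 1."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    best = (float("nan"), -1.0, 0.0, 0.0)
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1.0
        if j > best[1]:  # strict comparison keeps the lowest optimal threshold
            best = (t, j, sens, spec)
    return best


# Toy example (hypothetical values): 1 = at-risk MASH, 0 = not at-risk.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
toy_scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.70, 0.85, 0.90]

if __name__ == "__main__":
    print("AUROC:", auroc(toy_labels, toy_scores))
    print("Youden cutoff (threshold, J, sens, spec):",
          youden_cutoff(toy_labels, toy_scores))
```

The pairwise AUROC is quadratic in sample size, which is acceptable for a cohort of a few hundred participants; a rank-based formula would scale better on larger data.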