Abstract

Mucosal barrier dysfunction and aberrant mucin expression are major hallmarks in the pathophysiology of inflammatory bowel disease (IBD). Mucins are highly polymorphic, and genetic differences can alter gene expression, resulting in several mRNA isoforms via alternative splicing. While most isoforms encode similar biological functions, others alter protein function, potentially resulting in progression towards disease. To date, little attention has been paid to the importance of mucin mRNA isoforms in IBD. The aim of our study is to investigate the potential of mucin mRNA isoforms as novel biomarkers for the evaluation of IBD activity and subtypes. To this end, RNA was extracted from colonic and terminal ileal biopsies of IBD patients who underwent endoscopy at the Antwerp University Hospital (UZA). Additionally, patients without a history of IBD who underwent endoscopy because of a positive immunochemical fecal occult blood test (iFOBT) and showed no endoscopic abnormalities were included as controls. Library preparation was performed with the PacBio Iso-Seq multiplex protocol adapted for targeted transcriptome sequencing. Targeted capture was accomplished with a custom-designed pool of probes developed to capture all mucin gene transcripts. Samples were sequenced on the PacBio Sequel platform at the University of Antwerp. The data were analyzed with the isoseq3 pipeline, followed by additional filtering with SQANTI3. In total, 106 biopsies were sequenced. The resulting intestinal mucin transcriptome was merged with the human reference transcriptome. Illumina bulk RNA sequencing data from over 2000 intestinal biopsies (GEO dataset GSE193677) were then mapped to this combined mucin transcriptome to determine mucin isoform expression. An external dataset (GEO dataset GSE165512) was used for additional validation.
A random forest classifier was trained on these data to distinguish inflamed IBD patients from non-inflamed control patients based on mucin isoform expression alone. The model performed well on the training and test datasets, but its performance decreased on the external validation dataset (Table 1). Dividing the samples by disease phenotype greatly increased performance on the external validation dataset (Table 1). When trained only on ileal biopsies, the model was excellent at distinguishing Crohn’s disease patients from controls (Table 1). Classification of inflamed ulcerative colitis versus control patients based only on biopsies from the distal colon performed similarly (Table 1). In summary, our machine learning model was able to distinguish Crohn’s disease from control patients and ulcerative colitis from control patients based on mucin mRNA isoform expression. In addition, our data suggest that mucin isoform expression may vary between different regions within the colon.

Table 1. Area under the receiver operating characteristic curve (AUCROC) of a random forest classifier trained on mucin isoform expression of inflamed biopsies from IBD patients and biopsies from non-inflamed control patients.
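The AUCROC used as the performance metric can be read as the probability that the classifier assigns a higher score to a randomly chosen inflamed IBD biopsy than to a randomly chosen control biopsy, with ties counted as one half. The following is a minimal sketch of that computation, assuming binary labels and per-sample classifier scores. The function name `auroc` and the toy labels and scores are hypothetical and purely illustrative; they are not taken from the study or from its Table 1.

```python
def auroc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of classifier scores, higher = more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = inflamed IBD biopsy, 0 = non-inflamed control.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
toy_auc = auroc(toy_labels, toy_scores)
```

In this toy case, 8 of the 9 positive-negative pairs are ranked correctly, so `toy_auc` equals 8/9. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two groups.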
