Abstract

Background: This study aimed to screen the feature modules and characteristic genes related to ulcerative colitis (UC) and to construct a support vector machine (SVM) classifier to distinguish UC patients.

Methods: Four datasets containing UC and control samples were obtained from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) with consistency across datasets were screened via the MetaDE method. Weighted gene coexpression network analysis (WGCNA) was used to identify significant modules based on the four datasets. A protein–protein interaction network was established based on the intersection genes. Enrichment analyses of Gene Ontology (GO) biological processes (BPs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were performed using DAVID. An SVM combined with recursive feature elimination was applied to construct a classifier for the diagnosis of UC. The efficacy of the SVM classifier was evaluated using receiver operating characteristic curves.

Results: Twelve highly preserved modules were obtained using WGCNA, and 2009 DEGs with significant consistency were selected using the MetaDE method. Sixteen significantly related GO BPs and 12 KEGG pathways were obtained, such as cytokine-cytokine receptor interaction, cell adhesion molecules, and leukocyte transendothelial migration. Subsequently, 41 genes, including CXCL1, CCR2, IL1B, and IL1A, were used to construct an SVM classifier. The area under the curve (AUC) was 0.999 in the training dataset and 0.886, 0.790, and 0.819 in the validation datasets (GSE65114, GSE37283, and GSE36807, respectively).

Conclusions: An SVM classifier based on feature genes might correctly distinguish UC patients from healthy individuals.
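
The classifier step described above (an SVM combined with recursive feature elimination, evaluated by ROC AUC) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes Python with scikit-learn, uses a synthetic matrix in place of real expression data, and the sample count, feature count, and number of retained features are arbitrary choices made only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for an expression matrix (samples x genes).
# Assumption: these are NOT real UC data; all sizes are illustrative.
X, y = make_classification(
    n_samples=200, n_features=100, n_informative=10, random_state=0
)
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Recursive feature elimination around a linear SVM: at each round the
# features with the smallest absolute weights are dropped (5 per round)
# until 10 remain. The value 10 is a hypothetical target.
selector = RFE(estimator=SVC(kernel="linear"), n_features_to_select=10, step=5)
selector.fit(X_train, y_train)
selected = np.flatnonzero(selector.support_)

# Refit a linear SVM on the retained features only.
clf = SVC(kernel="linear")
clf.fit(X_train[:, selected], y_train)

# ROC AUC is computed from the signed distance to the decision boundary.
auc_train = roc_auc_score(y_train, clf.decision_function(X_train[:, selected]))
auc_valid = roc_auc_score(y_valid, clf.decision_function(X_valid[:, selected]))
print(f"features kept: {len(selected)}")
print(f"training AUC: {auc_train:.3f}, validation AUC: {auc_valid:.3f}")
```

As in the study, the training AUC in such a setup is typically higher than the AUC on held-out data, which is why performance on independent validation datasets is the more informative figure.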

Highlights

  • This study aimed to screen the feature modules and characteristic genes related to ulcerative colitis (UC) and construct a support vector machine (SVM) classifier to distinguish UC patients

  • Biasci et al [3] reported that genes from the best classifiers are optimized by quantitative polymerase chain reaction and the best qPCR classifier is distinguished using further machine learning, which could evaluate the prognosis of newly diagnosed UC patients

  • Because the datasets may carry different degrees of bias, MetaQC was first used for objective quality control, combined with a two-dimensional principal component analysis (PCA) map and the standardized mean rank, to evaluate and screen the datasets


Summary

Introduction

This study aimed to screen the feature modules and characteristic genes related to ulcerative colitis (UC) and to construct a support vector machine (SVM) classifier to distinguish UC patients. Zhang et al [7] reported that IL6, PTPRC, CXCL8, IL1B, and MMP9 might be key genes that could serve as markers for the early diagnosis and treatment of UC. Yan et al [9] found 11 mutated genes that were differentially expressed in UC samples, including APC, APOB, MECP2, NCOR2, and USP48.
