Abstract

Biliary atresia (BA) is a devastating, progressive fibroinflammatory disorder in infants. The exact etiology of BA is still unclear. This study aimed to screen key genes potentially associated with the occurrence of BA. All BA data were obtained from the GSE46960 dataset. The limma package in R was used for differentially expressed gene (DEG) analysis. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed on the screened DEGs using the "clusterProfiler" package. A protein-protein interaction (PPI) network was built using STRING and Cytoscape software (Bethesda, Rockville, MD). A logistic regression model was constructed based on the selected DEGs. A total of 78 DEGs were identified in BA samples compared with normal samples; these were significantly enriched in 200 biological process terms, 37 molecular function terms, 17 cellular component terms, and 18 KEGG pathways. Among these DEGs, the 10 genes with the highest importance in the PPI network were selected. Subsequently, using stepwise regression and 5-fold cross-validation, a logistic regression model based on COL3A1, CXCL8, VCAN, THBS2, and COL1A2 was shown to predict BA samples relatively reliably. In conclusion, COL3A1, CXCL8, VCAN, THBS2, and COL1A2 are potentially crucial genes in BA, and the logistic regression model constructed from them could predict BA samples relatively reliably.
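
The modeling step described above (a logistic regression on five genes, evaluated by 5-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' pipeline: the study used R, and the stepwise selection step is omitted here. The expression values are synthetic and standardized, not taken from GSE46960, and all function names (make_synthetic_samples, fit_logistic, cross_validate) are hypothetical helpers introduced only for illustration.

```python
import math
import random

# Gene names come from the abstract; the data generated below are synthetic.
GENES = ["COL3A1", "CXCL8", "VCAN", "THBS2", "COL1A2"]


def make_synthetic_samples(n_per_class=50, seed=0):
    """Build toy (features, label) pairs; assumption: NOT real GSE46960 data."""
    rng = random.Random(seed)
    samples = []
    # Label 0 = normal, label 1 = BA; class means are arbitrary illustrative values.
    for label, mean in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            features = [rng.gauss(mean, 1.0) for _ in GENES]
            samples.append((features, label))
    rng.shuffle(samples)
    return samples


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(samples, lr=0.1, n_iter=300):
    """Fit weights and bias by full-batch gradient descent on the log-loss."""
    n_features = len(samples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(samples)
    for _ in range(n_iter):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in samples:
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_b += err
            for j, v in enumerate(x):
                grad_w[j] += err * v
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (BA) if the predicted probability is at least 0.5, else 0."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1 if sigmoid(z) >= 0.5 else 0


def cross_validate(samples, k=5):
    """Return the held-out accuracy of each of the k folds."""
    folds = [samples[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        weights, bias = fit_logistic(train)
        correct = sum(predict(weights, bias, x) == y for x, y in test)
        scores.append(correct / len(test))
    return scores


if __name__ == "__main__":
    scores = cross_validate(make_synthetic_samples())
    print("per-fold accuracy:", scores)
    print("mean accuracy:", sum(scores) / len(scores))
```

Each sample is held out exactly once, so the mean of the per-fold accuracies estimates how well the model generalizes to unseen samples. With real expression data, the features would be the measured expression levels of the five genes rather than simulated values.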
