Identification and validation of novel marker genes to predict potential gestational diabetes mellitus patients by WGCNA and machine learning.

Abstract

OBJECTIVE: To identify novel marker genes to predict potential gestational diabetes mellitus (GDM) patients. METHODS: Based on Gene Expression Omnibus (GEO) datasets, the differentially expressed genes (DEGs) between control and GDM groups were identified, followed by enrichment analysis and protein-protein interaction (PPI) network construction. Weighted gene co-expression network analysis (WGCNA) was then conducted to screen the key module genes, and the important genes were obtained. In addition, Least Absolute Shrinkage and Selection Operator (LASSO) logistic regression, Support Vector Machine - Recursive Feature Elimination (SVM-RFE), and random forest (RF) were employed to identify the key genes. Receiver operating characteristic (ROC) analysis was performed to assess the diagnostic efficacy of the key genes, and a nomogram was developed. The correlation between key genes and immune cells was analyzed, and a miRNA-mRNA-TF network was constructed. RESULTS: A total of 257 DEGs were screened between the control and GDM groups; these DEGs were involved in the p53 signaling, cell cycle, and oocyte meiosis pathways. The PPI network comprised 163 nodes and 5502 interactions. After WGCNA and machine learning, 4 key genes were obtained (SNRPD3, NGDN, ANKRD36, and TAS2R20), and a nomogram was then constructed. SNRPD3 was positively correlated with CD8 T cells. The miRNA-mRNA-TF network included 56 miRNAs, 4 mRNAs, and 32 TFs. In addition, luteolin PC3 UP, alsterpaullone PC3 UP, and solanine HL60 UP were associated with NGDN, and MeIQx CTD 00001739 was related to TAS2R20. CONCLUSION: Four key marker genes for predicting potential GDM were identified (SNRPD3, NGDN, ANKRD36, and TAS2R20), and a nomogram was established for predicting potential GDM patients.
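
The abstract above keeps the genes selected by LASSO, SVM-RFE, and RF and then assesses them by ROC analysis. The following is a minimal standard-library Python sketch of those two steps, not the authors' pipeline: the function names, the placeholder gene labels (`G1` to `G6`), and the expression values are all invented for illustration, and taking the intersection of the three selections is an assumption about how the methods were combined.

```python
from typing import Iterable, List, Sequence, Set


def consensus_genes(*selections: Iterable[str]) -> List[str]:
    """Return the genes chosen by every selector, sorted for stable output."""
    sets: List[Set[str]] = [set(s) for s in selections]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Labels are 1 (case) or 0 (control); tied scores count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outputs of the three feature-selection methods (placeholders).
lasso = ["G1", "G2", "G3", "G5"]
svm_rfe = ["G2", "G3", "G4", "G5"]
rf = ["G2", "G3", "G5", "G6"]
key = consensus_genes(lasso, svm_rfe, rf)

# Hypothetical expression of one key gene in six samples (0 = control, 1 = GDM).
expr = [2.1, 3.4, 1.9, 3.9, 3.2, 3.1]
status = [0, 1, 0, 1, 0, 1]
auc = roc_auc(expr, status)

print(key, round(auc, 3))
```

The pairwise definition of AUC used here is equivalent to the area under the empirical ROC curve and avoids any plotting or third-party dependency. It is quadratic in the number of samples, so a rank-based formula would be preferable for large cohorts.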

Similar Papers
  • Research Article
  • Cited by 13
  • 10.3389/fimmu.2024.1335112
Integrative analysis identifies oxidative stress biomarkers in non-alcoholic fatty liver disease via machine learning and weighted gene co-expression network analysis.
  • Feb 27, 2024
  • Frontiers in Immunology
  • Haining Wang + 9 more

Non-alcoholic fatty liver disease (NAFLD) is the most common chronic liver disease globally, with the potential to progress to non-alcoholic steatohepatitis (NASH), cirrhosis, and even hepatocellular carcinoma. Given the absence of effective treatments to halt its progression, novel molecular approaches to NAFLD diagnosis and treatment are of paramount importance. We downloaded oxidative stress-related genes from the GeneCards database and retrieved NAFLD-related datasets from the GEO database. Using the Limma R package and Weighted Gene Co-expression Network Analysis (WGCNA), we identified differentially expressed genes closely associated with NAFLD. Intersecting the oxidative stress-related genes, the NAFLD-related genes, and the genes identified through WGCNA yielded 31 intersection genes. From these 31 genes shared between NAFLD and oxidative stress (OS), we identified three hub genes using three machine learning algorithms: Least Absolute Shrinkage and Selection Operator (LASSO) regression, Support Vector Machine - Recursive Feature Elimination (SVM-RFE), and RandomForest. Subsequently, a nomogram was utilized to predict the incidence of NAFLD. The CIBERSORT algorithm was employed for immune infiltration analysis, single sample Gene Set Enrichment Analysis (ssGSEA) for functional enrichment analysis, and Protein-Protein Interaction (PPI) networks to explore the relationships between the three hub genes and other intersecting genes of NAFLD and OS. The distribution of these three hub genes across six cell clusters was determined using single-cell RNA sequencing. Finally, using relevant data from the Attie Lab Diabetes Database and liver tissues from a NASH mouse model, Western Blot (WB) and Reverse Transcription Quantitative Polymerase Chain Reaction (RT-qPCR) assays were conducted, which further validated the significant roles of CDKN1B and TFAM in NAFLD.
In the course of this research, we identified 31 genes with a strong association with oxidative stress in NAFLD. Subsequent machine learning analysis and external validation pinpointed two genes, CDKN1B and TFAM, as demonstrating the closest correlation to oxidative stress in NAFLD. This investigation found two hub genes that hold potential as novel targets for the diagnosis and treatment of NAFLD, thereby offering innovative perspectives for its clinical management.

  • Research Article
  • 10.1002/jgm.70044
Exploring Potential Hub Genes and Molecular Mechanisms Linking Cardia Carcinoma With Sjögren's Syndrome Based on Comprehensive Bioinformatics Analysis and Machine Learning.
  • Sep 1, 2025
  • The journal of gene medicine
  • Meng Qian + 7 more

Cardia carcinoma (CC) is a highly heterogeneous cancer with an increasing incidence worldwide. Gastroesophageal reflux disease has been identified as a risk factor for CC, and patients with Sjögren's syndrome (SS) are often reported to have esophageal motility disorders. This study aimed to identify potential hub genes and molecular processes for CC with SS. Four datasets were obtained from the Gene Expression Omnibus (GEO) database. Differentially expressed gene (DEG) analysis and weighted gene coexpression network analysis (WGCNA) were conducted to identify shared genes between CC and SS. Functional enrichment analysis and protein-protein interaction (PPI) network construction were performed on these genes. Four machine learning algorithms, including random forest (RF), least absolute shrinkage and selection operator (LASSO), support vector machine-recursive feature elimination (SVM-RFE), and extreme gradient boosting (XGBoost), were applied to screen hub genes. Then, a nomogram predicting the risk of CC in SS patients was constructed and validated by the receiver operating characteristic (ROC) curve and calibration curve. Additionally, we analyzed the transcriptional regulatory relationships, coexpression networks, and correlations between the hub genes and immune infiltration. By intersecting DEGs and module genes identified by WGCNA, we screened 60 shared genes that were mainly enriched in cell cycle, response to xenobiotic stimulus, and p53 signaling pathways. Based on machine learning algorithms, three hub genes were identified and used to construct a nomogram with high predictive performance (the AUCs for the training and validation cohorts were 0.991 and 0.978, respectively). Furthermore, the immune infiltration results suggested that T cells, mast cells, macrophages, and B cells play an important role in both diseases, and the hub genes were significantly associated with T cells and B cells.
This study identified three hub genes (E2F3, CHIA, and SCNN1B) and established a nomogram that could effectively predict the risk of CC. The unbalanced immune response may be the common pathogenesis of these two diseases, which provides novel insights into the diagnosis and therapy of CC with SS.

  • Research Article
  • Cited by 10
  • 10.1038/s41598-022-26345-1
Integrated multiple microarray studies by robust rank aggregation to identify immune-associated biomarkers in Crohn's disease based on three machine learning methods
  • Feb 15, 2023
  • Scientific Reports
  • Zi-An Chen + 6 more

Crohn's disease (CD) is a complex autoimmune disorder presumed to be driven by complex interactions of genetic, immune, microbial and even environmental factors. Intrinsic molecular mechanisms in CD, however, remain poorly understood. The identification of novel biomarkers in CD cases based on larger samples through machine learning approaches may inform the diagnosis and treatment of diseases. A comprehensive analysis was conducted on all CD datasets of Gene Expression Omnibus (GEO); our team then used the robust rank aggregation (RRA) method to identify differentially expressed genes (DEGs) between controls and CD patients. PPI (protein‒protein interaction) network and functional enrichment analyses were performed to investigate the potential functions of the DEGs, with molecular complex detection (MCODE) identifying some important functional modules from the PPI network. Three machine learning algorithms, support vector machine-recursive feature elimination (SVM-RFE), random forest (RF), and least absolute shrinkage and selection operator (LASSO), were applied to determine characteristic genes, which were verified by ROC curve analysis and immunohistochemistry (IHC) using clinical samples. Univariable and multivariable logistic regression were used to establish a machine learning score for diagnosis. Single-sample GSEA (ssGSEA) was performed to examine the correlation between immune infiltration and biomarkers. In total, 5 datasets met the inclusion criteria: GSE75214, GSE95095, GSE126124, GSE179285, and GSE186582. Based on RRA integrated analysis, 203 significant DEGs were identified (120 upregulated genes and 83 downregulated genes), and MCODE revealed some important functional modules in the PPI network. Machine learning identified LCN2, REG1A, AQP9, CCL2, GIP, PROK2, DEFA5, CXCL9, and NAMPT; AQP9, PROK2, LCN2, and NAMPT were further verified by ROC curves and IHC in the external cohort. 
The final machine learning score was defined as [Expression level of AQP9 × (2.644)] + [Expression level of LCN2 × (0.958)] + [Expression level of NAMPT × (1.115)]. ssGSEA showed markedly elevated levels of dendritic cells and innate immune cells, such as macrophages and NK cells, in CD, consistent with the gene enrichment results that the DEGs are mainly involved in the IL-17 signaling pathway and humoral immune response. The selected biomarkers analyzed by the RRA method and machine learning are highly reliable. These findings improve our understanding of the molecular mechanisms of CD pathogenesis.
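
The score defined above is a weighted sum of three expression levels. A minimal Python sketch follows, using the coefficients quoted in the abstract. The function name, the dictionary layout, and the example expression values are illustrative assumptions; the abstract does not state how expression was normalized or what cutoff separates patients from controls, so neither is modeled here.

```python
from typing import Mapping

# Coefficients as quoted in the abstract above.
COEFFICIENTS = {"AQP9": 2.644, "LCN2": 0.958, "NAMPT": 1.115}


def ml_score(expression: Mapping[str, float]) -> float:
    """Weighted sum of the expression levels of AQP9, LCN2, and NAMPT."""
    missing = [gene for gene in COEFFICIENTS if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(weight * expression[gene] for gene, weight in COEFFICIENTS.items())


# Made-up expression levels for a single sample (not data from the study).
sample = {"AQP9": 1.0, "LCN2": 2.0, "NAMPT": 0.5}
score = ml_score(sample)
print(round(score, 4))
```

Keeping the coefficients in one mapping makes it harder for the gene list and the weights to drift apart if the model is refitted.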

  • Research Article
  • Cited by 10
  • 10.21037/atm-22-5979
Identification of immune- and autophagy-related genes and effective diagnostic biomarkers in endometriosis: a bioinformatics analysis
  • Dec 1, 2022
  • Annals of Translational Medicine
  • Xiujia Ji + 7 more

Background: To identify autophagy- and immune-related hub genes affecting the diagnosis and treatment of endometriosis. Methods: Gene expression data were downloaded from the Gene Expression Omnibus (GEO) (GSE11691 and GSE120103 for training, and GSE7305 for validation). Candidate genes were obtained by overlapping the differentially expressed genes (DEGs) and weighted gene co-expression network analysis (WGCNA) module genes with autophagy-related genes (ARGs) and immune-related genes (IRGs) separately, and hub genes were then identified using the least absolute shrinkage and selection operator (LASSO) and support vector machine recursive feature elimination (SVM-RFE). The hub genes were analyzed by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses. A hub gene-prediction model was constructed and assessed using five-fold cross-validation via five supervised machine-learning algorithms: random forest, the sequential minimal optimization (SMO), K-nearest neighbours (IBK), C4.5 decision tree (J48), and logistic regression. The area under the receiver operating characteristic curve (AUC) was adopted to assess the identification ability of characteristic genes. Results: A total of 1,116 DEGs were obtained from the training cohort, and 22 endometriosis-related IRGs were identified by overlapping the 1,116 DEGs, 3,222 module genes, and 1,793 IRGs. Meanwhile, 45 endometriosis-related ARGs were obtained (from a set of 1,928 ARGs). Subsequently, nine IRG hub genes (BST2, CCL13, CD86, CSF1, FAM3C, GREM1, ISG20, PSMB8, and S100A11) and nine ARG hub genes (GSK3A, HTR2B, RAB3GAP1, ARFIP2, BNIP3, CSF1, MAOA, PPP1R13L, and SH3GLB2) were obtained by LASSO and SVM-RFE. GO analysis indicated that the ARG hub genes responded to the regulation of autophagy and mitochondrial outer membrane permeabilization, and KEGG enrichment analysis involved serotonergic and dopaminergic synapses.
GO analysis also indicated that the IRG hub genes responded to the regulation of leukocyte proliferation and mononuclear cell migration, and KEGG analysis showed enrichment in viral protein interaction with cytokines and cytokine receptors. The AUC of the random-forest algorithm for ARGs was 0.975 in the training cohort and 0.940 in the validation cohort, and the AUC of the SMO algorithm for IRGs was 0.907 in the training cohort and 0.8 in the validation cohort. Conclusions: Seventeen hub genes are closely associated with endometriosis. These genes are potential autophagy- and immune-related biomarkers for diagnosis and treatment of endometriosis.
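
The five-fold cross-validation mentioned above can be sketched as a stratified split of sample indices. This is an illustrative standard-library Python version under stated assumptions: it does not reproduce the authors' toolchain, the function name and the 10-versus-10 label vector are invented, and the folds are assigned round-robin without shuffling so the output is deterministic.

```python
from typing import List, Sequence, Tuple


def stratified_folds(labels: Sequence[int], k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split sample indices into k (train, test) pairs, keeping class balance.

    Samples of each class are dealt round-robin into k buckets, so every
    test fold holds roughly the same share of each class.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    buckets: List[List[int]] = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        for position, index in enumerate(members):
            buckets[position % k].append(index)
    folds = []
    for f in range(k):
        test = sorted(buckets[f])
        train = sorted(i for g in range(k) if g != f for i in buckets[g])
        folds.append((train, test))
    return folds


# Hypothetical cohort: 10 controls (0) and 10 endometriosis samples (1).
labels = [0] * 10 + [1] * 10
folds = stratified_folds(labels, k=5)
for train, test in folds:
    print(len(train), len(test), sum(labels[i] for i in test))
```

In practice each model would be fitted on `train` and scored on `test`, with the five AUC values averaged. A real analysis would also shuffle within each class using a fixed seed before dealing samples into folds.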

  • Research Article
  • 10.21037/tcr-2024-2465
Integrated analysis of uterine leiomyosarcoma and leiomyoma utilizing TCGA and GEO data: a WGCNA and machine learning approach.
  • May 1, 2025
  • Translational cancer research
  • Zixin Yang + 3 more

Uterine sarcoma is a gynecological mesenchymal tumor with an elusive pathogenesis. The uterine leiomyosarcoma (LMS) is the most common subtype of uterine sarcoma. LMS is a highly aggressive tumor with a poor prognosis. The genomic landscape of LMS remains unclear. Rare cases of LMS are observed to arise from leiomyoma (LM). We conducted a study to explore the genomic relationship between LMS and LM using public microarray data from the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA). Using bioinformatics analysis tools, we would like to provide molecular insight into the pathogenesis of LMS and to discover novel predictive biomarkers for this disease. LMS and LM differentially expressed genes (DEGs) were screened by analyzing GEO datasets; GSE764, GSE68312 and GSE64763; and TCGA data. A protein-protein interaction (PPI) network was constructed, and hub genes were identified utilizing the CytoHubba plug-in from Cytoscape software. In addition, weighted gene co-expression network analysis (WGCNA) was performed to identify hub genes. We took the intersection of the hub genes generated from the PPI network and WGCNA. Subsequently, random forest (RF) and support vector machine (SVM) algorithms were used to screen for key genes as predictive biomarkers. Finally, we constructed a nomogram with these genes. A total of 37 hub genes were selected using WGCNA. A total of 245 DEGs were identified; 63 DEGs were upregulated, and 182 DEGs were downregulated. Functional enrichment analysis revealed that these genes were mainly associated with the cell cycle, extracellular matrix receptor interactions and oocyte meiosis. The final hub genes were CENPA, KIF2C, TTK, MELK and CDC20. Gene set enrichment analysis (GSEA) revealed that these genes were mostly enriched in the cell cycle, mismatch repair and amino sugar and nucleotide sugar metabolism. Tumor-infiltrating immune cell analysis indicated that these genes did not have an obvious correlation with immune cells. 
CENPA, KIF2C, TTK, MELK and CDC20 were key genes significantly associated with LMS and LM. Functional enrichment analysis and tumor-infiltrating immune cell analysis indicated that these genes might be correlated with tumor proliferation, which might shed light on the possible pathogenesis and predictive biomarkers of LMS.

  • Research Article
  • 10.1016/j.burns.2025.107413
Identification and validation of immune-related biomarkers and polarization types of macrophages in keloid based on bulk RNA-seq and single-cell RNA-seq analysis.
  • Apr 1, 2025
  • Burns : journal of the International Society for Burn Injuries
  • Yuzhu Zhang + 7 more


  • Research Article
  • 10.1186/s12967-025-07058-1
Screening and experimental study of potential biomarkers for ulcerative colitis based on weighted gene co-expression network analysis and machine learning.
  • Sep 30, 2025
  • Journal of translational medicine
  • Zepeng Chen + 3 more

Ulcerative colitis (UC) is a chronic nonspecific inflammatory intestinal disease affecting the mucosa and submucosa, characterized by continuous and diffuse active inflammation. However, its underlying pathogenesis remains unclear. This study aimed to identify potential UC biomarkers by integrating weighted gene co-expression network analysis (WGCNA) with machine learning, followed by validation in an experimental UC mouse model. The Gene Expression Omnibus database was systematically queried, and the GSE87466 dataset, comprising colonic tissues from 87 patients with UC and 21 healthy controls, was retrieved. Differentially expressed genes (DEGs) were identified and subjected to Gene Ontology and Kyoto Encyclopedia of Genes and Genomes enrichment analyses. WGCNA was used to extract UC-related DEGs. Two machine learning algorithms, the Least Absolute Shrinkage and Selection Operator (LASSO) and Support Vector Machine Recursive Feature Elimination (SVM-RFE), were used to screen potential biomarkers. These biomarkers were then validated using animal experiments. A total of 1,097 DEGs were identified. WGCNA constructed nine co-expression gene modules, with the turquoise module (520 genes) exhibiting the highest relevance to UC. LASSO and SVM-RFE analysis identified poly(ADP-ribose) polymerase family member 8 (PARP8) as a potential biomarker of UC. Immunological analysis revealed significantly higher proportions of naive B cells, activated CD4+ memory T cells, follicular helper T cells, γδT cells, M0 macrophages, M1 macrophages, activated mast cells, and neutrophils in UC samples compared to controls. PARP8 expression positively correlated with neutrophils, M1 macrophages, and activated CD4+ T cells, but negatively correlated with plasma cells. In vivo validation confirmed elevated PARP8 expression in dextran sulfate sodium-induced UC mice compared to controls.
PARP8 may contribute to UC pathogenesis via immune-related pathways and holds promise as a diagnostic and predictive biomarker.

  • Research Article
  • Cited by 2
  • 10.1016/j.intimp.2024.112899
Exploration of the shared diagnostic genes and mechanisms between periodontitis and primary Sjögren’s syndrome by integrated comprehensive bioinformatics analysis and machine learning
  • Aug 13, 2024
  • International Immunopharmacology
  • Shaoru Wang + 4 more


  • Research Article
  • Cited by 1
  • 10.1007/s12672-025-02137-7
Identification of GJC1 as a novel diagnostic marker for papillary thyroid carcinoma using weighted gene co-expression network analysis and machine learning algorithm
  • Mar 17, 2025
  • Discover Oncology
  • Jingshu Zhang + 1 more

Background: The incidence of papillary thyroid carcinoma (PTC) is increasing annually, causing both physical and psychological pressure on patients. Therefore, early recognition and specific interventions for PTC are crucial. The objective of this study is to explore novel diagnostic markers and precise intervention targets for PTC. Methods: Relevant datasets were collected from the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases and analyzed by weighted gene co-expression network analysis (WGCNA). Enrichment analysis was performed on differentially expressed genes (DEGs) using Gene Ontology (GO), Disease Ontology (DO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Gene Set Enrichment Analysis (GSEA). Subsequently, three machine learning algorithms, Least Absolute Shrinkage and Selection Operator (LASSO), Support Vector Machine Recursive Feature Elimination (SVM-RFE), and Random Forest (RF), were used to identify the core genes. Finally, receiver operating characteristic (ROC) curves were used to analyze the clinical diagnostic value of the core genes. Results: We found a total of 11,194 DEGs derived from the TCGA and GEO datasets that are primarily enriched in extracellular matrix (ECM) and inflammation-related pathways, such as the ECM receptor interaction, cell adhesion molecules (CAMs), tumor necrosis factor (TNF) signaling, and nucleotide-binding oligomerization domain (NOD)-like receptor signaling pathways. Further analysis of the core genes identified by the protein–protein interaction network, using three machine learning algorithms, discovered three intersecting genes: GJC1, KLHL4, and NOL4. Of these, GJC1 has good clinical diagnostic ability, which was verified using both the GEO (area under the ROC curve (AUC) = 0.982) and TCGA databases (AUC = 0.840). Conclusions: GJC1 is highly expressed in PTC. Therefore, it is considered a potential biomarker and is expected to become a new target for PTC gene therapy.
However, this finding still needs to be supported and verified by more clinical data.

  • Research Article
  • Cited by 9
  • 10.1016/j.bbrep.2023.101595
Identification of novel biomarkers and immune infiltration characteristics of ischemic stroke based on comprehensive bioinformatic analysis and machine learning
  • Dec 7, 2023
  • Biochemistry and Biophysics Reports
  • Shiyu Hu + 4 more


  • Research Article
  • Cited by 13
  • 10.3389/fimmu.2022.1072526
Identification of the osteoarthritis signature gene PDK1 by machine learning and its regulatory mechanisms on chondrocyte autophagy and apoptosis
  • Jan 6, 2023
  • Frontiers in Immunology
  • Jinzhi Meng + 5 more

Background: Osteoarthritis (OA) is a degenerative joint disease frequently diagnosed in the elderly and middle-aged population. However, its specific pathogenesis has not been clarified. This study aimed to identify biomarkers for OA diagnosis and elucidate their potential mechanisms for restoring OA-dysregulated autophagy and inhibiting chondrocyte apoptosis in vitro. Material and methods: Two publicly available transcriptomic mRNA OA-related datasets (GSE10575 and GSE51588) were explored for biomarker identification by least absolute shrinkage and selection operator (LASSO) regression, weighted gene co-expression network analysis (WGCNA), and support vector machine recursive feature elimination (SVM-RFE). We applied the GSE32317 and GSE55457 cohorts to validate the markers’ efficacy for diagnosis. The connections of markers to chondrocyte autophagy and apoptosis in OA were also comprehensively explored in vitro using molecular biology approaches, including qRT-PCR and Western blot. Results: We identified 286 differentially expressed genes (DEGs). These DEGs were enriched in the ECM-receptor interaction and PI3K/AKT signaling pathway. After external cohort validation and protein-protein interaction (PPI) network construction, PDK1 was finally identified as a diagnostic marker for OA. The pharmacological properties of BX795-downregulated PDK1 expression inhibited LPS-induced chondrocyte inflammation and apoptosis and rescued OA-dysregulated autophagy. Additionally, the phosphorylation of the mediators associated with the MAPK and PI3K/AKT pathways was significantly downregulated, indicating the regulatory function of PDK1 in apoptosis and autophagy via MAPK and PI3K/AKT-associated signaling pathways in chondrocytes.
A significantly positive association between the PDK1 expression and Neutrophils, Eosinophils, Plasma cells, and activated CD4 memory T cells, as well as an evident negative correlation between T cells follicular helper and CD4 naive T cells, were detected in the immune cell infiltration analysis. Conclusions: PDK1 can be used as a diagnostic marker for OA. Inhibition of its expression can rescue OA-dysregulated autophagy and inhibit apoptosis by reducing the phosphorylation of PI3K/AKT and MAPK signaling pathways.

  • Research Article
  • Cited by 3
  • 10.3389/fmolb.2024.1425143
Screening and identification of the hub genes in severe acute pancreatitis and sepsis.
  • Sep 19, 2024
  • Frontiers in molecular biosciences
  • Si-Jiu Yang + 3 more

Severe acute pancreatitis (SAP) is characterized by acute onset, rapid progression, and a complicated clinical course. Sepsis is a common complication of SAP with a high mortality rate. This research aimed to identify the shared hub genes and key pathways of SAP and sepsis, and to explore their functions, molecular mechanism, and clinical value. We obtained SAP and sepsis datasets from the Gene Expression Omnibus (GEO) database and employed differential expression analysis and weighted gene co-expression network analysis (WGCNA) to identify the shared differentially expressed genes (DEGs). Functional enrichment analysis and protein-protein interaction (PPI) analysis were applied to the shared DEGs to reveal underlying mechanisms in SAP-associated sepsis. Machine learning methods including random forest (RF), least absolute shrinkage and selection operator (LASSO) and support vector machine recursive feature elimination (SVM-RFE) were adopted for screening hub genes. Then, receiver operating characteristic (ROC) curve and nomogram were applied to evaluate the diagnostic performance. Finally, immune cell infiltration analysis was conducted to examine the immunological landscape of sepsis. We obtained a total of 123 DEGs by intersecting the differential expression results with the important WGCNA module. The Gene Ontology (GO) analysis showed that the shared genes were significantly enriched in regulation of inflammatory response. The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that the shared genes were primarily involved in immunoregulation through the NOD-like receptor (NLR) signaling pathway. The three machine learning methods identified two overlapping genes (ARG1, HP) as shared hub genes for SAP and sepsis. The immune infiltration results showed that immune cells played a crucial part in the pathogenesis of sepsis and that the two hub genes were substantially associated with immune cells, which may be a therapeutic target.
ARG1 and HP may affect SAP and sepsis by regulating inflammation and immune responses, shedding light on potential future diagnostic and therapeutic approaches for SAP-associated sepsis.

  • Research Article
  • Cited by 3
  • 10.1155/2023/2250772
Identification of Biomarkers Associated with Heart Failure Caused by Idiopathic Dilated Cardiomyopathy Using WGCNA and Machine Learning Algorithms
  • Apr 25, 2023
  • International Journal of Genomics
  • Mengyi Sun + 1 more

Background: The genetic factors and pathogenesis of idiopathic dilated cardiomyopathy-induced heart failure (IDCM-HF) have not been understood thoroughly; there is a lack of specific diagnostic markers and treatment methods for the disease. Hence, we aimed to identify the mechanisms of action at the molecular level and potential molecular markers for this disease. Methods: Gene expression profiles of IDCM-HF and non-heart failure (NF) specimens were acquired from the Gene Expression Omnibus (GEO) database. We then identified the differentially expressed genes (DEGs) and analyzed their functions and related pathways by using “Metascape”. Weighted gene co-expression network analysis (WGCNA) was utilized to search for key module genes. Candidate genes were identified by intersecting the key module genes identified via WGCNA with DEGs and further screened via the support vector machine-recursive feature elimination (SVM-RFE) method and the least absolute shrinkage and selection operator (LASSO) algorithm. Finally, the biomarkers were validated, their diagnostic efficacy was evaluated by the area under the curve (AUC) value, and their differential expression in the IDCM-HF and NF groups was further confirmed using an external database. Results: We detected 490 genes exhibiting differential expression between IDCM-HF and NF specimens from the GSE57338 dataset, with most of them being concentrated in the extracellular matrix (ECM) of cells related to biological processes and pathways. After screening, 13 candidate genes were identified. Aquaporin 3 (AQP3) and cytochrome P450 2J2 (CYP2J2) showed high diagnostic efficacy in the GSE57338 and GSE6406 datasets, respectively. In comparison to the NF group, AQP3 was significantly down-regulated in the IDCM-HF group, while CYP2J2 was significantly up-regulated. Conclusion: As far as we know, this is the first study that combines WGCNA and machine learning algorithms to screen for potential biomarkers of IDCM-HF.
Our findings suggest that AQP3 and CYP2J2 could be used as novel diagnostic markers and treatment targets of IDCM-HF.

  • Research Article
  • 10.3389/fimmu.2025.1677275
CHN1 as a potential predictive genetic biomarker for atopic dermatitis-related depression
  • Nov 17, 2025
  • Frontiers in Immunology
  • Yifei Wang + 4 more

Introduction: The comorbidity of atopic dermatitis (AD) and depression has garnered increased attention in recent years, yet the immunopathological mechanisms underlying this connection remain unclear. To bridge this gap, the study aimed to uncover the immune regulatory networks and identify key genetic markers involved in the comorbidity of depression in AD. Methods: We performed RNA sequencing on peripheral blood mononuclear cells (PBMCs) collected from 20 AD patients with and without depression. By integrating bioinformatics analyses with machine learning, we conducted weighted gene co-expression network analysis (WGCNA), functional enrichment analysis, and employed machine learning models of least absolute shrinkage and selection operator (LASSO) and support vector machine-recursive feature elimination (SVM-RFE). Additionally, validation was carried out in an independent cohort of 20 participants to confirm the expression of the identified potential pivotal gene. Results: A total of 394 differentially expressed genes (DEGs) were identified in AD patients with depression as compared to their non-depressed counterparts. WGCNA pinpointed a pink module encompassing 83 genes strongly linked to depressive symptoms. Functional enrichment analysis highlighted biological processes related to neurotransmitter uptake and the negative regulation of T-helper (Th) 17 cell differentiation. Furthermore, the LASSO and SVM-RFE models consistently identified CHN1 as a potential pivotal gene upregulated in AD patients with depression. The expression level of CHN1 demonstrated positive correlation with Th2 and Th17 cytokine signatures, as well as with the Hospital Anxiety and Depression Scale-Depression (HADS-D) score, and the Eczema Area and Severity Index (EASI).
Validation in an independent cohort of 20 participants confirmed the significant upregulation of CHN1 in depressed AD patients. Discussion: Together, these findings reveal a previously unrecognized immunoinflammatory axis underlying AD-associated depression, and shed light on CHN1 as a potential molecular bridge connecting peripheral inflammation and neuropsychiatric manifestations.

  • Research Article
  • 10.1080/10255842.2025.2510366
Exploring the genetic characteristics of overweight-related osteoarthritis using machine learning
  • May 23, 2025
  • Computer Methods in Biomechanics and Biomedical Engineering
  • Zhaohui Jiang + 6 more

This investigation employed a synergistic approach integrating bioinformatics and machine learning methodologies to scrutinize overweight-related osteoarthritis characteristic genes (OROCGs). The research team procured gene expression profiles from osteoarthritis (OA) patients’ cartilage and meniscus, derived from GEO database datasets GSE98918 and GSE117999. These profiles underwent meticulous examination through differential gene expression (DEG) identification, weighted gene co-expression network analysis (WGCNA), least absolute shrinkage and selection operator (LASSO), support vector machine - recursive feature elimination (SVM-RFE), and single-sample gene set enrichment analysis (ssGSEA), culminating in the identification of six OROCGs. Furthermore, the study unveiled an augmented presence of myeloid-derived suppressor cells (MDSCs) and B cells in overweight-associated OA. The investigators formulated a diagnostic model encompassing pivotal genes related to DNA replication, chronic inflammation, and epigenetics, including CHTH18, CYSLTR2, HSF4, KDM6B, NR4A2, and UCKL1. The model’s diagnostic precision was corroborated through receiver operating characteristic (ROC) curves and a nomogram applied to the test set and validation set GSE129147. This model efficaciously delineates the expression alterations and immune infiltration linked to overweight-related OA, thereby nominating these genes as prospective candidates for immunomodulatory therapeutic interventions.
