Single‐Cell Transcriptomics and Integrated Bioinformatic Analysis Reveal Critical Biomarkers and Immune Infiltration Characteristics in Osteoarthritis
Background: Osteoarthritis (OA) is a complex, progressive joint disease characterized by cartilage degradation and inflammation. Traditional bulk tissue analyses have limited our understanding of the cellular diversity within OA tissues. Methods: This study employed single-cell RNA sequencing (scRNA‐seq) and integrated bioinformatic analyses to investigate the cellular composition and molecular pathways involved in OA. Publicly available datasets were analyzed to identify differentially expressed genes (DEGs) and enriched pathways. Key genes, including NR4A2, BMP1, and AVPR1A, were selected for further analysis. Molecular docking studies were conducted to explore their interactions with two identified compounds. Additionally, immune infiltration characteristics were analyzed using gene set variation analysis (GSVA), and their correlation with key OA‐associated genes was assessed. Results: We analyzed cartilage samples from OA and normal individuals (GSE220243) and identified eight distinct chondrocyte subpopulations, with significant enrichment of the TNF, TGF‐β, and PI3K–Akt signaling pathways. Differential gene expression analysis of GSE114007 identified 2247 DEGs, including 26 key OA‐associated drug targets, such as NR4A2, BMP1, and AVPR1A, which demonstrated strong diagnostic potential (AUC > 0.70) across multiple cohorts. Immune infiltration analysis revealed significant correlations between these key genes and immune cell subsets, highlighting their roles in the inflammatory microenvironment of OA. Additionally, molecular docking studies suggested that bexarotene has a favorable binding affinity for NR4A2, BMP1, and AVPR1A, making it a promising therapeutic candidate. Conclusion: Our findings provide new insights into the molecular landscape of OA, offering valuable biomarkers and therapeutic targets for future OA interventions.
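The diagnostic threshold quoted above (AUC > 0.70) can be illustrated with a minimal sketch. It assumes a single gene's expression is used directly as the classifier score and computes the AUC as the fraction of (case, control) pairs that are ranked correctly; the function name `roc_auc` and the expression values are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical expression values for one gene: 1 = OA, 0 = normal.
expression = [5.1, 6.3, 4.8, 7.0, 3.9, 4.2, 5.0, 3.5]
is_oa = [1, 1, 1, 1, 0, 0, 0, 0]

auc = roc_auc(expression, is_oa)
print(f"AUC = {auc:.4f}")  # 15 of 16 pairs are ordered correctly
```

The pairwise form is quadratic in the number of samples, which is acceptable for cohort sizes typical of expression datasets; a rank-based formulation would give the same value for larger inputs.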
- Research Article
- 10.1038/s41598-025-97623-x
- Apr 19, 2025
- Scientific Reports
Major depressive disorder (MDD) is a multifactorial disorder involving genetic and environmental factors, with unclear pathogenesis. This study aims to explore the pathogenic pathways of MDD and their relationship with immune responses, and to discover potential targets using bioinformatics methods. We first applied gene set variation analysis (GSVA) and seven different immune infiltration algorithms to the GSE98793 dataset to determine the differences in signaling pathways, metabolic pathways, and immune cell infiltration between MDD patients and healthy controls. Differentially expressed genes between MDD patients and controls were obtained from five datasets (GSE98793, GSE32280, GSE38206, GSE39653, and GSE52790), and 113 machine learning methods were employed to construct MDD diagnostic models. Based on the constructed MDD diagnostic models, MDD patients were divided into high-risk and low-risk groups. GSVA and immune microenvironment analyses were conducted to investigate the differences between the two groups. Furthermore, potential drugs and therapeutic targets for the high-risk MDD group were explored to provide new insights and directions for the precise treatment of MDD. GSVA and immune infiltration results indicate that patients with MDD differ from healthy controls in biological processes, signaling pathways, metabolic processes, and immune cells. To investigate the functions and biological significance of differentially expressed genes in MDD patients, we performed GO and KEGG enrichment analyses on the differentially expressed genes from the five datasets (GSE98793, GSE32280, GSE38206, GSE39653, and GSE52790). By comparing the enrichment results across the five datasets, we found that the cell-killing signaling pathway was consistently enriched in all datasets, suggesting that this pathway may play a crucial role in the pathogenesis of MDD. 
The random forest algorithm (AUC = 0.788) was selected as the optimal algorithm from 113 machine learning algorithms, leading to the development of a robust diagnostic model for MDD and highlighting the important role of NPL in MDD. By dividing MDD patients into high-risk and low-risk subgroups based on diagnostic model scores, pathway enrichment and immunological analyses further demonstrated that high-risk MDD is associated with increased levels of reactive oxygen species, inflammation, and numbers of T cells and B cells. Through GSEA scoring, five upregulated pathways in the high-risk MDD group were identified, and multiple potential drugs such as Mibefradil, LY364947, ZLN005, STA-5326, and vemurafenib were screened. Patients with MDD show differences in signaling pathways, metabolic pathways, and immune mechanisms. By constructing an MDD diagnostic model, we predicted the key genes of MDD and the characteristic pathways associated with a higher risk of MDD. This provides new insights for risk stratification and offers new perspectives for the clinical application of precision immunotherapy and drug development.
- Research Article
62
- 10.3389/fimmu.2022.1074271
- Nov 17, 2022
- Frontiers in Immunology
Background: Crohn’s disease (CD) is a heterogeneous, immune-mediated, chronic and recurrent intestinal inflammation caused by a variety of etiologies. Cuproptosis is a newly discovered form of programmed cell death that seems to contribute to the advancement of a variety of illnesses. Consequently, the major purpose of our research was to examine the role of cuproptosis-related genes in CD. Methods: We obtained two CD datasets from the Gene Expression Omnibus (GEO) database, and immune cell infiltration analysis was performed to investigate immune cell dysregulation in CD. Based on differentially expressed genes (DEGs) and the cuproptosis gene set, differentially expressed genes of cuproptosis (CuDEGs) were found. Then, candidate hub cuproptosis-associated genes were found using machine learning methods. Subsequently, using 437 CD samples, we explored two distinct subclusters based on hub cuproptosis-related genes. Gene Ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment, gene set variation analysis (GSVA), and immune infiltration analysis were also used to assess the distinct roles of the subclusters. Results: Overall, 25 CuDEGs were identified, including ABCB6, BACE1, FDX1, GLS, LIAS, MT1M, and PDHA1. Most CuDEGs were expressed at lower levels in CD samples and were negatively related to immune cell infiltration. Through the machine learning algorithms, a seven-gene cuproptosis signature was identified and two cuproptosis-related subclusters were defined. Cluster-specific differentially expressed genes were found only in one cluster, and functional analysis revealed that they were involved in several immune response processes. GSVA showed significant positive enrichment of immune-related pathways in cluster A and of metabolic pathways in cluster B. In addition, an immune infiltration study indicated substantial variation in immunity across different groups. 
Immunological scores were higher and immune infiltration was more prevalent in cluster A. Conclusion: According to the current research, the cuproptosis phenomenon occurs in CD and is correlated with immune cell infiltration and metabolic activity. This information indicates that cuproptosis may promote CD progression by inducing immunological response and metabolic dysfunction. This research has opened new avenues for investigating the causes of CD and developing potential therapeutic targets for the disease.
- Research Article
42
- 10.3389/fcvm.2021.624714
- Feb 1, 2021
- Frontiers in Cardiovascular Medicine
Objectives: Idiopathic pulmonary arterial hypertension (IPAH) is a rare but severe lung disorder, which may lead to heart failure and early mortality. However, little is known about the etiology of IPAH. Thus, the present study aimed to establish the differentially expressed genes (DEGs) between IPAH and normal tissues, which may serve as potential prognostic markers in IPAH. Furthermore, we utilized a versatile computational method, CIBERSORT, to identify immune cell infiltration characteristics in IPAH. Materials and Methods: The GSE117261 and GSE48149 datasets were obtained from the Gene Expression Omnibus database. The GSE117261 dataset was adopted to screen DEGs between IPAH and the control groups with the criteria of |log2 fold change| ≥ 1 and adjusted P < 0.05, and to further explore their potential biological functions via Gene Ontology analysis, Kyoto Encyclopedia of Genes and Genomes Pathway analysis, and Gene Set Enrichment Analysis. Moreover, support vector machine (SVM)-recursive feature elimination and the least absolute shrinkage and selection operator regression model were applied jointly to identify the best potential biomarkers. Then we built a regression model based on these selected variables. The GSE48149 dataset was used as a validation cohort to appraise the diagnostic efficacy of the SVM classifier by receiver operating characteristic (ROC) analysis. Finally, immune infiltration was explored by CIBERSORT in IPAH. We further analyzed the correlation between potential biomarkers and immune cells. Results: In total, 75 DEGs were identified; 40 were downregulated, and 35 were upregulated. Functional enrichment analysis found significant enrichment in heme binding, inflammation, chemokines, cytokine activity, and abnormal glycometabolism. 
HBB, RNASE2, S100A9, and IL1R2 were identified as the best potential biomarkers with an area under the ROC curve (AUC) of 1 (95% CI = 0.937–1.000, specificity = 100%, sensitivity = 100%) in the discovery cohort and 1 (95% CI = 0.805–1.000, specificity = 100%, sensitivity = 100%) in the validation cohort. Moreover, immune infiltration analysis by CIBERSORT showed a higher level of CD8+ T cells, resting memory CD4+ T cells, gamma delta T cells, M1 macrophages, and resting mast cells, as well as a lower level of naïve CD4+ T cells, monocytes, M0 macrophages, activated mast cells, and neutrophils in IPAH compared with the control group. In addition, HBB, RNASE2, S100A9, and IL1R2 were correlated with immune cells. Conclusion: HBB, RNASE2, S100A9, and IL1R2 were identified as potential biomarkers to discriminate IPAH from the control. There was an obvious difference in immune infiltration between patients with IPAH and normal controls.
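The DEG screening criteria stated in this abstract (|log2 fold change| ≥ 1 and adjusted P < 0.05) amount to a simple filter over a differential-expression result table. The sketch below assumes the table is already available as a mapping from gene symbol to (log2 fold change, adjusted P value); the function name `select_degs`, the placeholder gene names, and the numbers are hypothetical and do not come from GSE117261.

```python
def select_degs(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Split genes into up- and downregulated DEGs.

    results maps gene symbol -> (log2 fold change, adjusted P value).
    A gene is kept when |log2FC| >= lfc_cutoff and adjusted P < padj_cutoff.
    """
    up, down = [], []
    for gene, (log2fc, padj) in results.items():
        if padj < padj_cutoff and abs(log2fc) >= lfc_cutoff:
            (up if log2fc > 0 else down).append(gene)
    return sorted(up), sorted(down)


# Hypothetical differential-expression results (placeholder gene names).
toy_results = {
    "GENE_A": (1.8, 0.001),   # kept, upregulated
    "GENE_B": (-1.2, 0.03),   # kept, downregulated
    "GENE_C": (0.4, 0.0001),  # dropped: fold change too small
    "GENE_D": (2.5, 0.20),    # dropped: not significant after adjustment
    "GENE_E": (-1.0, 0.049),  # kept: exactly at the fold-change boundary
}

up, down = select_degs(toy_results)
print("upregulated:", up)
print("downregulated:", down)
```

The inclusive fold-change boundary and strict P-value boundary mirror the "≥ 1" and "< 0.05" wording of the abstract; how the adjusted P values were computed is not stated there and is outside this sketch.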
- Research Article
15
- 10.3389/fimmu.2023.992765
- Jan 26, 2023
- Frontiers in Immunology
Introduction: Recurrent implantation failure (RIF) is a frustrating challenge because the cause is unknown. The current study aims to identify differentially expressed genes (DEGs) in the endometrium on the basis of immune cell infiltration characteristics between RIF patients and healthy controls, as well as to investigate potential prognostic markers in RIF. Methods: The GSE103465 and GSE111974 datasets were obtained from the Gene Expression Omnibus database to screen DEGs between RIF and control groups. Gene Ontology analysis, Kyoto Encyclopedia of Genes and Genomes Pathway analysis, Gene Set Enrichment Analysis, and protein-protein interaction analysis were performed to investigate potential biological functions and signaling pathways. CIBERSORT was used to describe the level of immune infiltration in RIF, and flow cytometry was used to confirm the top two most abundant immune cells detected. Results: In total, 122 downregulated and 66 upregulated DEGs were obtained between RIF and control groups. Six immune-related hub genes involved in Wnt/β-catenin signaling and Notch signaling were discovered. The ROC curves revealed that three of the six identified genes (AKT1, PSMB8, and PSMD10) had potential diagnostic value for RIF. Finally, we used cMap analysis to identify potential therapeutic or induced compounds for RIF, among which fulvestrant (estrogen receptor antagonist), bisindolylmaleimide-ix (CDK and PKC inhibitor), and JNK-9L (JNK inhibitor) were thought to influence the pathogenic process of RIF. Furthermore, our findings revealed the level of immune infiltration in RIF by highlighting three signaling pathways (Wnt/β-catenin signaling, Notch signaling, and immune response) and three potential diagnostic DEGs (AKT1, PSMB8, and PSMD10). Conclusion: Importantly, our findings may contribute to the scientific basis for several potential therapeutic agents to improve endometrial receptivity.
- Research Article
7
- 10.3389/fimmu.2024.1456083
- Sep 16, 2024
- Frontiers in Immunology
Introduction: Heart failure (HF) and kidney failure (KF) are closely related conditions that often coexist, posing a complex clinical challenge. Understanding the shared mechanisms between these two conditions is crucial for developing effective therapies. Methods: This study employed transcriptomic analysis to unveil molecular signatures and novel biomarkers for both HF and KF. A total of 2869 shared differentially expressed genes (DEGs) were identified in patients with HF and KF compared to healthy controls. Functional enrichment analysis was performed to explore the common mechanisms underlying these conditions. A protein-protein interaction (PPI) network was constructed, and machine learning algorithms, including Random Forest (RF), Support Vector Machine-Recursive Feature Elimination (SVM-RFE), and Least Absolute Shrinkage and Selection Operator (LASSO), were used to identify key signature genes. These genes were further analyzed using Gene Set Variation Analysis (GSVA) and Gene Set Enrichment Analysis (GSEA), with their diagnostic values validated in both training and validation sets. Molecular docking studies were conducted. Additionally, immune cell infiltration and correlation analyses were performed to assess the relationship between immune responses and the identified biomarkers. Results: The functional enrichment analysis indicated that the common mechanisms are associated with cellular homeostasis, cell communication, cellular replication, inflammation, and extracellular matrix (ECM) production, with the PI3K-Akt signaling pathway being notably enriched. The PPI network revealed two key protein clusters related to the cell cycle and inflammation. CDK2 and CCND1 were identified as signature genes for both HF and KF. Their diagnostic value was validated in both training and validation sets. Additionally, docking studies with CDK2 and CCND1 were performed to evaluate potential drug candidates. 
Immune cell infiltration and correlation analyses characterized the immune microenvironment and showed that CDK2 and CCND1 are associated with immune responses in HF and KF. Discussion: This study identifies CDK2 and CCND1 as novel biomarkers linking cell cycle regulation and inflammation in heart and kidney failure. These findings offer new insights into the molecular mechanisms of HF and KF and present potential targets for diagnosis and therapy.
- Research Article
- 10.3760/cma.j.cn112148-20250606-00418
- Dec 24, 2025
- Zhonghua xin xue guan bing za zhi
Objective: To investigate key genes influencing vascular calcification through bioinformatics analysis and experimental validation. Methods: Three vascular calcification datasets (GSE159832, GSE229679, and GSE37558) were obtained from the Gene Expression Omnibus database. Subsequently, Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and conventional gene set enrichment analysis (GSEA) were performed on the common differentially expressed genes (DEGs). For in vitro validation, a vascular smooth muscle cell calcification model was established by stimulating mouse primary vascular smooth muscle cells with high phosphate and calcium chloride (Pi+CaCl2). Cells were divided into a control group and a Pi+CaCl2 group. To investigate the role of TK1, cells were transfected with TK1-targeting siRNA (siTK1) or control siRNA (siControl) prior to Pi+CaCl2 stimulation, creating siControl+Pi+CaCl2 and siTK1+Pi+CaCl2 groups. The association between key DEGs and vascular calcification was assessed at the protein and mRNA levels using Western blot and quantitative real-time PCR, respectively. Changes in the phosphorylation of the downstream effector, AKT (p-AKT/AKT), were also measured. Results: A total of 2275, 449, and 381 DEGs were identified from the three vascular calcification datasets (GSE159832, GSE229679, and GSE37558), respectively. Two common DEGs, phosphoserine aminotransferase 1 and thymidine kinase 1 (TK1), were identified across all datasets. GO enrichment analysis revealed that TK1 was significantly enriched in pathways related to ribosome biogenesis, assembly, and rRNA processing and maturation. GSEA-KEGG analysis indicated significant enrichment in the PI3K-AKT signaling pathway, pathways in cancer, neurodegenerative diseases, cytoskeleton, and smooth muscle contraction. 
Conventional GSEA of TK1 further confirmed significant enrichment in pathways including dynein, epithelial tight junctions, axon guidance, and vascular smooth muscle contraction pathways. At the experimental level, both protein and mRNA expression of TK1, along with the p-AKT/AKT ratio, were significantly lower in the Pi+CaCl2 group compared to the control group (all P<0.05). Furthermore, compared to the siControl+Pi+CaCl2 group, the siTK1+Pi+CaCl2 group exhibited decreased expression of differentiation markers, increased expression of calcification markers, and a further reduced p-AKT/AKT ratio (all P<0.05). Conclusion: Integrated bioinformatics and cellular validation demonstrate a correlation between TK1 expression and vascular calcification, suggesting a potential protective role for TK1 in this pathological process.
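Identifying DEGs shared by all three datasets, as described in this abstract, reduces to a set intersection over the per-dataset DEG lists. The following is a minimal sketch under that assumption; the function name `common_degs` and the placeholder gene lists are hypothetical and do not reproduce the actual DEG lists of GSE159832, GSE229679, or GSE37558.

```python
def common_degs(*deg_lists):
    """Return the genes present in every supplied DEG list."""
    if not deg_lists:
        return set()
    shared = set(deg_lists[0])
    for genes in deg_lists[1:]:
        shared &= set(genes)  # keep only genes also seen in this dataset
    return shared


# Hypothetical per-dataset DEG lists (placeholder gene names).
dataset_1 = ["GENE_1", "GENE_2", "GENE_3", "GENE_4"]
dataset_2 = ["GENE_2", "GENE_4", "GENE_5"]
dataset_3 = ["GENE_4", "GENE_2", "GENE_6", "GENE_7"]

shared = common_degs(dataset_1, dataset_2, dataset_3)
print(sorted(shared))
```

Because the intersection shrinks with every added dataset, a small overlap (two genes in the study above) is expected even when the individual DEG lists are large.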
- Research Article
4
- 10.1186/s12935-021-02409-6
- Dec 1, 2021
- Cancer Cell International
Background: Skin cutaneous melanoma (SKCM) is the most common skin tumor with high mortality. The unfavorable outcome of SKCM calls for the discovery of prognostic biomarkers for accurate therapy. The present study aimed to explore novel prognosis-related signatures of SKCM and determine the significance of immune cell infiltration in this pathology. Methods: Four gene expression profiles (GSE130244, GSE3189, GSE7553, and GSE46517) of SKCM and normal skin samples were retrieved from the GEO database. Differentially expressed genes (DEGs) were then screened, and the feature genes were identified by LASSO regression and the Boruta algorithm. Survival analysis was performed to filter the potential prognostic signature, and GEPIA was used for preliminary validation. The area under the receiver operating characteristic curve (AUC) was obtained to evaluate discriminatory ability. Gene Set Variation Analysis (GSVA) was performed, and the composition of the immune cell infiltration in SKCM was estimated using CIBERSORT. Finally, paraffin-embedded specimens of primary SKCM and normal skin tissues were collected, and the signature was validated by fluorescence in situ hybridization (FISH) and immunohistochemistry (IHC). Results: In total, 823 DEGs and 16 feature genes were screened. IFI16 was identified as the signature associated with overall survival of SKCM with a high discriminatory ability (AUC > 0.9 for all datasets). GSVA suggested that IFI16 might be involved in apoptosis and ultraviolet response in SKCM, and the association of IFI16 with immune cell infiltration was evaluated. Finally, FISH and IHC both validated the differential expression of IFI16 in SKCM. Conclusions: Our comprehensive analysis identified IFI16 as a signature associated with overall survival and immune infiltration of SKCM, which may play a critical role in the occurrence and development of SKCM.
- Research Article
- 10.1007/s12031-023-02179-y
- Jan 18, 2024
- Journal of molecular neuroscience : MN
Autism spectrum disorder (ASD) is a prevalent neurodevelopmental disorder with a broad spectrum of symptoms and prognoses. Effective therapy requires understanding this variability. The cognitive and immunological development of children with ASD may depend on iron homeostasis. This study employs a machine learning model that focuses on iron metabolism hub genes to identify ASD subgroups and describe immune infiltration patterns. A total of 97 control and 148 ASD samples were obtained from the GEO database. Intersecting the differentially expressed genes (DEGs) with an iron metabolism gene collection yielded 25 genes. Unsupervised cluster analysis determined molecular subgroups in individuals with ASD based on these 25 genes related to iron metabolism. We performed Gene Ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment, gene set variation analysis (GSVA), and immune infiltration analysis to compare the effects of the iron metabolism subtypes. We employed machine learning to identify subtype-predicting hub genes and utilized both training and validation sets to assess gene subtype prediction accuracy. ASD can be classified into two iron metabolism molecular clusters. Metabolic enrichment pathways differed between clusters. Immune infiltration analysis showed that the clusters differed immunologically. Cluster 2 had higher immunological scores and more immune cells, indicating a stronger immune response. Machine learning screening identified SELENBP1 and CAND1 as important genes in the iron metabolism signaling pathway in ASD. These genes are expressed in the brain and have AUC values over 0.8, implying significant predictive power. The present study introduces iron metabolism signaling pathway indicators to predict ASD subtypes. ASD is linked to immune cell infiltration and iron metabolism disorders.
- Research Article
3
- 10.1186/s12864-025-11857-7
- Jul 21, 2025
- BMC Genomics
Background: The aim of this study was to investigate the effects of age and sex on the composition of fatty acids, amino acids, and vitamins in the longissimus thoracis et lumborum (LTL) of Huai goats. Methods: The longissimus thoracis et lumborum tissues of eight Huai goats were collected postslaughter and analyzed for amino acids, fatty acids, and vitamins by gas chromatography‒mass spectrometry (GC‒MS) and standard methods. RNA sequencing (RNA-Seq) was conducted to identify differentially expressed genes (DEGs) across different ages and sexes. Metabolite profiling was performed by liquid chromatography‒mass spectrometry (LC‒MS) to detect and quantify metabolites. Results: This study revealed significant differences in amino acid and fatty acid profiles between male and female Huai goats. Compared to females, males presented higher levels of several essential amino acids (e.g., threonine, lysine, and phenylalanine) and nonessential amino acids (e.g., glutamic acid, aspartic acid, and glycine). Additionally, males had higher levels of unsaturated fatty acids such as linoleic acid (C18:2n6c) and arachidonic acid (C20:4n6) compared to those of females. The vitamin B1 content was higher in 6-month-old goats than in 24-month-old goats. Transcriptomic analysis revealed 294 DEGs in the sex group comparison and 580 DEGs in the age group comparison. Key genes involved in collagen production (COL1A1, COL1A2, and COL4A2) and amino acid metabolism (PHGDH and PSAT1) were significantly upregulated in male goats. Functional enrichment analysis revealed significant enrichment in collagen-related Gene Ontology (GO) terms and pathways related to amino acid metabolism. Metabolomic analysis revealed 79 differentially abundant metabolites in the sex group comparison and 69 in the age group comparison, with significant enrichment in pathways related to glycerophospholipid metabolism, oxidative phosphorylation, and amino acid metabolism. 
Notably, several amino acids and their metabolites, such as lysine, glutamic acid, and serine, exhibited significant differences between male and female goats, which was consistent with the transcriptomic findings. Conclusion: The findings indicate that sex and age significantly influence the chemical composition and flavor profile of Huai goat meat. The identified DEGs and differentially abundant metabolites provide a molecular basis for understanding the variations in meat quality and flavor. These results highlight the potential for optimizing meat production practices to increase the quality and flavor of Huai goat meat. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-025-11857-7.
- Research Article
2
- 10.1530/joe-24-0362
- Mar 3, 2025
- The Journal of endocrinology
Thyroid eye disease (TED) features immune infiltration and metabolic dysregulation. Understanding these processes and identifying potential biomarkers are crucial for improving diagnosis and treatment. To this end, immune cell infiltration was analyzed and gene set variation analysis (GSVA) was conducted on the GSE58331 dataset to identify differences between TED and normal tissues. Differentially expressed genes were identified using GSE58331 and GSE105149. Subsequently, a prediction model (TEDML) was developed by combining 113 machine learning algorithms to identify key biomarkers. In addition, enrichment analyses were performed to understand biological functions and pathways involved in TED, and drug sensitivity analyses were conducted to identify potential therapeutic agents. Immune infiltration analysis revealed higher levels of CD4+ Tem, CD4+ Tcm, NKT, NK cells and neutrophils in TED patients compared to controls, with lower levels of macrophages M1 and M2. GSVA indicated significant enrichment in immune-related processes and metabolic pathways. The TEDML model, constructed from the Stepglm[forward] algorithm, demonstrated high accuracy (area under curve of 1 on the training set, 0.893 in validation set), identifying six key genes (CSF3R, ALDH1A1, MXRA5, VSIG4, DPP4 and MDH1). Drug sensitivity analysis suggested that azathioprine and methylprednisolone might be effective at different stages of TED, with CSF3R as a potential therapeutic target. Overall, the TEDML model is accurate and reliable, and the identification of CSF3R as a key biomarker and its correlation with drug sensitivity offers new insights into targeted therapy for TED.
- Research Article
- 10.1371/journal.pone.0319737
- Mar 25, 2025
- PloS one
Systemic lupus erythematosus (SLE) is a complex autoimmune disease that has significant impacts on patients' quality of life and poses a substantial economic burden on society. This study aimed to elucidate the molecular mechanisms underlying SLE by analyzing glucocorticoid-related genes (GRGs) expression profiles. We examined the expression profiles of GRGs in SLE and performed consensus clustering analysis to identify stable patient clusters. We also identified differentially expressed genes (DEGs) within the clusters and between SLE patients and healthy controls. We conducted Gene Set Enrichment Analysis (GSEA) and Gene Set Variation Analysis (GSVA) to investigate biological functional differences, and we also conducted CIBERSORTx to estimate the number of immune cells. Furthermore, we utilized least absolute shrinkage and selection operator (LASSO) regression and Random Forest (RF) algorithms to screen for hub genes. We then validated the expression of these hub genes and constructed nomograms for further validation. Moreover, we employed single-sample Gene Set Enrichment Analysis (ssGSEA) to analyze immune infiltration. We also constructed an RNA-binding protein (RBP)-mRNA network and conducted drug sensitivity analysis along with molecular docking studies. Patients with SLE were divided into two subclusters, revealing a total of 2,681 DEGs. Among these, 1,458 genes were upregulated, while 1,223 were downregulated in cluster_1. GSVA showed significant changes in the pathways associated with cluster_1. Immune infiltration analysis revealed high levels of monocyte in all samples, with greater infiltration of various immune cells in cluster_1. A comparison of SLE patients to control subjects identified 269 DEGs, which were enriched in several pathways. Hub genes, including PTX3, DYSF and F2R, were selected through LASSO and RF methods, resulting in a well-performing diagnostic model. 
Drug sensitivity and docking studies suggested F2R as a potential new therapeutic target. PTX3, DYSF and F2R are potentially linked to SLE and are proposed as new molecular markers for its onset and progression. Additionally, monocyte infiltration plays a crucial role in advancing SLE.
- Research Article
1
- 10.21037/tcr-24-838
- Feb 1, 2025
- Translational cancer research
Breast cancer (BC) is a common tumor among women and is a heterogeneous disease with many subtypes. Each subtype shows different clinical presentations, disease trajectories and prognoses, and different responses to neoadjuvant therapy; thus, a new and universal prognostic biomarker for BC patients is urgently needed. Our goal is to identify a novel prognostic molecular biomarker that can accurately predict the outcome of all BC subtypes and guide their clinical management. Utilizing data from The Cancer Genome Atlas (TCGA), we analyzed differential gene expression and patient clinical data. Weighted gene coexpression network analysis (WGCNA), Cox univariate regression and least absolute shrinkage and selection operator (LASSO) analysis were used to construct a prognostic model; the differential expression of the core genes in this model was validated via real-time quantitative polymerase chain reaction (RT-qPCR), and the reliability of the predictive model was validated in both an internal cohort and a BC patient dataset from the Gene Expression Omnibus (GEO) database. Further studies, such as gene set variation analysis (GSVA) and gene set enrichment analysis (GSEA), were performed to investigate the enrichment of signaling pathways. The CIBERSORT algorithm was used to estimate immune infiltration and tumor mutation burden (TMB), and drug sensitivity analysis was performed to evaluate the treatment response. A total of 1,643 differentially expressed genes were identified. After WGCNA and Cox regression combined with LASSO analysis, 15 genes were identified by screening and used to establish a prognostic gene signature. Further analysis revealed that the epithelial-mesenchymal transition (EMT) pathway gene signature was enriched in these genes. Each patient was assigned a risk score, and according to the median risk score, patients were classified into a high-risk group or a low-risk group. 
The prognosis of the low-risk group was better than that of the high-risk group (P<0.01), and analyses of two independent GEO validation cohorts yielded similar results. Furthermore, a nomogram was constructed and found to perform well in predicting prognosis. GSVA revealed that the EMT pathway, transforming growth factor β (TGF-β) signaling pathway and PI3K-Akt signaling pathway genes were enriched in the high-risk group, and the Wnt-β-catenin signaling pathway, DNA repair pathway and P53 pathway gene sets were enriched in the low-risk group. GSEA revealed genes related to TGF-β signaling and the PI3K-Akt signaling pathways were enriched in the high-risk group. CIBERSORT demonstrated that the low-risk group had greater infiltration of antitumor immune cells. The TMB and drug sensitivity results suggested that immunotherapy and chemotherapy are likely to be more effective in the low-risk group. We established a new EMT pathway-related prognostic gene signature that can be used to effectively predict BC prognosis and treatment response.
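The risk stratification described in this abstract (a per-patient risk score, then a split at the median) can be sketched as follows. The sketch assumes the score is a weighted sum of signature-gene expression values, which is the usual form for a LASSO-Cox signature but is not spelled out in the abstract; assigning scores equal to the median to the low-risk group is also an assumption. The coefficients, gene names, and patient values are hypothetical.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Weighted sum of signature-gene expression (assumed linear score)."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def split_by_median(scores):
    """Split patients at the median score; ties go to the low-risk group."""
    cutoff = median(scores.values())
    high = sorted(p for p, s in scores.items() if s > cutoff)
    low = sorted(p for p, s in scores.items() if s <= cutoff)
    return cutoff, high, low


# Hypothetical two-gene signature and four patients.
coefficients = {"G1": 0.5, "G2": -0.25}
patients = {
    "P1": {"G1": 2.0, "G2": 4.0},
    "P2": {"G1": 4.0, "G2": 0.0},
    "P3": {"G1": 1.0, "G2": 2.0},
    "P4": {"G1": 3.0, "G2": 2.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
cutoff, high_risk, low_risk = split_by_median(scores)
print(f"median cutoff = {cutoff}, high = {high_risk}, low = {low_risk}")
```

A median split guarantees roughly balanced groups, which is convenient for survival comparisons, at the cost of a cutoff that is specific to the cohort in which it was derived.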
- Research Article
- 10.2147/pgpm.s488143
- Nov 19, 2024
- Pharmacogenomics and Personalized Medicine
Background: Chronic kidney disease (CKD) involves complex immune dysregulation and altered gene expression profiles. This study investigates immune cell infiltration, differential gene expression, and pathway enrichment in CKD patients to identify key diagnostic biomarkers through machine learning methods. Methods: We assessed immune cell infiltration and immune microenvironment scores using the xCell algorithm. Differentially expressed genes (DEGs) were identified using the limma package. Gene Set Enrichment Analysis (GSEA) and Gene Set Variation Analysis (GSVA) were performed to evaluate pathway enrichment. Machine learning techniques (LASSO and Random Forest) pinpointed diagnostic genes. A nomogram model was constructed and validated for diagnostic prediction. Spearman correlation explored associations between key genes and immune cell recruitment. Results: The CKD group exhibited significantly altered immune cell infiltration and increased immune microenvironment scores compared to the normal group. We identified 2335 DEGs, including 124 differentially expressed immune-related genes. GSEA highlighted significant enrichment of inflammatory and immune pathways in the high immune score (HIS) subgroup, while GSVA indicated upregulation of immune responses and metabolic processes in HIS. Machine learning identified four key diagnostic genes: RGS1, IL4I1, NR4A3, and SOCS3. Validation in an independent dataset (GSE96804) and clinical samples confirmed their diagnostic potential. The nomogram model integrating these genes demonstrated high predictive accuracy. Spearman correlation revealed positive associations between the key genes and various immune cells, indicating their roles in immune modulation and CKD pathogenesis. Conclusion: This study provides a comprehensive analysis of immune alterations and gene expression profiles in CKD. The identified diagnostic genes and the constructed nomogram model offer potent tools for CKD diagnosis. 
The immunomodulatory roles of RGS1, IL4I1, NR4A3, and SOCS3 warrant further investigation as potential therapeutic targets in CKD.
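As a rough illustration of the Spearman correlation step mentioned in this abstract, the following minimal sketch computes the coefficient as a Pearson correlation on tie-averaged ranks. The function names and the six sample values are hypothetical and are not taken from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical values for six samples (illustration only, not study data).
gene_expression = [2.1, 3.4, 1.8, 5.0, 4.2, 2.9]
immune_score = [0.10, 0.25, 0.08, 0.40, 0.33, 0.30]
rho = spearman_rho(gene_expression, immune_score)
print(f"Spearman rho = {rho:.3f}")
```

A positive coefficient here would correspond to the kind of positive gene-to-immune-cell association the abstract reports; a real analysis would also need a significance test and multiple-testing correction.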
- Research Article
3
- 10.1097/md.0000000000032861
- Feb 10, 2023
- Medicine
Previous studies have shown that asthma is a risk factor for lung cancer, but the mechanisms involved remain unclear. We attempted to further explore the association between asthma and non-small cell lung cancer (NSCLC) via bioinformatics analysis. We obtained GSE143303 and GSE18842 from the GEO database. Lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) groups were downloaded from the TCGA database. Based on the differentially expressed genes (DEGs) in asthma and NSCLC, we determined common DEGs by constructing a Venn diagram. Enrichment analysis was used to explore the common pathways of asthma and NSCLC. A protein-protein interaction (PPI) network was constructed to screen hub genes. KM survival analysis was performed to screen prognostic genes in the LUAD and LUSC groups. A Cox model was constructed based on hub genes and validated internally and externally. Tumor Immune Estimation Resource (TIMER) was used to evaluate the association of prognostic gene models with the tumor microenvironment (TME) and immune cell infiltration. A nomogram model was constructed by combining prognostic genes and clinical features. A total of 114 common DEGs were obtained from the asthma and NSCLC data, and enrichment analysis showed that the significantly enriched pathways were mainly inflammatory pathways. Five hub genes were screened as a key prognostic gene model for asthma progression to LUAD, and internal and external validation led to consistent conclusions. In addition, the risk score of the 5 hub genes could be used as a tool to assess the TME and immune cell infiltration. The nomogram model constructed by combining the 5 hub genes with clinical features was accurate for LUAD. These five hub genes enrich our understanding of the potential mechanisms by which asthma contributes to the increased risk of lung cancer.
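The common-DEG step described in this abstract amounts to intersecting two DEG lists, which is what a two-set Venn diagram visualizes. The sketch below assumes each DEG table is a mapping from gene symbol to log2 fold change. The gene names, values, and the optional same-direction filter are hypothetical additions for illustration and are not described in the abstract.

```python
def common_degs(deg_a, deg_b, same_direction=False):
    """Return sorted gene symbols present in both DEG tables.

    deg_a, deg_b: dicts mapping gene symbol -> log2 fold change.
    same_direction: if True, keep only genes whose fold changes share a sign.
    """
    shared = set(deg_a) & set(deg_b)  # the overlap region of the Venn diagram
    if same_direction:
        shared = {g for g in shared if (deg_a[g] > 0) == (deg_b[g] > 0)}
    return sorted(shared)


# Placeholder gene symbols and fold changes (illustration only).
asthma_degs = {"GENE_A": 1.2, "GENE_B": -0.8, "GENE_C": 2.0, "GENE_D": -1.5}
nsclc_degs = {"GENE_B": -1.1, "GENE_C": -0.9, "GENE_D": -2.3, "GENE_E": 1.7}

overlap = common_degs(asthma_degs, nsclc_degs)
concordant = common_degs(asthma_degs, nsclc_degs, same_direction=True)
print(overlap)
print(concordant)
```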
- Research Article
7
- 10.1080/15257770.2024.2310044
- Jan 27, 2024
- Nucleosides, Nucleotides & Nucleic Acids
Nonalcoholic fatty liver disease (NAFLD) is a spectrum of chronic liver disease. The condition ranges from isolated excessive hepatocyte triglyceride accumulation and steatosis (nonalcoholic fatty liver, NAFL), to hepatic triglyceride accumulation plus inflammation and hepatocyte injury (nonalcoholic steatohepatitis, NASH), and finally to hepatic fibrosis and cirrhosis and/or hepatocellular carcinoma (HCC). However, the mechanism driving this process is not yet clear. Sample microarray data were obtained from the GEO database: 6 healthy liver samples, 74 nonalcoholic hepatitis samples, 8 liver cirrhosis samples, and 53 liver cancer samples were extracted from the GSE164760 dataset. We used the GEO2R tool to analyze differentially expressed genes (DEGs) across disease progression (nonalcoholic hepatitis vs. healthy, cirrhosis vs. nonalcoholic hepatitis, and liver cancer vs. cirrhosis) and a necroptosis gene set. Gene set variation analysis (GSVA) was used to evaluate the association between biological pathways and gene features. The STRING database and Cytoscape software were used to establish and visualize protein-protein interaction (PPI) networks, to identify the key functional modules of DEGs, and to draw a transcription factor-target gene regulatory network. Gene Ontology (GO) and KEGG pathway enrichment analyses of DEGs were also performed. Additionally, immune infiltration patterns were analyzed using CIBERSORT, and the correlation between immune cell-type abundance and DEG expression was investigated. We further screened and obtained a total of 152 intersecting DEGs from the three groups. Twenty-three key genes were obtained through the MCODE plugin. Transcription factors regulating the common DEGs were obtained from the hTFtarget database, and a TF-target network diagram was drawn. The PPI network contained 118 nodes, 251 edges, and 4 clusters. The key genes of the four modules were METAP2, RPL14, SERBP1, EEF2; HR4A1; CANX; ARID1A, UBE2K.
METAP2, RPL14, SERBP1, and EEF2 were identified as the key hub genes. CREB1 was identified as the hub TF interacting with those genes by taking the intersection of potential TFs. The key gene alterations were genetic mutations, with mutation frequencies of 1.7% in EEF2, 0.8% in METAP2, and 0.3% in RPL14. Finally, we found that the immune infiltrating cells with the most significant differences among the three groups were Tregs and M2 and M0 macrophages. We identified four hub genes, METAP2, RPL14, SERBP1, and EEF2, as most closely associated with the progression from NASH to cirrhosis to HCC. It is beneficial to examine and understand the interaction between hub DEGs and potential regulatory molecules in this process. This knowledge may provide a novel theoretical foundation for the development of diagnostic biomarkers and gene-related therapy targets.
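To give a feel for how hub genes can be ranked in a PPI network, the sketch below counts node degree from an edge list. This is a deliberately simpler stand-in, not the MCODE module detection the abstract reports. The function name `top_hubs` and the protein labels and edges are hypothetical.

```python
from collections import Counter


def top_hubs(edges, n):
    """Rank nodes of an undirected network by degree and return the top n.

    edges: iterable of (node_a, node_b) pairs. Duplicate edges and
    self-loops are ignored. Ties are broken alphabetically so the
    result is deterministic.
    """
    unique = {frozenset(e) for e in edges if e[0] != e[1]}
    degree = Counter()
    for edge in unique:
        for node in edge:
            degree[node] += 1
    ranked = sorted(degree, key=lambda node: (-degree[node], node))
    return ranked[:n]


# Placeholder interaction list (illustration only, not STRING output).
ppi_edges = [
    ("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
    ("P2", "P3"), ("P4", "P5"), ("P2", "P1"),  # last pair duplicates P1-P2
]
hubs = top_hubs(ppi_edges, 3)
print(hubs)
```

Degree is only one of several centrality measures; density-based clustering such as MCODE can select different genes than a plain degree ranking.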