Machine Learning Approaches to Influential ROI Selection in Parkinson’s Disease: A Comparative Analysis of LASSO, Recursive Feature Elimination, and Random Forest

TL;DR

This study compares LASSO, RF, and RFE machine learning methods for selecting influential brain regions from fMRI data to distinguish Parkinson’s disease patients from controls. LASSO achieved the highest classification performance with an AUC of 0.96, sensitivity and specificity of 0.92, and identified 9 key ROIs, outperforming RF and RFE significantly (P < 0.001), highlighting its potential for early PD detection.

Abstract

Background: Identifying key brain regions implicated in Parkinson’s disease (PD) can enhance both diagnostic accuracy and our understanding of disease mechanisms. Objectives: This study aims to compare three machine learning methods, namely least absolute shrinkage and selection operator (LASSO), random forest (RF), and recursive feature elimination (RFE), for selecting influential regions of interest (ROIs) from functional magnetic resonance imaging (fMRI) data to distinguish PD patients from healthy controls. Methods: This retrospective analysis used fMRI data from 15 patients with PD and 15 matched healthy controls, sourced from an open-access database. Three machine learning approaches were applied to identify significant ROIs associated with PD. The selected ROIs were subsequently evaluated using logistic regression models, assessing classification performance through area under the curve (AUC), sensitivity, and specificity. A comparative analysis of model performance was conducted using DeLong’s test. Results: The LASSO identified 9 ROIs, RF selected 10, and RFE identified 4 key ROIs. Logistic regression models constructed with these ROIs yielded AUC values of 0.96, 0.94, and 0.88 for LASSO, RF, and RFE, respectively. Both sensitivity and specificity were highest for LASSO (0.92 for both). DeLong’s test revealed statistically significant differences among the methods (P < 0.001), with LASSO outperforming RF and RFE. Conclusions: This study demonstrates that LASSO, RFE, and RF machine learning techniques are promising for identifying key brain regions, showing preliminary alignment with clinical observations. Focusing on patients with PD, it highlights regions associated with executive function, memory, motor skills, and sensory processing. Early detection of abnormal connectivity in these areas may potentially inform exploratory preventive strategies for PD.
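The three evaluation metrics named in the Methods (AUC, sensitivity, and specificity) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy labels and scores are hypothetical, and the AUC uses the pairwise (Mann-Whitney) formulation rather than any particular library routine.

```python
from typing import Sequence, Tuple


def auc_rank(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (1 = patient, 0 = control); not taken from the study.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

toy_auc = auc_rank(toy_labels, toy_scores)
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores)
print(toy_auc, toy_sens, toy_spec)
```

With only 15 patients and 15 controls, each misclassified subject shifts sensitivity or specificity by about 0.07, so the metrics are best read alongside their uncertainty.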

Similar Papers
  • Research Article
  • Citations: 2
  • 10.3390/cancers16112163
Differential Diagnosis of Prostate Cancer Grade to Augment Clinical Diagnosis Based on Classifier Models with Tuned Hyperparameters.
  • Jun 6, 2024
  • Cancers
  • Saleh T Alanezi + 3 more

We developed a novel machine-learning algorithm to augment the clinical diagnosis of prostate cancer utilizing first- and second-order texture analysis metrics in a novel application of machine-learning radiomics analysis. We successfully discriminated between significant prostate cancers and non-tumor regions and provided accurate prediction between Gleason score cohorts with statistical sensitivity of 0.82, 0.81 and 0.91 in three separate pathology classifications. Tumor heterogeneity and prediction of the Gleason score were quantified using two feature selection approaches and two separate classifiers with tuned hyperparameters. A total of 71 patients were analyzed in this study. Multiparametric MRI, incorporating T2WI and ADC maps, was used to derive radiomics features. Recursive feature elimination (RFE), the least absolute shrinkage and selection operator (LASSO), and two classification approaches, incorporating a support vector machine (SVM) (with randomized search) and random forest (RF) (with grid search), were utilized to differentiate between non-tumor regions and significant cancer while also predicting the Gleason score. In T2WI images, the RFE feature selection approach combined with RF and SVM classifiers outperformed LASSO with SVM and RF classifiers. The best performance was achieved by combining LASSO and SVM into a model that used both T2WI and ADC images. This model had an area under the curve (AUC) of 0.91. Radiomic features computed from ADC and T2WI images were used to predict three groups of Gleason score using two kinds of feature selection methods (RFE and LASSO), RF and SVM classifier models with tuned hyperparameters. Using combined sequences (T2WI and ADC map images) and combined radiomics (1st and GLCM features), LASSO feature selection with RF predicted G3 with the highest sensitivity, at an AUC of 0.92. To predict G3 for single sequence (T2WI images) using GLCM features, LASSO with SVM achieved the highest sensitivity with an AUC of 0.92.

  • Research Article
  • Citations: 21
  • 10.1371/journal.pone.0296625
Predicting and identifying factors associated with undernutrition among children under five years in Ghana using machine learning algorithms.
  • Feb 13, 2024
  • PLOS ONE
  • Eric Komla Anku + 1 more

Undernutrition among children under the age of five is a major public health concern, especially in developing countries. This study aimed to use machine learning (ML) algorithms to predict undernutrition and identify its associated factors. Secondary data analysis of the 2017 Multiple Indicator Cluster Survey (MICS) was performed using R and Python. The main outcomes of interest were undernutrition (stunting: height-for-age (HAZ) < -2 SD; wasting: weight-for-height (WHZ) < -2 SD; and underweight: weight-for-age (WAZ) < -2 SD). Seven ML algorithms were trained and tested: linear discriminant analysis (LDA), logistic model, support vector machine (SVM), random forest (RF), least absolute shrinkage and selection operator (LASSO), ridge regression, and extreme gradient boosting (XGBoost). The ML models were evaluated using the accuracy, confusion matrix, and area under the curve (AUC) receiver operating characteristics (ROC). In total, 8564 children were included in the final analysis. The average age of the children was 926 days, and the majority were females. The weighted prevalence rates of stunting, wasting, and underweight were 17%, 7%, and 12%, respectively. The accuracies of all the ML models for wasting were (LDA: 84%; Logistic: 95%; SVM: 92%; RF: 94%; LASSO: 96%; Ridge: 84%, XGBoost: 98%), stunting (LDA: 86%; Logistic: 86%; SVM: 98%; RF: 88%; LASSO: 86%; Ridge: 86%, XGBoost: 98%), and for underweight were (LDA: 90%; Logistic: 92%; SVM: 98%; RF: 89%; LASSO: 92%; Ridge: 88%, XGBoost: 98%). The AUC values of the wasting models were (LDA: 99%; Logistic: 100%; SVM: 72%; RF: 94%; LASSO: 99%; Ridge: 59%, XGBoost: 100%), for stunting were (LDA: 89%; Logistic: 90%; SVM: 100%; RF: 92%; LASSO: 90%; Ridge: 89%, XGBoost: 100%), and for underweight were (LDA: 95%; Logistic: 96%; SVM: 100%; RF: 94%; LASSO: 96%; Ridge: 82%, XGBoost: 82%). Age, weight, length/height, sex, region of residence and ethnicity were important predictors of wasting, stunting and underweight. 
The XGBoost model was the best model for predicting wasting, stunting, and underweight. The findings showed that different ML algorithms could be useful for predicting undernutrition and identifying important predictors for targeted interventions among children under five years in Ghana.
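The three outcome definitions in this abstract each reduce to a z-score falling below -2 SD. A minimal sketch, assuming the z-scores have already been computed against a growth reference; the function name and the example values are hypothetical, not from the study.

```python
from typing import Dict


def classify_undernutrition(
    haz: float, whz: float, waz: float, cutoff: float = -2.0
) -> Dict[str, bool]:
    """Flag stunting, wasting, and underweight using the < -2 SD definitions."""
    return {
        "stunting": haz < cutoff,      # height-for-age z-score
        "wasting": whz < cutoff,       # weight-for-height z-score
        "underweight": waz < cutoff,   # weight-for-age z-score
    }


# Hypothetical child: short for age, normal weight-for-height, low weight-for-age.
example_flags = classify_undernutrition(haz=-2.5, whz=-1.0, waz=-2.1)
print(example_flags)
```

Because the comparison is strict, a child exactly at -2 SD is not flagged, matching the "< -2 SD" wording in the abstract.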

  • Research Article
  • Citations: 48
  • 10.1007/s00330-020-06768-y
Classification of pulmonary lesion based on multiparametric MRI: utility of radiomics and comparison of machine learning methods.
  • Mar 28, 2020
  • European Radiology
  • Xinhui Wang + 4 more

We develop and validate a radiomics model based on multiparametric magnetic resonance imaging (MRI) in the classification of the pulmonary lesion and identify optimal machine learning methods. This retrospective analysis included 201 patients (143 malignancies, 58 benign lesions). Radiomics features were extracted from multiparametric MRI, including T2-weighted imaging (T2WI), T1-weighted imaging (T1WI), and apparent diffusion coefficient (ADC) map. Three feature selection methods, including recursive feature elimination (RFE), t test, and least absolute shrinkage and selection operator (LASSO), and three classification methods, including linear discriminant analysis (LDA), support vector machine (SVM), and random forest (RF) were used to distinguish benign and malignant pulmonary lesions. Performance was compared by AUC, sensitivity, accuracy, precision, and specificity. Analysis of performance differences in three randomly drawn cross-validation sets verified the stability of the results. For most single MR sequences or combinations of multiple MR sequences, RFE feature selection method with SVM classifier had the best performance, followed by RFE with RF. The radiomics model based on multiple sequences showed a higher diagnostic accuracy than single sequence for every machine learning method. Using RFE with SVM, the joint model of T1WI, T2WI, and ADC showed the highest performance with AUC = 0.88 ± 0.02 (sensitivity 83%; accuracy 82%; precision 91%; specificity 79%) in test set. Quantitative radiomics features based on multiparametric MRI have good performance in differentiating lung malignancies and benign lesions. The machine learning method of RFE with SVM is superior to the combination of other feature selection and classifier methods.
• Radiomics approach has the potential to distinguish between benign and malignant pulmonary lesions.
• Radiomics model based on multiparametric MRI has better performance than single-sequence models.
• The machine learning methods RFE with SVM perform best in the current cohort.

  • Research Article
  • Citations: 2
  • 10.1186/s12891-025-08619-7
Screening risk factors for the occurrence of wedge effects in intramedullary nail fixation for intertrochanteric fractures in older people via machine learning and constructing a prediction model: a retrospective study
  • Apr 22, 2025
  • BMC Musculoskeletal Disorders
  • Zhe Xu + 7 more

Purpose: The wedge effect (V-effect) is a common complication in intramedullary nailing surgery for intertrochanteric fractures and can significantly affect postoperative outcomes. The purpose of this study was to screen risk factors for the intraoperative V-effect in intertrochanteric fractures and to develop a clinical prediction model. Methods: A total of 319 patients (77 patients who developed V-effects) from China were randomly divided into a training set (n = 223) and a validation set (n = 96) at a ratio of 7:3. The variables were screened via 3 machine learning methods, including least absolute shrinkage and selection operator (LASSO) regression, the Boruta algorithm, and recursive feature elimination (RFE). Variables that appeared in the three machine learning methods were included in multivariate logistic regression to construct predictive models. Spearman correlation analysis was used to exclude covariance between variables. Restricted cubic splines (RCSs) were used to analyze the relationships among femoral lateral wall thickness, BMI, and the V-effect. The differentiation, calibration and clinical applicability of the model were assessed, and the reasonability of the model was analyzed. Results: Machine learning identified 8 variables that appeared in these 3 machine learning methods, and the covariance between these 8 variables was excluded (r < 0.6). BMI, surgical experience, a lesser trochanteric fracture, the thickness of the lateral wall, the insertion point, bone density, fracture classification, and holiday surgery were found to be risk factors for the occurrence of the V-effect via multivariate logistic regression. The RCS analysis revealed that the lateral wall thickness, BMI, and occurrence of the V-effect were linearly related. The final predictive model had good differentiation, calibration and clinical applicability, and it had better predictive efficacy than the other models did. Conclusion: This study employed three machine learning variable selection methods (the LASSO, RFE, and Boruta algorithms) to construct a V-effect predictive model. The model enables orthopedic surgeons to better understand the risk factors associated with the V-effect and provides a reference for surgeons to implement appropriate measures to reduce the incidence of the V-effect.
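Two steps from this abstract are easy to sketch: the random 7:3 split and keeping only variables selected by all three methods. The code below is an illustration under stated assumptions, not the study's pipeline: the function names, the seed, and the per-method variable lists are hypothetical (the abstract reports only the final consensus variables, not each method's selection).

```python
import random
from typing import Iterable, List, Sequence, Tuple


def split_train_validation(
    ids: Sequence[int], train_fraction: float = 0.7, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle the ids reproducibly and split them into training and validation sets."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def consensus_features(*selections: Iterable[str]) -> List[str]:
    """Return the variables chosen by every selection method, sorted by name."""
    sets = [set(s) for s in selections]
    return sorted(set.intersection(*sets))


# 319 patients split 7:3, as described in the abstract.
train_ids, validation_ids = split_train_validation(list(range(319)))

# Hypothetical selections, for illustration only.
lasso_vars = ["bmi", "bone_density", "age"]
boruta_vars = ["bmi", "bone_density", "insertion_point"]
rfe_vars = ["bone_density", "bmi", "sex"]
shared_vars = consensus_features(lasso_vars, boruta_vars, rfe_vars)
print(len(train_ids), len(validation_ids), shared_vars)
```

Taking the intersection is a conservative choice: a variable must survive all three selectors, which trades some recall for stability of the final logistic model.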

  • Research Article
  • 10.30476/ijms.2025.105971.4207
Radiomics-Driven Machine Learning Models for Diagnosis of Pancreatic Adenocarcinoma.
  • Mar 1, 2026
  • Iranian journal of medical sciences
  • Amin Talebi + 5 more

Pancreatic adenocarcinoma is one of the most aggressive and lethal cancers, with a poor prognosis primarily due to late-stage diagnosis. Improving the accuracy of pancreatic cancer diagnosis is crucial for enhancing survival outcomes, yet the sensitivity of conventional diagnostic methods remains a significant challenge. This study aims to evaluate the effectiveness of radiomics features extracted from Computed Tomography (CT) imaging, combined with machine learning models, for the detection of pancreatic adenocarcinoma. A retrospective dataset from Baqiyatallah Hospital, Tehran, Iran (2024) of 100 participants (50 with pancreatic adenocarcinoma (primarily stages II-III) and 50 healthy controls) was used. CT images were acquired with a three-phase protocol, and radiomics features were extracted using 3D Slicer software. Three classifiers, Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF), were employed, with feature selection methods including Recursive Feature Elimination (RFE), Mutual Information (MI), and Least Absolute Shrinkage and Selection Operator (LASSO). Model performance was assessed using accuracy, precision, sensitivity, F1 score, and area under the curve (AUC). The SVM classifier with LASSO feature selection achieved the highest performance, with an accuracy of 0.83 and an AUC of 0.89. LR and RF also demonstrated strong results, with LASSO providing the best feature selection for both classifiers. SHAP analysis revealed that textural features such as gray-level-non-uniformity and run-length-non-uniformity were the most important drivers for distinguishing pancreatic cancer from normal tissue. Radiomics-based machine learning models show promise for improving the diagnosis of pancreatic adenocarcinoma. The combination of LASSO and powerful classifiers such as SVM, LR, and RF offers a robust framework for non-invasive, accurate diagnostic tools.

  • Research Article
  • Citations: 1
  • 10.1002/brb3.70770
A Comprehensive Framework for Parkinson's Disease Detection Using Spiral Drawings and Advanced Machine Learning Techniques.
  • Aug 1, 2025
  • Brain and behavior
  • Mohamed J Saadh + 9 more

This study aims to create a reliable and scalable framework for detecting Parkinson's disease (PD) using spiral drawings. It integrates advanced machine learning techniques to improve diagnostic accuracy and practical application in clinical settings. Spiral drawing data were collected from a comprehensive dataset, including samples from both Parkinson's patients and healthy individuals. Three deep learning models (ResNet50, VGG16, and EfficientNetB0) were used to extract detailed patterns from the drawings. To enhance model performance, four feature selection techniques were applied: Principal Component Analysis (PCA), Recursive Feature Elimination (RFE), Least Absolute Shrinkage and Selection Operator (LASSO), and ANOVA. Six different classifiers (Support Vector Machine [SVM], Random Forest [RF], Multi-Layer Perceptron [MLP], XGBoost, CatBoost, and voting classifiers) were tested. The system's diagnostic accuracy was measured using four metrics: accuracy, sensitivity, F1-score, and AUC-ROC. Heatmaps and ROC curves were created to visualize the results. The models achieved high classification performance with different configurations. For example, ResNet50 with PCA and MLP reached the highest accuracy (98%) and AUC-ROC (97%). Similarly, SVM with PCA achieved accuracy (92%) and AUC-ROC (98%). For VGG16, combining LASSO with XGBoost resulted in high F1-scores (90%) and AUC-ROC (93%), while the voting classifiers with PCA achieved an AUC-ROC of 98%. EfficientNetB0 combined with RFE and XGBoost delivered exceptional accuracy (98%) with robust overall metrics. CatBoost with LASSO achieved balanced performance, showing high sensitivity (89%) and AUC-ROC (96%). Ensemble methods, like voting classifiers, consistently provided strong AUC-ROC values but showed variability in accuracy and sensitivity compared to individual classifiers like MLP and SVM. 
The study demonstrated that combining advanced techniques for feature extraction, selection, and classification can significantly improve PD detection accuracy. Future research should focus on integrating multiple data sources and exploring real-time applications to enhance scalability and clinical utility.

  • Research Article
  • 10.1164/ajrccm.2025.211.abstracts.a5683
Machine Learning-Based Prediction of Clinical Improvement in COVID-19 Pneumonia Patients at Hospital Admission: A Secondary Analysis of a Randomized Clinical Trial
  • May 1, 2025
  • American Journal of Respiratory and Critical Care Medicine
  • P.L Silva + 7 more

Background: Predicting clinical improvement in COVID-19 patients following hospital admission is crucial for optimal resource allocation. Machine learning can help identify patients likely to improve based on real-world data. In this study, we applied two approaches—the least absolute shrinkage and selection operator (LASSO) and combiROC—to select predictive variables available at hospital admission that could forecast clinical improvement after 7 days. Methods: In this secondary analysis of the placebo group from a previous modified intention-to-treat (mITT) randomized controlled trial (NCT04561219) in COVID-19 patients, we evaluated clinical, laboratory, and blood markers at hospital admission to predict clinical improvement after 7 days. Clinical improvement was defined as an increase of at least 2 points on the World Health Organization (WHO) scale. Machine learning methods, including LASSO and combiROC, were used to identify the most predictive variables. The optimal threshold for different marker combinations was determined using the Youden criterion. After establishing this threshold, we compared all combinations based on the highest area under the curve (AUC) and accuracy in predicting clinical improvement. AUCs were compared using DeLong's algorithm. Results: Overall, 203 patients were included in the analysis, and they were divided into clinical improvement (n=154) and no clinical improvement (n=49). The three predictive variables identified by LASSO—SaO2, hematocrit, and IL-13—demonstrated high sensitivity at 98% [95% CI: 92% – 100%] but low specificity at 26% [95% CI: 10% – 48%] for predicting clinical improvement. In contrast, the combiROC method, which selected additional variables (CTACK, Hb, HGF, hematocrit, IL-3, PDGF-BB, RANTES, SaO2), achieved a more balanced sensitivity of 82% [95% CI: 69% – 91%] and specificity of 74% [95% CI: 49% – 91%] for clinical improvement. 
The accuracy between LASSO and combiROC was similar at 82% and 80%, respectively, as were the AUCs of their ROC curves [0.704, 95% CI: 0.571 – 0.837 for LASSO, and 0.823, 95% CI: 0.708 – 0.937 for combiROC; p=0.185]. Conclusion: In hospitalized COVID-19 pneumonia patients, LASSO and combiROC analyses identified variables with comparable accuracy and AUCs for predicting clinical improvement. LASSO, using only SaO2, hematocrit, and IL-13, achieved high sensitivity but low specificity, whereas combiROC, with a broader variable selection, offered balanced sensitivity and specificity for predicting improvement.
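The Youden criterion mentioned in the Methods picks the cut-off that maximizes J = sensitivity + specificity - 1. A minimal sketch follows; the function name and the toy labels and scores are hypothetical and unrelated to the trial data.

```python
from typing import Sequence, Tuple


def youden_threshold(
    labels: Sequence[int], scores: Sequence[float]
) -> Tuple[float, float]:
    """Return (threshold, J) maximizing Youden's J; scores >= threshold are called positive."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    best_t, best_j = None, float("-inf")
    for t in sorted(set(scores)):  # every observed score is a candidate cut-off
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:  # strict comparison keeps the lowest threshold on ties
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical toy data (1 = improved, 0 = not improved).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
best_threshold, best_j = youden_threshold(toy_labels, toy_scores)
print(best_threshold, best_j)
```

Applying the same formula to the operating points reported above gives J of about 0.24 for LASSO (0.98 + 0.26 - 1) and about 0.56 for combiROC (0.82 + 0.74 - 1), which is one way to express the "more balanced" behaviour the abstract describes.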

  • Research Article
  • Citations: 25
  • 10.1089/met.2019.0104
Logistic LASSO and Elastic Net to Characterize Vitamin D Deficiency in a Hypertensive Obese Population.
  • Jan 13, 2020
  • Metabolic Syndrome and Related Disorders
  • Rafael Garcia-Carretero + 6 more

Aim: The primary objective of our research was to compare the performance of data analysis to predict vitamin D deficiency using three different regression approaches and to evaluate the usefulness of incorporating machine learning algorithms into the data analysis in a clinical setting. Methods: We included 221 patients from our hypertension unit, whose data were collected from electronic records dated between 2006 and 2017. We used classical stepwise logistic regression, and two machine learning methods [least absolute shrinkage and selection operator (LASSO) and elastic net]. We assessed the performance of these three algorithms in terms of sensitivity, specificity, misclassification error, and area under the curve (AUC). Results: LASSO and elastic net regression performed better than logistic regression in terms of AUC, which was significantly better in both penalized methods, with AUC = 0.76 and AUC = 0.74 for elastic net and LASSO, respectively, than in logistic regression, with AUC = 0.64. In terms of misclassification rate, elastic net (18%) outperformed LASSO (22%) and logistic regression (25%). Conclusion: Compared with a classical logistic regression approach, penalized methods were found to have better performance in predicting vitamin D deficiency. The use of machine learning algorithms such as LASSO and elastic net may significantly improve the prediction of vitamin D deficiency in a hypertensive obese population.
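For readers comparing the two penalized methods, the difference lies only in the penalty term. The objective below uses a common (glmnet-style) parametrization of penalized logistic regression; it is given as background under that assumption and is not necessarily the exact formulation used in the study. Here $y_i \in \{0,1\}$ is the outcome, $x_i$ the predictor vector, $\lambda \ge 0$ the penalty strength, and $\alpha \in [0,1]$ the mixing parameter.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n}\Big[\, y_i\,(\beta_0 + x_i^{\top}\beta)
      - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
  + \lambda\Big[\, \alpha\,\lVert\beta\rVert_1
      + \tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2 \Big]
```

Setting $\alpha = 1$ gives the LASSO, which can shrink coefficients exactly to zero and therefore performs variable selection; $0 < \alpha < 1$ gives the elastic net, which adds a ridge component that tends to keep groups of correlated predictors together.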

  • Research Article
  • Citations: 40
  • 10.1016/j.msard.2021.102989
Prediction of unenhanced lesion evolution in multiple sclerosis using radiomics-based models: a machine learning approach
  • May 4, 2021
  • Multiple Sclerosis and Related Disorders
  • Yuling Peng + 10 more


  • Research Article
  • Citations: 8
  • 10.3389/fnins.2022.1035153
Combined brain network topological metrics with machine learning algorithms to identify essential tremor.
  • Nov 2, 2022
  • Frontiers in Neuroscience
  • Qin Li + 15 more

Essential tremor (ET) is a common movement syndrome, and its pathogenic mechanisms, especially the brain network topological changes, are still unclear. The combination of graph theory (GT) analysis with machine learning (ML) algorithms provides a promising way to identify ET from healthy controls (HCs) at the individual level, and further helps to reveal the topological pathogenesis in ET. Resting-state functional magnetic resonance imaging (fMRI) data were obtained from 101 ET and 105 HCs. The topological properties were analyzed by using GT analysis, and the topological metrics under every single threshold and the area under the curve (AUC) of all thresholds were used as features. Then a Mann-Whitney U-test and least absolute shrinkage and selection operator (LASSO) were used for feature dimensionality reduction. Four ML algorithms were adopted to identify ET from HCs. The mean accuracy, mean balanced accuracy, mean sensitivity, mean specificity, and mean AUC were used to evaluate the classification performance. In addition, correlation analysis was carried out between selected topological features and clinical tremor characteristics. All classifiers achieved good classification performance. The mean accuracy of support vector machine (SVM), logistic regression (LR), random forest (RF), and naïve Bayes (NB) was 84.65, 85.03, 84.85, and 76.31%, respectively. LR classifier achieved the best classification performance with 85.03% mean accuracy, 83.97% sensitivity, and an AUC of 0.924. Correlation analysis results showed that 2 topological features negatively and 1 positively correlated with tremor severity. These results demonstrated that combining topological metrics with ML algorithms could not only achieve high classification accuracy for discriminating ET from HCs but also help us to reveal the potential topological pathogenesis of ET.

  • Research Article
  • Citations: 20
  • 10.3390/brainsci13020175
Parkinson’s Disease Gene Biomarkers Screened by the LASSO and SVM Algorithms
  • Jan 20, 2023
  • Brain Sciences
  • Yiwen Bao + 4 more

Parkinson’s disease (PD) is a common progressive neurodegenerative disorder. Various evidence has revealed the possible penetration of peripheral immune cells in the substantia nigra, which may be essential for PD. Our study uses machine learning (ML) to screen for potential PD genetic biomarkers. Gene expression profiles were screened from the Gene Expression Omnibus (GEO). Differentially expressed genes (DEGs) were selected for the enrichment analysis. A protein–protein interaction (PPI) network was built with the STRING database (Search Tool for the Retrieval of Interacting Genes), and two ML approaches, namely least absolute shrinkage and selection operator (LASSO) and support vector machine recursive feature elimination (SVM-RFE), were employed to identify candidate genes. The external validation dataset further tested the expression degree and diagnostic value of candidate biomarkers. To assess the validity of the diagnosis, we determined the receiver operating characteristic (ROC) curve. The CIBERSORT deconvolution tool was employed to evaluate the composition of immune cells, and we performed correlation analyses on the basis of the training dataset. Twenty-seven DEGs were screened in the PD and control samples. Our results from the enrichment analysis showed a close association with inflammatory and immune-associated diseases. The LASSO and SVM algorithms screened eight and six characteristic genes, respectively. AGTR1, GBE1, TPBG, and HSPA6 are overlapping hub genes strongly related to PD. The area under the ROC curve (AUC) values, including AGTR1 (AUC = 0.933), GBE1 (AUC = 0.967), TPBG (AUC = 0.767), and HSPA6 (AUC = 0.633), suggested that these genes have good diagnostic value, and these genes were significantly associated with the degree of immune cell infiltration. AGTR1, GBE1, TPBG, and HSPA6 were identified as potential biomarkers in the diagnosis of PD and provide a novel viewpoint for further study on PD immune mechanism and therapy.

  • Research Article
  • Citations: 14
  • 10.3389/fphar.2022.834743
Identifying Patients at Risk of Acute Kidney Injury Among Medicare Beneficiaries With Type 2 Diabetes Initiating SGLT2 Inhibitors: A Machine Learning Approach.
  • Mar 11, 2022
  • Frontiers in Pharmacology
  • Lanting Yang + 6 more

Introduction: To predict acute kidney injury (AKI) risk in patients with type 2 diabetes (T2D) prescribed sodium-glucose cotransporter 2 inhibitors (SGLT2i). Methods: Using a 5% random sample of Medicare claims data, we identified 17,694 patients who filled ≥1 prescriptions for canagliflozin, dapagliflozin and empagliflozin in 2013–2016. The cohort was split randomly and equally into training and testing sets. We measured 65 predictor candidates using claims data from the year prior to SGLT2i initiation. We then applied three machine learning models, including random forests (RF), elastic net and least absolute shrinkage and selection operator (LASSO) for risk prediction. Results: The incidence rate of AKI was 1.1% over a median 1.5-year follow-up. Among three machine learning methods, RF produced the best prediction (C-statistic = 0.72), followed by LASSO and elastic net (both C-statistics = 0.69). Among individuals classified in the top 10% of the RF risk score (i.e., high risk group), the actual incidence rate of AKI was as high as 3.7%. In the logistic regression model including 14 important risk factors selected by LASSO, use of loop diuretics [adjusted odds ratio (95% confidence interval): 3.72 (2.44–5.76)] had the strongest association with AKI incidence. Discussion: Our machine learning model efficiently identified patients at risk of AKI among Medicare beneficiaries with T2D undergoing SGLT2i treatment.

  • Research Article
  • Citations: 1
  • 10.11817/j.issn.1672-7347.2024.230307
Screening immune-related key genes in Parkinson's disease based on WGCNA and machine learning
  • Feb 28, 2024
  • Journal of Central South University Medical Sciences
  • 一铭 黄 + 6 more

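Objective: Abnormal activation of the immune system and inflammatory responses play an important role in the pathogenesis of Parkinson's disease. However, the specific roles and mechanisms of immune-related key genes in the onset and progression of Parkinson's disease remain poorly understood. This study aimed to screen immune-related key genes in Parkinson's disease using weighted gene co-expression network analysis (WGCNA) and machine learning. Methods: Gene chip data were downloaded from the Gene Expression Omnibus (GEO) database, and WGCNA was used to identify important gene modules associated with Parkinson's disease. The genes in the important modules were exported, and a Venn diagram of the important Parkinson's disease-related genes and immune-related genes was drawn to identify the immune-related genes in Parkinson's disease. Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis were used to examine the functions of the immune-related genes and the signaling pathways in which they participate. Immune cell infiltration analysis was performed with the CIBERSORT package in R. A bioinformatics method and 3 machine learning methods [least absolute shrinkage and selection operator (LASSO) regression, random forest (RF), and support vector machine (SVM)] were used to further screen the immune-related genes in Parkinson's disease. A Venn diagram of the differentially expressed genes selected by the 4 methods was drawn, and the intersecting genes were taken as hub genes. The STRING database was searched for downstream proteins of the Parkinson's disease hub gene, and a protein-protein interaction network was drawn. Results: A total of 218 immune-related genes were identified among the important module genes for Parkinson's disease, of which 45 were upregulated and 50 were downregulated. Enrichment analysis showed that the 218 genes were mainly enriched in immune system responses to foreign substances and in viral infection pathways. Immune infiltration analysis showed that CD4+ T cells, NK cells, CD8+ T cells, and B cells had relatively high infiltration percentages in samples from patients with Parkinson's disease, and that resting NK cells and resting CD4+ T cells were significantly infiltrated in these samples. The hub gene identified by the 4 methods was ANK1. Protein interaction network analysis of the intersecting gene showed that the 11 proteins translated and expressed from the ANK1 gene are mainly involved in signal transduction, regulation of iron homeostasis, and immune system activation. Conclusion: Using WGCNA and machine learning methods, ANK1 was identified as an immune-related key gene in Parkinson's disease; this gene may become a candidate target for the diagnosis and treatment of Parkinson's disease.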

  • Research Article
  • Citations: 9
  • 10.3389/fmolb.2022.913325
Machine learning based on metabolomics reveals potential targets and biomarkers for primary Sjogren's syndrome.
  • Sep 5, 2022
  • Frontiers in Molecular Biosciences
  • Kai Wang + 4 more

Background: Using machine learning based on metabolomics, this study aimed to construct an effective primary Sjogren’s syndrome (pSS) diagnostics model and reveal the potential targets and biomarkers of pSS. Methods: From a total of 39 patients with pSS and 38 healthy controls (HCs), serum specimens were collected. The samples were analyzed by ultra-high-performance liquid chromatography coupled with high-resolution mass spectrometry. Three machine learning algorithms, including the least absolute shrinkage and selection operator (LASSO), random forest (RF), and extreme gradient boosting (XGBoost), were used to build the pSS diagnosis models. Afterward, four machine learning methods were used to reduce the dimensionality of the metabolomics data. Finally, metabolites with significant differences were screened and pathway analysis was conducted. Results: The area under the curve (AUC), sensitivity, and specificity of LASSO, RF and XGBoost test set all reached 1.00. Orthogonal partial least squares discriminant analysis was used to classify the metabolomics data. By combining the results of the univariate false discovery rate and the importance of the variable in projection, we identified 21 significantly different metabolites. Using these 21 metabolites for diagnostic modeling, the AUC, sensitivity, and specificity of LASSO, RF, and XGBoost all reached 1.00. Metabolic pathway analysis revealed that these 21 metabolites are highly correlated with amino acid and lipid metabolisms. On the basis of 21 metabolites, we screened the important variables in the models. Further, five common variables were obtained by intersecting the important variables of three models. Based on these five common variables, the AUC, sensitivity, and specificity of LASSO, RF, and XGBoost all reached 1.00. 2-Hydroxypalmitic acid, L-carnitine and cyclic AMP were found to be potential targets and specific biomarkers for pSS. 
Conclusion: The combination of machine learning and metabolomics can accurately distinguish between patients with pSS and HCs. 2-Hydroxypalmitic acid, L-carnitine and cyclic AMP were potential targets and biomarkers for pSS.

  • Research Article
  • Citations: 4
  • 10.1016/j.acra.2023.10.040
Machine Learning Methods Based on CT Features Differentiate G1/G2 From G3 Pancreatic Neuroendocrine Tumors
  • Dec 4, 2023
  • Academic radiology
  • Hai-Yan Chen + 10 more

