Articles published on Test Set
- New
- Research Article
- 10.1186/s13244-025-02131-1
- Nov 9, 2025
- Insights into Imaging
- Chia-Jung Liu + 10 more
Abstract Objectives Differentiating between nontuberculous mycobacteria (NTM) pulmonary disease (NTM-PD) and colonization (NTM-PC) is clinically important but difficult. It remains unknown whether artificial intelligence utilizing clinical data and chest CT images could address this clinical problem. Materials and methods Patients were retrospectively recruited with NTM isolation from respiratory specimens in two hospitals. Their disease or colonization status was determined by three NTM experts. We developed a multimodal deep learning model named NTMNet, which integrates chest CT scans and clinical data (including age, sex, acid-fast smear [AFS] results, and mycobacterial species) to predict NTM disease status. The performance of NTMNet was evaluated on both internal and external test sets. Results A total of 324 NTM-PC patients and 285 NTM-PD patients were included. Among the internal and external test sets, the area under the receiver operating characteristic curve (AUC) for predicting NTM disease status using CT imaging was 0.73 (95% CI: 0.62–0.82) and 0.78 (95% CI: 0.75–0.83), respectively. When imaging data were integrated with clinical information, our NTMNet model achieved AUC values of 0.85 (95% CI: 0.80–0.93) and 0.82 (95% CI: 0.78–0.89), respectively. Furthermore, our NTMNet model demonstrated comparable accuracy to that of three experienced pulmonologists in determining NTM disease status in the reader study. Conclusion Our multimodal NTMNet exhibited satisfactory performance in distinguishing disease status among patients with respiratory NTM isolates. This deep learning-based model has the potential to assist physicians in clinical management, achieving diagnostic accuracy comparable to that of pulmonologists. 
Critical relevance statement A deep learning model leveraging chest computed tomography images and clinical data effectively differentiated NTM disease status, achieving a classification accuracy comparable to that of pulmonologists and demonstrating its potential to support accurate NTM diagnosis in clinical settings. Key Points Accurately distinguishing nontuberculous mycobacteria (NTM) disease status is clinically important but challenging. The NTMNet model effectively differentiated the NTM disease status and matched the performance of the pulmonologists. The NTMNet model could be a potential diagnostic tool for patients with respiratory NTM isolates.
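The AUC values reported above can be read as the probability that a randomly chosen disease case receives a higher score than a randomly chosen colonization case. The following is a minimal sketch of that pairwise definition; the function name `auc` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = disease, 0 = colonization.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
```

The pairwise form is quadratic in the number of cases, which is fine for cohorts of a few hundred patients; a rank-based implementation would be the usual choice for larger sets.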
- New
- Research Article
- 10.1186/s12885-025-14886-3
- Nov 8, 2025
- BMC cancer
- Yang Yu + 1 more
Identifying fatty acid metabolism (FAM)-related molecular signatures to construct a prognostic model for multiple myeloma (MM) patients. Transcriptomic profiles and clinical data from MM patients were retrieved from the GEO and MMRF databases. FAM-related genes were screened by WGCNA, and univariate Cox regression analysis was performed to identify genes associated with survival. LASSO regression analysis was then performed to construct a FAM-related gene signature and risk scores. A clinical nomogram incorporating risk scores was developed. Immune microenvironment analysis (CIBERSORT) and functional enrichment (GO/KEGG/GSVA) were performed to characterize risk groups. Quantitative PCR validated hub gene expression in bone marrow mononuclear cells (BMMCs) from 10 newly diagnosed MM patients and 10 healthy donors. In vitro functional assays (CCK-8 proliferation, flow cytometry cell cycle analysis) assessed the impact of CCNA2/KIF11/NUSAP1 knockdown in MM cell lines. We identified 37 prognostic FAM-related genes (FMGs). Among them, 16 genes were used to construct the LASSO regression model. KM analysis showed that high-risk patients had poorer prognosis (training set: P < 0.001; test set: P < 0.05). The area under the ROC curve was 0.787. Immunoscape analysis showed that high-risk patients had an immunosuppressive microenvironment. Functional enrichment studies confirmed that high-risk patients had increased abnormalities in cell cycle, aging and metabolic processes. The qRT-PCR analysis revealed that CCNA2, KIF11, and NUSAP1 were up-regulated in MM patients. Knockdown of CCNA2, KIF11, and NUSAP1 caused significant cell cycle arrest and decreased the proliferation ability of MM cells. We identified 37 survival-associated FMGs in MM patients, and verified the effects of CCNA2, KIF11, and NUSAP1 on the cell cycle and proliferation of MM cells. Our results also suggest that survival-associated traits based on these genes are potentially robust prognostic biomarkers for MM patients.
- New
- Research Article
- 10.1088/1361-6560/ae1803
- Nov 7, 2025
- Physics in Medicine & Biology
- Lina Mekki + 2 more
Objective. To develop and evaluate a deep reinforcement learning (RL) framework for rapid and automatic machine parameter optimization of volumetric modulated arc therapy (VMAT) treatment plans for localized prostate cancer. Approach. A multi-task policy network combining convolution and long short-term memory was trained to sequentially predict the set of actions on the dose rate and multi-leaf collimator positions over the range of two arcs. The network uses as input the cumulative dose grid at the current gantry angle, contours of the planning target volume (PTV) and organs at risk, and the set of machine parameters at all preceding gantry angles. The method was evaluated on a set of 15 localized prostate cancer patients for a prescription dose of 60 Gy in 20 fractions. For each case, the final state dose distribution was compared against clinical plans. For seamless integration with the clinical workflow, the proposed model was integrated into a clinical treatment planning system (TPS), enabling dosimetric review and final plan adjustments. Main results. The RL framework produced deliverable dual-arc VMAT plans in an average of 20.7 ± 5.0 s over the test set. Dosimetric comparison to clinical plans showed no statistically significant differences for the mean rectum dose as well as for the bladder V6160 Gy, indicating that the RL model was as efficient in sparing these structures as human planners. While the approach showed limitations in terms of PTV coverage and maximum body dose, our proposed integration to TPS showed the RL plans could be automatically refined to clinical quality in an additional 83.8 ± 7.2 s. Significance. The accuracy and fast run time of the approach show the potential of the framework to significantly streamline VMAT treatment planning and enable adaptive radiation therapy.
- New
- Research Article
- 10.1186/s12885-025-15034-7
- Nov 7, 2025
- BMC cancer
- Qi Cai + 12 more
Acute promyelocytic leukemia (APL), a high-risk subtype of acute myeloid leukemia, necessitates rapid diagnosis upon hospital admission to mitigate early mortality. Current diagnostic approaches relying on time-consuming genetic testing or morphological expertise are particularly challenging in resource-limited settings. This study introduces a novel machine learning approach leveraging routine lab data to enable immediate APL suspicion, offering a new diagnostic possibility for under-resourced hospitals. We developed a two-stage machine learning model using multi-center retrospective data. The cohort included 94 confirmed APL patients (2020-2024) from three tertiary hospitals, with an external validation set (n = 541) from an independent center. Using four VGG-16 networks, we extracted APL-specific 3D scatterplot features from DIFF and WNB channels of routine blood tests. These features were then fed into an optimized random forest classifier-scatterplot (RFC-S) model, refined via recursive feature elimination and threshold tuning. The RFC-S model achieved near-perfect discrimination, with an AUC of 0.9893 in the test set and 0.9979 in external validation. It maintained 98.15% sensitivity and 95.52% specificity, outperforming conventional methods. SHAP analysis confirmed that key scattergram-derived features (e.g., N_APL_Ratio_YZ) drove predictions. Critically, the model requires no additional tests, making it deployable even in low-resource clinics. The RFC-S model represents an innovative approach to APL screening by combining deep learning-derived scattergram features with routine blood parameters. This two-stage methodology achieves high diagnostic accuracy (AUC > 0.98) while maintaining computational efficiency. Importantly, the model's ability to utilize existing laboratory data without requiring additional tests makes it particularly valuable for resource-constrained settings where access to genetic testing or hematological expertise may be limited.
Our findings suggest this approach could serve as a practical tool for early APL identification, potentially reducing diagnostic delays in diverse clinical environments.
- New
- Research Article
- 10.1002/jcu.70129
- Nov 7, 2025
- Journal of clinical ultrasound : JCU
- Kemal Panc + 5 more
Accurate grading of prostate cancer is critical for treatment strategies and risk stratification. This study aims to develop a machine learning (ML) model integrating Dynamic Contrast-Enhanced Magnetic Resonance Imaging (DCE-MRI) pharmacokinetic parameters with Prostate-Specific Antigen (PSA) values to predict ISUP grade metastatic risk groups. This retrospective study included 102 patients with histologically confirmed prostate cancer. DCE-MRI pharmacokinetic parameters (Ktrans, Kep, Ve, CER, MaxSlope, IAUGC) were standardized. The dataset was balanced using the Synthetic Minority Oversampling Technique and split into training, validation, and test sets. ML models, including Random Forest, were evaluated using Area Under the Curve (AUC) values. The Random Forest classifier achieved the highest performance, with an AUC of 0.92. Precision-recall analysis identified an optimal threshold of 0.3, balancing sensitivity and specificity for high-risk group detection. SHAP analysis highlighted PSA, MaxSlope, and Kep as key predictors contributing to model accuracy. Integrating DCE-MRI parameters with PSA values using ML algorithms enhances the prediction of ISUP grade metastatic risk groups. This method provides a robust tool for metastasis screening and personalized treatment in prostate cancer.
- New
- Research Article
- 10.1021/acs.jctc.5c01222
- Nov 7, 2025
- Journal of chemical theory and computation
- Lukas Hasecke + 1 more
In this contribution we investigate how far multicomponent density functional theory (DFT) results can be improved by the admixture of Møller-Plesset (MP) perturbation theory electron-proton correlation energies. Three formulations are explored, based off the popular double-hybrid functionals B2PLYP, DSD-PBEP86 and PBEQIDH. Partial use of the PA23 proton binding affinities data set is made to parametrize the ratio in the DFT/MP2 correlation energies. The resulting models are evaluated on a separate set of titratable molecules. The combination of nuclear electronic orbital (NEO) DFT and MP2 electron-proton correlation leads up to a 2-fold reduction in the root-mean-square deviation (RMSD) compared to standard NEO-DFT, a trend that is confirmed in the independent test set. We apply the parametrized NEO-B2PLYP model to compute the energetics of protonated water hexamers as well as a challenging example for proton dynamics, a crown ether molecule. In the latter case we compare the energetics of localized vs shared proton configurations. Overall, a ratio of about 0.8:0.2 (DFT/MP2) in the electron-proton correlation delivers a robust improvement across the models, even with variations in the basis sets used and the type of chemical bonds investigated.
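One way to read the reported 0.8:0.2 (DFT/MP2) ratio is as a weighted sum of the two electron-proton correlation energies. The sketch below assumes that simple linear form; the function name `mixed_epc_energy`, the default weight, and the example energies are hypothetical, and the actual parametrization in the paper may differ in detail.

```python
def mixed_epc_energy(e_epc_dft, e_epc_mp2, w_dft=0.8):
    """Linear admixture of DFT and MP2 electron-proton correlation energies.

    Assumption: the two weights sum to one, so the MP2 weight is 1 - w_dft.
    Energies may be in any unit as long as both use the same one.
    """
    if not 0.0 <= w_dft <= 1.0:
        raise ValueError("w_dft must lie in [0, 1]")
    return w_dft * e_epc_dft + (1.0 - w_dft) * e_epc_mp2


# Hypothetical correlation energies in hartree, for illustration only.
example_energy = mixed_epc_energy(-0.040, -0.030)
```

Setting `w_dft=1.0` recovers the pure DFT correlation term, which makes it easy to compare a mixed model against the standard NEO-DFT baseline in the same script.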
- New
- Research Article
- 10.1097/md.0000000000045703
- Nov 7, 2025
- Medicine
- Mengru Li + 4 more
Endometriosis is a long-term health problem that affects a significant number of women globally. Among the various forms of endometriosis, ovarian endometriosis (OEM) is the most prevalent. This research aimed to investigate the factors contributing to the recurrence of OEM after laparoscopic conservative surgery and develop a predictive model utilizing machine learning techniques. The clinical data of 338 patients diagnosed with OEM who underwent laparoscopic conservative surgery at Wuhan University Renmin Hospital between January 2020 and January 2023 were retrospectively analyzed. During a 2-year follow-up period, patients were categorized into either the recurrence group or the non-recurrence group based on the incidence of disease recurrence. Chi-square and Spearman analysis were implemented to identify the factors related to postoperative recurrence in patients with OEM. Statistically significant factors were selected to construct the correlation models. Four algorithms were used in model construction: Random Forest, Gaussian Process, Extreme Gradient Boosting, and Multilayer Perceptron. The primary metric for evaluating model performance was the area under the receiver operating characteristic curve. Sixteen variables were associated with postoperative recurrences. The Gaussian Process had the best predictive power and the area under the receiver operating characteristic curve of the test set was 0.90. The test dataset for the Gaussian Process revealed a sensitivity of 0.75, specificity of 0.90, positive predictive value of 0.46, negative predictive value of 0.97, and accuracy rate of 0.88. The predictive model for the Gaussian Process developed in this study effectively assessed the risk of postoperative recurrence in patients with OEM.
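The gap between the reported specificity (0.90) and positive predictive value (0.46) follows from how few recurrences a test set contains. The sketch below computes the five reported metrics from a confusion matrix; the counts are hypothetical, chosen only because they reproduce the rounded figures, and are not taken from the paper.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts: 8 recurrences and 70 non-recurrences in the test set.
metrics = diagnostic_metrics(tp=6, fp=7, tn=63, fn=2)
rounded = {name: round(value, 2) for name, value in metrics.items()}
```

With only 8 positive cases, 7 false positives are enough to pull the positive predictive value below 0.5 even though 90% of negatives are classified correctly.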
- New
- Research Article
- 10.1371/journal.pone.0333351
- Nov 6, 2025
- PloS one
- Min Woo Kang + 2 more
Infective endocarditis (IE) carries high in-hospital mortality, particularly among intensive care unit (ICU) patients. The predictive role of blood culture positivity in these patients remains unclear. We analyzed 484 adult IE patients from the Medical Information Mart for Intensive Care III (MIMIC-III) database, divided into training (n = 339) and testing (n = 145) cohorts. A suite of demographic, clinical, laboratory, and blood culture variables was used to develop tree-based machine learning models. Random Forest (RF) and Extreme Gradient Boosting (XGB) emerged as top performers and were combined into an ensemble model. SHapley Additive exPlanations (SHAP) quantified variable importance, while the Generative Adversarial Nets for Inference of Individualized Treatment Effects (GANITE) model assessed the average treatment effect (ATE) and conditional treatment effects (CATE) of blood culture positivity on in-hospital mortality across various clinical subgroups. The ensemble model demonstrated robust performance with an area under the receiver operating characteristic curve (AUROC) of 0.826 and an accuracy of 0.821 on the test set. Blood culture positivity consistently ranked among the top predictors of mortality. SHAP analysis revealed that the presence of bacteremia increased the predicted probability of in-hospital mortality. Specifically, the GANITE model estimated that blood culture positivity raised mortality by 0.9% (95% confidence interval [CI]: -0.9% to 2.6%) in the training set, 7.4% (95% CI: 4.3% to 10.4%) in the test set, and 2.8% (95% CI: 1.2% to 4.4%) overall. Furthermore, CATE analysis highlighted that the adverse impact of blood culture positivity was significantly more pronounced in patients aged 60 years and older, those with systolic blood pressure below 100 mmHg, and in certain endocarditis subtypes. 
Blood culture positivity at ICU admission is associated with a modest yet clinically significant increase in in-hospital mortality among IE patients. The application of advanced machine learning and causal inference models enhances risk stratification and may inform more targeted clinical interventions in this high-risk group.
- New
- Research Article
- 10.3389/fphar.2025.1683708
- Nov 6, 2025
- Frontiers in Pharmacology
- Lian Li + 8 more
Objective This study utilizes real-world data from primary membranous nephropathy (PMN) patients to preliminarily develop a venous thromboembolism (VTE) risk prediction model with machine learning. The aim is to improve the rational use of prophylactic anticoagulant therapy by predicting VTE risk in these patients. Methods We collected diagnostic and treatment data for PMN patients hospitalized at Sichuan Provincial People’s Hospital from 1 January 2018 to 30 September 2024. The data were divided into training and test sets at an 8:2 ratio and then processed using combinations of three imputation methods, three sampling methods, and three feature selection methods. After preprocessing, fourteen machine learning algorithms were employed to develop a predictive model for VTE risk in PMN patients. The SHapley Additive exPlanation (SHAP) method was used to interpret the contribution of outcome features. Finally, a VTE risk prediction tool for PMN patients was constructed using Streamlit. Results A total of 643 patients with PMN were included in the study, of whom 93 developed VTE. Among the 504 models constructed, the NGBoost model, which incorporated imputation by K-Nearest Neighbor, sampling by Borderline-SMOTE, and feature selection by Frequency-based Selection, was identified as the optimal model, achieving an area under the curve (AUC) of 0.911. The optimal model included ten features: D-dimer (DD), Fibrin Degradation Products (FDP) >5 mg/L, international normalized ratio (INR) of prothrombin, Recurrent nephrotic syndrome (RNS), cholinesterase (CHE), Urinary Microalbumin to Creatinine Ratio (umALB/Ucr), statins, antithrombin III (AT III) activity, albumin, and anti-phospholipase A2 receptor antibody (aPLA2Rab). Finally, an online predictive tool based on the optimal model was developed to provide real-time individualized VTE risk predictions for PMN patients.
Conclusion This study developed a personalized risk prediction model for VTE in PMN patients using machine learning techniques. Additionally, a web-based tool for this predictive model was created. The model demonstrates strong predictive performance and can assist in clinical decision-making for the prevention and treatment of VTE in PMN patients.
- New
- Research Article
- 10.17219/acem/202947
- Nov 6, 2025
- Advances in clinical and experimental medicine : official organ Wroclaw Medical University
- Heng Zhang + 5 more
Breast cancer (BC) is now the most common malignancy in women. Early detection and precise diagnosis are essential for improving survival. This study aimed to develop an integrated computer-aided diagnosis (CAD) system that automatically detects, segments and classifies lesions in mammographic images, thereby aiding BC diagnosis. We adopted YOLOv5 as the object-detection backbone and used the Curated Breast Imaging Subset of the Digital Database for Screening Mammography (CBIS-DDSM). Data augmentation (random rotations, crops and flips) increased the dataset to 5,801 images, which were randomly split into training, validation and test sets (7 : 2 : 1). Lesion-classification performance was evaluated with the area under the receiver operating characteristic (ROC) curve (AUC), precision, recall, and mean average precision at an intersection-over-union (IoU) threshold of 0.5 (mAP@0.5). The CAD system yielded an mAP@0.5 of 0.417 and an F1-score of 0.46 for lesion detection, achieved an AUC of 0.90 for distinguishing benign from malignant lesions, and processed images at 65 fps. The integrated CAD system combines rapid detection and classification with high accuracy, underscoring its strong clinical value.
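In object detection, mAP@0.5 conventionally counts a predicted box as a correct detection when its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch of that overlap test; the `(x1, y1, x2, y2)` box format, the function names, and the example coordinates are assumptions for illustration, not details from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, truth_box, threshold=0.5):
    """True when the predicted box overlaps the ground truth enough to count at mAP@0.5."""
    return iou(pred_box, truth_box) >= threshold


# Hypothetical lesion boxes in pixel coordinates.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

Averaging precision over recall levels with this matching rule, and then over classes, gives the mAP figure; that averaging step is omitted here.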
- New
- Research Article
- 10.1007/s10278-025-01732-y
- Nov 6, 2025
- Journal of imaging informatics in medicine
- Guangyu Wei + 2 more
Precise segmentation of continuous vessels in X-ray coronary angiography (XCA) image sequences is pivotal for improving the diagnosis and treatment of coronary artery disease. However, motion artifacts and shadowing in XCA images significantly complicate accurate vessel segmentation. To address these challenges, we propose FlowVM-Net, a dynamic information-enhanced encoder-decoder architecture. The model incorporates an optical flow generation module to create temporal information across image sequences. Additionally, we introduce a wavelet dilated convolution visual state space model block based on VMamba as a fundamental component of the encoder-decoder structure. An attention-based optical flow feature fusion module is designed to effectively integrate sequential spatial features and temporal information. Furthermore, a composite loss function, including boundary difference over union loss, is employed to enhance the accuracy of vessel edge and thin vessel segmentation. We evaluate FlowVM-Net on a dataset of 542 samples, achieving a DSC of 85.17%, a sensitivity of 85.15%, and a quality score of 90.49% on the test set. The proposed network effectively preserves vessel continuity, accurately segments thin vessels, and demonstrates the potential of leveraging dynamic context from XCA sequences for improved coronary artery segmentation. The code for this project is available at: https://github.com/wgyhhhh/FlowVM-Net .
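The Dice similarity coefficient (DSC) and sensitivity reported above are both overlap measures between a predicted vessel mask and a reference mask. A minimal sketch on flattened binary masks follows; the function names and toy masks are illustrative assumptions and are independent of the FlowVM-Net repository.

```python
def dice(pred, truth):
    """Dice similarity coefficient for two equal-length binary masks (lists of 0/1)."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Convention assumed here: two empty masks agree perfectly.
    return 2.0 * intersection / total if total else 1.0


def sensitivity(pred, truth):
    """Fraction of true vessel pixels that the prediction recovers."""
    true_positive = sum(p * t for p, t in zip(pred, truth))
    positives = sum(truth)
    return true_positive / positives if positives else 1.0


# Hypothetical 2x2 masks flattened row by row.
toy_pred = [1, 1, 0, 0]
toy_truth = [1, 0, 1, 0]
toy_dice = dice(toy_pred, toy_truth)
toy_sensitivity = sensitivity(toy_pred, toy_truth)
```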
- New
- Research Article
- 10.3390/ma18215054
- Nov 6, 2025
- Materials
- Zhe Yang + 5 more
Line heating processes play a significant role in the fabrication of structural steel components, particularly in industries such as shipbuilding, aerospace, and automotive manufacturing, where dimensional accuracy and minimal defects are critical. Traditional methods, such as finite element method (FEM) simulations, offer high-fidelity predictions but are hindered by prohibitive computational latency and the need for case-specific re-meshing. This study presents a physics-aware, data-driven neural network that delivers fast, high-fidelity temperature predictions across a broad operating envelope. Each spatiotemporal point is mapped to a one-dimensional feature vector. This vector encodes thermophysical properties, boundary influence factors, heat-source variables, and timing variables. All geometric features are expressed in a path-aligned local coordinate frame, and the inputs are appropriately normalized and nondimensionalized. A lightweight multilayer perceptron (MLP) is trained on FEM-generated induction heating data for steel plates with varying thickness and randomized paths. On a hold-out test set, the model achieves MAE = 0.60 °C, RMSE = 1.27 °C, and R² = 0.995, with a narrow bootstrapped 99.7% error interval (−0.203 to −0.063 °C). Two independent experiments on an integrated heating and mechanical rolling forming (IHMRF) platform show strong agreement with thermocouple measurements and demonstrate generalization to a plate size not seen during training. Inference is approximately five orders of magnitude (~10⁵) faster than FEM, enabling near-real-time full-field reconstructions or targeted spatiotemporal queries. The approach supports rapid parameter optimization and advances intelligent line heating operations.
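The three hold-out metrics quoted above (MAE, RMSE, and the coefficient of determination) can be computed from paired reference and predicted temperatures. The sketch below uses the standard definitions; the function name and the toy temperatures are assumptions for illustration and do not come from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired reference and predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Hypothetical FEM reference temperatures and model predictions in degrees Celsius.
fem_temps = [100.0, 200.0, 300.0]
mlp_temps = [101.0, 199.0, 302.0]
mae, rmse, r2 = regression_metrics(fem_temps, mlp_temps)
```

RMSE exceeds MAE whenever the errors are uneven in size, so the reported pair (0.60 °C and 1.27 °C) suggests a small share of points with comparatively large errors.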
- New
- Research Article
- 10.1186/s13020-025-01246-3
- Nov 6, 2025
- Chinese medicine
- Yilin Wang + 8 more
To develop and validate a panel of serum IgG N-glycan biomarkers for both the diagnosis of rheumatoid arthritis (RA) and the differentiation of Traditional Chinese Medicine (TCM) syndromes in RA patients. We conducted a case-control study involving 105 patients meeting the 2010 American College of Rheumatology/European Alliance against Rheumatism RA classification criteria and 79 healthy controls. RA patients were classified according to TCM principles into cold and heat patterns. Serum IgG was enriched using titanium dioxide-porous graphitic carbon (TiO2-PGC) wafers and analyzed by high-performance liquid chromatography. IgG N-glycans were quantified using multiple reaction monitoring. Potential N-glycan biomarkers for RA diagnosis and TCM syndrome differentiation were identified and validated using multivariate data analysis. Orthogonal partial least squares discriminant analysis (OPLS-DA) identified 57 N-glycans (variable importance in projection > 1) that differentiated between RA cold pattern, heat pattern, and healthy controls. Through random forest machine learning and Kruskal-Wallis testing, we identified three acidic N-glycans (5_4_0_1-a, 5_4_0_2-a, and 5_4_0_2-b) as potential diagnostic biomarkers. In the training set, receiver operating characteristic analysis demonstrated that this three-N-glycan panel effectively distinguished RA patients from healthy controls (AUC 0.90), with particularly strong discrimination between RA heat pattern and healthy controls (AUC 0.99) and between RA cold pattern and healthy controls (AUC 0.84). The robust predictive performance was further validated in an independent test set. Additionally, we developed a logistic regression model for future clinical application in predicting both RA diagnosis and its heat/cold syndrome patterns. This glycomics-based approach identified and validated novel N-glycan biomarkers associated with both RA diagnosis and TCM syndrome differentiation. 
The combination of these N-glycan biomarkers and our diagnostic model offers a promising strategy for integrating modern diagnostic techniques with TCM classification in RA management.
- New
- Research Article
- 10.1186/s12903-025-07091-y
- Nov 5, 2025
- BMC oral health
- Shijie Zhou + 6 more
This study aimed to identify key prognostic variables and to develop and validate a clinical prediction model for pre-treatment assessment of tongue crib applicability. This retrospective study included 128 cases with anterior crossbite treated with tongue crib in mixed dentition. The total samples were categorized into applicable (n = 80, corrected within 6 months without relapse) and non-applicable (n = 48) groups. Cephalometric parameters were measured using Dolphin Imaging 11.8, with Python (3.9.12) for statistical analysis. Over 100 iterations, 80% of cases were randomly selected as the training set and the remaining 20% as the testing set to establish logistic regression models incorporating SNB-ANB, Wits, and APDI sagittal parameters as predictors. Three predictive equations of the form P = Exp(L)/(1 + Exp(L)), with a critical score of 0.5, underwent apparent validation on the training set and internal validation on the test set. The model demonstrating optimal validity and accuracy was selected to guide the clinical application. The total prediction accuracy on apparent validation (training set) and internal validation (test set) was 79.9% and 77.0% for the SNB-ANB model, 80.3% and 78.9% for the Wits model, and 81.1% and 80.5% for the APDI model, which was the highest. When the three prediction models moved from apparent validation to internal validation, the total prediction accuracy decreased slightly (-0.6% to -2.9%), with the APDI model exhibiting superior stability. The simplified APDI prediction model was P = Exp(L)/(1 + Exp(L)), L = -0.305(APDI) - 0.341(MP-FH) - 0.263(Co-Go) + 50.496, where P is the predicted probability that tongue crib treatment is applicable, with a critical score of 0.5 (that is, P > 0.5 was applicable, P < 0.5 was non-applicable). The prediction model effectively predicted the outcome of tongue crib treatment for anterior crossbite in mixed dentition.
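The APDI equation above can be evaluated directly. The sketch below uses the coefficients given in the abstract; the function names and the example measurements are hypothetical, and the units are assumed to match the cephalometric measurements used in the paper, which the abstract does not state.

```python
import math


def logistic(linear_score):
    """Numerically stable Exp(L) / (1 + Exp(L))."""
    if linear_score >= 0:
        return 1.0 / (1.0 + math.exp(-linear_score))
    e = math.exp(linear_score)
    return e / (1.0 + e)


def tongue_crib_probability(apdi, mp_fh, co_go):
    """Predicted probability that tongue crib treatment is applicable (APDI model)."""
    linear_score = -0.305 * apdi - 0.341 * mp_fh - 0.263 * co_go + 50.496
    return logistic(linear_score)


def is_applicable(apdi, mp_fh, co_go, cutoff=0.5):
    """Apply the critical score from the abstract: P > 0.5 means applicable."""
    return tongue_crib_probability(apdi, mp_fh, co_go) > cutoff


# Hypothetical measurements, for illustration only.
p_lower_apdi = tongue_crib_probability(80.0, 25.0, 50.0)
p_higher_apdi = tongue_crib_probability(90.0, 25.0, 50.0)
```

All three coefficients are negative, so larger APDI, MP-FH, or Co-Go values each lower the predicted probability of applicability.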
- New
- Research Article
- 10.3389/frai.2025.1673148
- Nov 5, 2025
- Frontiers in Artificial Intelligence
- Abhijai Sasikumar + 2 more
In Formula 1, which is among the most competitive motorsports in the world, the timing of a pit stop can make the difference between winning and losing a race. Conventional methods based on human judgment can be erratic, especially in rapidly changing race conditions. This work proposes a data-driven framework based on deep learning models to predict optimal pit stop timings using raw telemetry data extracted from the FastF1 API. To improve the robustness of the models, advanced preprocessing techniques such as normalization, imputation, and class balancing with Synthetic Minority Over-sampling Technique (SMOTE) were implemented. Five different deep learning architectures, including Bi-LSTM, TCN-GRU, GRU, InceptionTime, and CNN-BiLSTM, were trained and evaluated employing precision, recall, and F1-score as metrics. Of these, the Bi-LSTM model achieved the overall best performance, which can be explained by its capability to model long-range dependencies in both forward and backward temporal directions. The Bi-LSTM achieved a precision of 0.77, recall of 0.86, and an F1-score of 0.81 on the test set, demonstrating strong predictive accuracy under real-race conditions. Additionally, a historical race visualization interface was developed to visualize the model's predictions.
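The reported F1-score is the harmonic mean of precision and recall, so the three figures above can be checked against each other. A minimal sketch, with a hypothetical function name:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Values reported for the Bi-LSTM model on the test set.
bilstm_f1 = f1_score(0.77, 0.86)
```

Rounded to two decimals, `bilstm_f1` agrees with the 0.81 quoted in the abstract.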
- New
- Research Article
- 10.1148/ryai.240786
- Nov 5, 2025
- Radiology. Artificial intelligence
- Manli Wu + 25 more
Purpose To develop a multimodality deep learning model (Ovarian Cancer Network, OCNet) using dynamic contrast-enhanced US (CEUS) images for classifying adnexal lesions. Materials and Methods This retrospective study included patients with pathologically confirmed adnexal lesions detected on US across 14 hospitals in China between January 2018 and July 2023. Data were divided into the training set (n = 275), internal testing set (n = 57), and external testing set (n = 63). Two deep learning models (OCNetmanual and OCNetautomated) were developed and compared with Ovarian-Adnexal Reporting and Data System (O-RADS) US and Assessment of Different NEoplasia's in the adnexa (ADNEX) model. Diagnostic performances of radiologists with and without assistance of OCNet were also assessed. Results A total of 395 female patients (median age, 43 years [IQR, 31-55]) were included (252 benign and 143 malignant). OCNetmanual and OCNetautomated achieved an area under the receiver operating characteristic curve (AUC) of 0.94 (95% CI: 0.89, >0.99) and 0.91 (95% CI: 0.83, 0.99), respectively, outperforming O-RADS US (AUC: 0.79; 95% CI: 0.68, 0.89; P = .002 and P = .03) and ADNEX model (AUC: 0.86; 95% CI: 0.77, 0.95; P = .04 and P = .36). Additionally, the assistance of OCNet enhanced diagnostic performance for junior radiologists, improving the average of AUC from 0.86 to 0.94 and the average specificity from 52 to 73%. Conclusion The OCNet model achieved higher performance than O-RADS US and the ADNEX model for classifying adnexal lesions and improved diagnostic performance of junior radiologists. ©RSNA, 2025.
- New
- Research Article
- 10.2196/68558
- Nov 5, 2025
- JMIR medical informatics
- Areej Alhassan + 4 more
Extracting genetic phenotype mentions from clinical reports and normalizing them to standardized concepts within the human phenotype ontology are essential for consistent interpretation and representation of genetic conditions. This is particularly important in fields such as dysmorphology and plays a key role in advancing personalized health care. However, modern clinical named entity recognition methods face challenges in accurately identifying discontinuous mentions (ie, entity spans that are interrupted by unrelated words), which can be found in these clinical reports. This study aims to develop a system that can accurately extract and normalize genetic phenotypes, specifically from physical examination reports related to dysmorphology assessment. These mentions appear in both continuous and discontinuous lexical forms, with a focus on addressing challenging discontinuous entity spans. We introduce DiscHPO, a 2-phase pipeline consisting of a sequence-to-sequence named entity recognition model for span extraction, and an entity normalizer that uses a sentence transformer biencoder for candidate generation and a cross-encoder reranker for selecting the best candidate as the normalized concept. This system was tested as part of our participation in Track 3 of the BioCreative VIII shared task. For overall performance on the test set, the top-performing model for entity normalization achieved an F1-score of 0.723, while the best span extraction model reached an F1-score of 0.665. Both scores surpassed those of 2 baseline models using the same dataset, indicating superior efficacy in handling both continuous and discontinuous spans. On the validation set, we were able to demonstrate our system's ability to recognize these mentions, with the model achieving an F1-score of 0.631 for exact match on discontinuous spans only. The findings suggest that exact extraction of entity spans may not always be necessary for successful normalization. 
Partial mention matches can be sufficient as long as they capture the essential concept information, supporting the system's utility in clinical downstream tasks.
- New
- Research Article
- 10.3389/fmed.2025.1699842
- Nov 5, 2025
- Frontiers in Medicine
- Yi-Xiang Zhang + 7 more
Background Postoperative sleep disturbance (PSD) is a common complication following total knee arthroplasty (TKA), which negatively impacts patient recovery. Despite the critical need for early detection and management, there is limited research on predictive models for early PSD, particularly those integrating machine learning (ML) techniques. Objective This study aimed to develop a predictive model for early PSD following TKA using ML algorithms, identify key predictive factors, and provide an interpretable model to guide clinical decision-making. Methods The study included 505 patients who underwent TKA. Clinical data were collected at three stages: preoperatively, intraoperatively, and postoperatively. Ten ML models, including logistic regression, support vector machine (SVM), and XGBoost, were trained and evaluated using a test set. Performance metrics, including accuracy, sensitivity, specificity, and area under the curve (AUC), were used to evaluate the efficacy of the models. Key features influencing PSD were identified through SHapley Additive Explanations (SHAP) analysis to enhance model interpretability. Results Gradient Boosting Machine (GBM) demonstrated the highest AUC (0.906), accuracy (0.834), and sensitivity (0.879), establishing it as the optimal model for predicting PSD. Key predictors identified included age, smoking, living alone, living in the city, VAS 1 month postoperative, and anxiety 1 month postoperative. SHAP analysis revealed that postoperative VAS and age were the most influential factors in predicting PSD, with their impact varying based on individual patient data. Conclusion The study developed a robust and interpretable ML model for the early prediction of PSD following TKA. This model can aid in preoperative risk stratification, facilitating personalized management strategies to improve postoperative outcomes. Further validation in larger cohorts and diverse settings is necessary to enhance its broader clinical applicability.
- New
- Research Article
- 10.63313/aerpc.9057
- Nov 5, 2025
- Advances in Engineering Research Possibilities and Challenges
- Peng Xu
Inverse design based on deep learning offers a revolutionary paradigm for accelerating the development of novel terahertz (THz) metamaterials. However, its application is often constrained by the high cost of acquiring large-scale simulation datasets. This work focuses on the high-accuracy inverse design of THz electromagnetically induced transparency (EIT) metamaterials under small dataset conditions, systematically evaluating the effectiveness of different deep learning architectures. To achieve precise inversion from a target spectrum to its physical structure, we constructed a representative dataset through parametric electromagnetic simulation. On this foundation, we built, trained, and compared three neural network models: a Multi-Layer Perceptron (MLP) as the baseline, a residual fully connected network (FC-ResNet) for enhanced deep network training, and a one-dimensional convolutional neural network (1D-CNN) designed for sequential data. The results reveal a key finding: for this inverse design task, the FC-ResNet demonstrated superior predictive performance, achieving a coefficient of determination (R²) of 0.9794 on an independent test set. This significantly outperformed the baseline MLP (0.9438) and surprisingly surpassed the theoretically more suitable 1D-CNN. Further analysis suggests that for the EIT inverse problem investigated here, the prediction relies more on the global morphology and correlation features of the spectrum than on localized characteristics. The FC-ResNet, with its deep architecture and effective residual learning mechanism, successfully captured this complex, non-local mapping relationship. The core contribution of this work is demonstrating that for complex physical problems, a deep, general-purpose model with sufficient expressive power and stable trainability can outperform a more specialized architecture that is prone to information loss. This finding provides a crucial guideline for model selection in the intelligent design of physical devices, particularly in resource-constrained scenarios.
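For readers unfamiliar with the metric quoted above, the coefficient of determination can be sketched as follows. This is a minimal illustration of the standard single-output definition, R² = 1 − SS_res / SS_tot; the abstract does not state how R² was aggregated across multiple structural parameters, so the function name `r_squared` and the sample values are assumptions made purely for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination for one output variable.

    Illustrative helper (not from the paper): R^2 = 1 - SS_res / SS_tot.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical targets and predictions, not data from the study.
    print(r_squared([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

A value of 1 means the predictions match the targets exactly, while 0 means the model does no better than predicting the mean of the targets.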
- New
- Research Article
- 10.3390/diagnostics15212801
- Nov 5, 2025
- Diagnostics
- Chun-You Chen + 12 more
Background/Objectives: Computerized diagnostic algorithms can achieve early detection of acute kidney injury (AKI) only when baseline serum creatinine (SCr) is available. To address this weakness, we sought to construct a machine learning model for AKI diagnosis based on point-of-care clinical features regardless of baseline SCr. Methods: Patients with SCr > 1.3 mg/dL were recruited retrospectively from Wan Fang Hospital, Taipei. Dataset A (n = 2846) was used as the training dataset and Dataset B (n = 1331) as the testing dataset. Point-of-care features, including laboratory data and physical readings, were input to the machine learning models. The repeated machine learning models randomly split Dataset A into 70% for training and 30% for testing over 1000 rounds. The single machine learning models used Dataset A as the training dataset and Dataset B as the testing dataset. A computerized algorithm for AKI diagnosis based on a 1.5× increase in SCr and the clinician’s AKI diagnosis were compared with the machine learning models. Results: On an independent, unbalanced test set (n = 1331), our machine learning models achieved AUROC values ranging from 0.67 to 0.74. A pre-existing computerized algorithm performed best (AUROC = 0.94). Crucially, all machine learning models significantly outperformed the routine clinician’s diagnosis (AUROC ~0.74 vs. 0.53, p < 0.05). For context, a pre-existing computerized algorithm, which requires available baseline SCr data, achieved an AUROC of 0.94 on a relevant subset of the data, highlighting the performance benchmark when baseline data is available. Formal statistical comparisons revealed that the top-performing models (e.g., Random Forest, SVM) were often statistically indistinguishable. Model performance was highly dependent on the test scenario, with precision and F1 scores improving markedly on a balanced dataset. 
Conclusions: In the absence of baseline SCr, machine learning models can diagnose AKI with significantly greater accuracy than routine clinical diagnoses. Our robust statistical analysis suggests that several advanced algorithms achieve a similarly high level of performance.
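The repeated 70%/30% evaluation described in the Methods can be sketched roughly as below. This is only an illustration under stated assumptions: the one-feature "model", the synthetic data, and the names `auroc` and `repeated_holdout` are hypothetical and are not the authors' pipeline, which used point-of-care features and models such as Random Forest and SVM.

```python
import random
from statistics import mean


def auroc(labels, scores):
    """AUROC as the probability that a random positive case scores
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_holdout(features, labels, rounds=1000, train_frac=0.7, seed=0):
    """Repeat a random train/test split and return one test AUROC per round.

    The 'model' is a toy stand-in (an assumption, not the study's method):
    it only learns, from the training split, whether higher feature values
    point toward the positive class.
    """
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    results = []
    for _ in range(rounds):
        rng.shuffle(idx)
        cut = int(train_frac * len(idx))
        train, test = idx[:cut], idx[cut:]
        pos_mean = mean(features[i] for i in train if labels[i] == 1)
        neg_mean = mean(features[i] for i in train if labels[i] == 0)
        sign = 1.0 if pos_mean >= neg_mean else -1.0
        test_labels = [labels[i] for i in test]
        test_scores = [sign * features[i] for i in test]
        results.append(auroc(test_labels, test_scores))
    return results


if __name__ == "__main__":
    # Synthetic, hypothetical data: one feature shifted upward for positives.
    data_rng = random.Random(42)
    labels = [i % 2 for i in range(200)]
    features = [data_rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]
    aucs = repeated_holdout(features, labels, rounds=1000)
    print(f"mean AUROC over {len(aucs)} rounds: {mean(aucs):.3f}")
```

Averaging over many random splits shows how much the estimate varies with the partition, whereas the single Dataset A to Dataset B evaluation measures performance on data the models never saw during development.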