- New
- Discussion
- 10.34133/hds.0415
- Jan 23, 2026
- Health Data Science
- Gregorio Ferreira + 3 more
- Research Article
- 10.34133/hds.0195
- Jan 1, 2026
- Health Data Science
- Hao Xu + 8 more
Background: Nonsuicidal self-injuries (NSSIs) are an important contributing factor to adolescent suicide, and various shared factors influence the risk of both NSSIs and suicide attempts (SAs). Both are important predictors of suicide and are part of a continuum of suicidal behaviors. Further exploration of the relationship between adolescent NSSI and SA may facilitate suicide prevention efforts. Methods: An online survey was conducted among 9,140 participants. Network analysis methods were used to explore expected influence (EI), bridge expected influence (BEI), edge weights, and differences between adolescents that have and have not attempted suicide (NSSI-SA and NSSI-NoSA, respectively). Results: Of the 9,140 participants, 7,030 completed the questionnaire, yielding a participation rate of 76.91%. Participants with at least one NSSI were retained, with 2,496 (35.50%) included in the network analysis. The strongest EI node for both networks was “emotion regulation strategies” (E = 1.389 and 1.393), and that for BEI was “personal distress” (Interpersonal Reactivity Index—personal distress; E = 0.497 and 0.492). Network comparisons revealed significant differences in NSSI 4 (“intentionally hitting walls, tables, and other hard objects”; E(Δ) = −0.384, P < 0.001), significant differences in BEI with regard to “perspective taking” (Interpersonal Reactivity Index—perspective taking; E(Δ) = −0.215, P < 0.001), and significant differences in edge weights between NSSI 4 and NSSI 5 (“intentionally hurting oneself by hitting with a fist, palm, or hard object”; E(Δr) = −0.173, P < 0.001). Conclusions: Our study suggests that interventions in the form of emotion regulation strategies can alleviate symptoms throughout the entire network. Attention should be paid to instances when NSSI 4 and NSSI 5 behaviors co-occur frequently.
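The two centrality indices named in this abstract can be illustrated with a minimal sketch. It assumes the usual definitions: expected influence (EI) is the sum of a node's signed edge weights, and bridge expected influence (BEI) is the sum of its signed edge weights to nodes in other communities. The toy network, node names, community labels, and edge weights below are hypothetical and are not taken from the study. The abstract does not state whether its E values are raw or standardized, so the numbers here are not comparable to the reported ones.

```python
def expected_influence(weights, node):
    """Sum of signed edge weights attached to `node` (one-step expected influence)."""
    return sum(w for (a, b), w in weights.items() if node in (a, b))


def bridge_expected_influence(weights, node, community):
    """Sum of signed edge weights from `node` to nodes outside its own community."""
    total = 0.0
    for (a, b), w in weights.items():
        if node not in (a, b):
            continue
        other = b if a == node else a
        if community[other] != community[node]:
            total += w
    return total


# Hypothetical undirected toy network: (node, node) -> signed edge weight.
edges = {
    ("NSSI4", "NSSI5"): 0.30,
    ("NSSI4", "distress"): 0.10,
    ("NSSI5", "distress"): -0.05,
    ("distress", "regulation"): 0.25,
}
# Hypothetical community assignment (behaviour items vs. psychological items).
community = {
    "NSSI4": "nssi",
    "NSSI5": "nssi",
    "distress": "psych",
    "regulation": "psych",
}

for name in community:
    print(name,
          round(expected_influence(edges, name), 3),
          round(bridge_expected_influence(edges, name, community), 3))
```

Because the weights are signed, a negative edge lowers both indices, which is what distinguishes expected influence from strength centrality based on absolute weights.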
- Research Article
- 10.34133/hds.0265
- Jan 1, 2026
- Health Data Science
- Hongling Zhu + 16 more
Background: There is a substantial body of research exploring the application of artificial intelligence (AI) in identifying electrocardiogram (ECG) abnormalities related to heart rhythm or conduction with the 12-channel format. However, there is a scarcity of studies focusing on refined differentiation of a series of ECG abnormalities with wide QRS complexes in a simplified channel format. Methods: We constructed an ECG dataset (standard 10-s, 12-channel format) from adult patients from Tongji Hospital of Huazhong University of Science and Technology, Wuhan, China. This dataset consisted of 5 kinds of ECG abnormalities with wide QRS complexes at a normal heart rate (60 to 100 beats per minute), together with normal ECGs. A convolutional neural network was developed to classify these abnormalities. Four-channel (I, II, V1, and V5) and 8-channel (I, II, and V1 to V6) formats, compared to the standard 12-channel format (I, II, III, aVR, aVL, aVF, and V1 to V6), were chosen as the input channel formats of the model. Other unreplicated ECGs from Tongji Hospital (TJ-Test set), annotated by a committee of board-certified cardiologists, served as the test dataset. The F1 score, area under the receiver operating characteristic curve (AUROC), and accuracy were calculated to assess the performance of the model, and were further compared with the diagnoses of 6 ECG cardiologists who were informed that the final objective was classifying among 6 classes with the 12-channel format. In addition, a dataset of 291 ECGs from The First People’s Hospital of Jiangxia District (JX-Test set) and a public dataset of 64 ECGs were used to assess model generalizability. Results: The dataset consisted of 11,808 ECGs from 8,542 patients recorded from 2012 January 1 to 2020 November 30 and was divided into training and validation datasets in a ratio of 9:1. The test dataset included 480 unreplicated ECGs from 480 new adult patients recorded from 2014 January 1 to 2017 November 30. The model showed superior performance in the 8-channel format compared to the 4- and 12-channel formats. With the 8-channel format, the model obtained an accuracy of 95.0%, a mean F1 score of 0.969 (0.943 to 0.997), and a mean AUROC score of 0.997 (0.975 to 1.00), compared to an accuracy of 89.9%, an F1 score of 0.898 (0.863 to 0.932), and an AUROC score of 0.941 (0.918 to 0.963) for physicians assessing the same datasets. The model exhibited a mean F1 score of 0.917 (0.943 to 0.997) and a mean AUROC score of 0.994 (0.975 to 1.00) on the JX-Test set, and mean F1 scores of 0.708 for left bundle branch block and 0.828 for right bundle branch block on the external published validation data, both with the 8-channel format. Conclusion: Our model distinguishes a range of distinct abnormalities, focusing on abnormal morphology of QRS complexes at normal heart rates, with high accuracy, providing a foundation for AI-aided clinical decision-support systems in ECG differential diagnosis.
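The three input formats compared in this abstract differ only in which leads are passed to the model. A minimal sketch of that selection step follows. It assumes a recording stored as one row of samples per lead in the standard 12-lead order. The storage layout, the `select_leads` helper, and the toy recording are hypothetical and do not describe the authors' pipeline.

```python
# Lead names for each input format, as listed in the abstract.
LEADS_12 = ["I", "II", "III", "aVR", "aVL", "aVF",
            "V1", "V2", "V3", "V4", "V5", "V6"]
LEADS_8 = ["I", "II", "V1", "V2", "V3", "V4", "V5", "V6"]
LEADS_4 = ["I", "II", "V1", "V5"]


def select_leads(recording, wanted, available=LEADS_12):
    """Return the rows of `recording` (one row per lead) for the wanted leads, in order."""
    if len(recording) != len(available):
        raise ValueError("recording must have one row per available lead")
    index = {name: i for i, name in enumerate(available)}
    return [recording[index[name]] for name in wanted]


# Hypothetical toy recording: 12 leads x 3 samples, each row filled with its lead index.
toy_recording = [[float(i)] * 3 for i in range(len(LEADS_12))]

eight_channel = select_leads(toy_recording, LEADS_8)
four_channel = select_leads(toy_recording, LEADS_4)
print(len(eight_channel), len(four_channel))
```

The 8-channel format drops leads III, aVR, aVL, and aVF, which are linear combinations of leads I and II (for example, III = II − I), so it discards no independent limb-lead signal.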
- Research Article
- 10.34133/hds.0414
- Dec 28, 2025
- Health Data Science
- M Vijayasimha
- Research Article
- 10.34133/hds.0409
- Dec 22, 2025
- Health Data Science
- Tianqi Chang + 5 more
Background: Cardiovascular diseases (CVDs) continue to be the leading cause of morbidity and mortality worldwide, representing a major health burden. Glycosylation, one of the key posttranslational modifications of proteins, plays an important role in the onset and progression of CVDs. This study employed bibliometric analysis to examine the research on glycosylation and CVDs, aiming to identify the evolution and hotspots in this field. Methods: A total of 1,441 publications published from 2010 January 1 to 2024 December 31 were extracted from the Web of Science Core Collection. The analysis included a visual and descriptive examination of publication trends, countries/regions, institutions, keywords, and references. Results: The United States is the most productive country/region in this field, followed closely by China. The University of Alabama at Birmingham has made the most important contribution to this area. Key research hotspots include “O-GlcNAcylation”, “biomarkers”, “angiogenesis”, “α-dystroglycan”, “potassium channel”, “heart failure”, “gene expression”, “glycosylation”, and “cardiac glycosides”. Conclusion: Research on glycosylation in CVDs has shown a steady increase in recent years, with O-GlcNAcylation playing a pivotal role in this field. This comprehensive bibliometric analysis of glycosylation and CVDs provides researchers with valuable, objective insights to support future investigations.
- Research Article
- 10.34133/hds.0295
- Dec 12, 2025
- Health Data Science
- Abdelghani Halimi + 4 more
Background: Accurate mortality prediction for liver transplant candidates with hepatocellular carcinoma (HCC) remains a critical challenge. Traditional scoring systems, including Child–Pugh, Albumin–Bilirubin, Model for End-Stage Liver Disease (MELD), MELD-Na, MELD 3.0, and Alpha-fetoprotein scores, are widely used but often fail to provide precise risk assessments. This limitation arises from the dual burden of liver dysfunction and tumor progression, which complicates prognosis. Consequently, there is a need for a comprehensive approach addressing both considerations to better manage HCC patients. Methods: We propose an advanced machine learning-based scoring system exploiting Ensemble Learning and SHapley Additive exPlanations (SHAP) for a better understanding of key mortality risk factors. SHAP offers valuable insights into the decision-making process by providing both global and local explanations. By embedding SHAP values in the Uniform Manifold Approximation and Projection space, we perform supervised clustering to infer latent subgroups, providing finer granularity on the contribution of key variables to mortality risk assessment. Results: Our system based on LightGBM outperforms conventional scores while leveraging only 8 relevant variables selected by SHAP analysis. These variables address the challenging dual risk problem set in this work. With supervised clustering, we uncover 7 subgroups showing an increasing mortality risk level and a fine assessment of risk factors’ contribution. Conclusion: In contrast to existing studies, our approach offers an integrative data-driven framework for handling the dual risk challenge set by HCC patients with liver dysfunction. It also provides a valuable tool for a more precise risk evaluation that may guide treatment decisions and help monitor patient progression.
- Research Article
- 10.34133/hds.0377
- Oct 13, 2025
- Health Data Science
- Xinyi Liu + 8 more
Background: China has the largest population with Alzheimer’s disease and related dementias (ADRDs) globally, and rapid population aging is expected to drive a substantial increase in cases. This study projects ADRD prevalence and associated economic burdens across provinces in China from 2025 to 2060. Methods: Using data from the China Health and Retirement Longitudinal Study (CHARLS) supplemented by national demographic and provincial statistics, we projected the prevalence and care costs of ADRD for each of the 31 provinces in China from 2025 to 2060. Cost projections included formal care expenses and informal caregiving valued through replacement cost methods. We conducted uncertainty analysis to provide robust estimates for ADRD prevalence and costs. Results: By 2060, ADRD cases in China are projected to reach approximately 49.89 million, with the highest prevalence and economic burden concentrated in provinces such as Shandong, Sichuan, Jiangsu, Henan, and Guangdong. Formal care costs alone are expected to exceed $1 trillion annually, while the total economic value, including informal caregiving, could surpass $5 trillion. Geographic disparities highlight that Eastern and Central regions, with higher proportions of older adults, will bear disproportionate costs. Informal caregiving is projected to constitute 60% to 80% of total ADRD-related costs. Conclusion: China faces an unprecedented rise in ADRD-related economic burden over the next 4 decades, with substantial regional disparities. Strengthening long-term care infrastructure, expanding financial and social support for caregivers, and implementing regionally tailored healthy aging policies are essential to ensuring equitable and sustainable ADRD care across China.
- Discussion
- 10.34133/hds.0339
- Jul 21, 2025
- Health Data Science
- Jianrong Zhang
- Research Article
- 10.34133/hds.0322
- Jun 18, 2025
- Health Data Science
- Yiming Tao + 5 more
Background: The traditional manual literature screening approach is limited by its time-consuming nature and high labor costs. A pressing issue is how to leverage large language models to enhance the efficiency and quality of evidence-based evaluations of drug efficacy and safety. Methods: This study utilized a manually curated reference literature database (comprising vaccine, hypoglycemic agent, and antidepressant evaluation studies) previously developed by our team through conventional systematic review methods. This validated database served as the gold standard for the development and optimization of LitAutoScreener. Following the PICOS (Population, Intervention, Comparison, Outcomes, Study Design) principles, a chain-of-thought reasoning approach with few-shot learning prompts was implemented to develop the screening algorithm. We subsequently evaluated the performance of LitAutoScreener using 2 independent validation cohorts, assessing both classification accuracy and processing efficiency. Results: For respiratory syncytial virus vaccine safety validation title–abstract screening, our tools based on GPT (GPT-4o), Kimi (moonshot-v1-128k), and DeepSeek (deepseek-chat 2.5) demonstrated high accuracy in inclusion/exclusion decisions (99.38%, 98.94%, and 98.85%, respectively). Recall rates were 100.00%, 99.13%, and 98.26%, with statistically significant performance differences (χ² = 5.99, P = 0.048), where GPT outperformed the other models. Exclusion reason concordance rates were 98.85%, 94.79%, and 96.47% (χ² = 30.22, P < 0.001). In full-text screening, all models maintained perfect recall (100.00%), with accuracies of 100.00% (GPT), 100.00% (Kimi), and 99.45% (DeepSeek). Processing times averaged 1 to 5 s per article for title–abstract screening and 60 s for full-text processing (including PDF preprocessing). Conclusions: LitAutoScreener offers a new approach for efficient literature screening in drug intervention studies, achieving high accuracy and significantly improving screening efficiency.
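The accuracy and recall figures in this abstract compare a tool's include/exclude decisions against a gold-standard reference. A minimal sketch of that comparison follows. The `screening_metrics` function and the five toy decisions are hypothetical and are not drawn from the study's data.

```python
def screening_metrics(predicted, reference):
    """Return (accuracy, recall) for include/exclude decisions given as booleans.

    True means "include". Recall is the share of reference-included
    articles that the tool also included.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    pairs = list(zip(predicted, reference))
    if not pairs:
        raise ValueError("no decisions to evaluate")
    tp = sum(1 for p, r in pairs if p and r)          # correctly included
    tn = sum(1 for p, r in pairs if not p and not r)  # correctly excluded
    fn = sum(1 for p, r in pairs if not p and r)      # wrongly excluded
    if tp + fn == 0:
        raise ValueError("reference contains no included articles")
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn)
    return accuracy, recall


# Hypothetical decisions for five articles.
reference = [True, True, False, False, False]
predicted = [True, True, True, False, False]
accuracy, recall = screening_metrics(predicted, reference)
print(accuracy, recall)
```

In screening, recall is usually the metric to protect, since a wrongly excluded article is lost to the review while a wrongly included one is caught at the next stage.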
- Research Article
- 10.34133/hds.0325
- Jun 17, 2025
- Health Data Science
- Jingyu Wang + 4 more
Background: Recently, several cutting-edge experimental studies have directed chimeric antigen receptor (CAR)-T therapies toward specific renal diseases, revealing substantial renal benefits. Prior to widespread implementation of these animal experiments and potentially clinical trials, it is crucial to assess the renal safety of CAR-T therapies using real-world safety evidence. Methods: We applied 4 algorithms, including disproportionality analysis, to the US Food and Drug Administration Adverse Event Reporting System database to screen for positive signals of acute and chronic renal injury associated with 6 CAR-T therapies. For drugs showing an association with renal injury events, causality was further assessed through Mendelian randomization (MR). Results: Six therapies were evaluated involving a total of 9,770 patients, with only acute kidney injury (AKI) identified as associated with idecabtagene vicleucel treatment using 4 algorithmic thresholds, including disproportionality analysis. Subsequently, MR revealed no causal relationship between the idecabtagene vicleucel target B cell maturation antigen and the risk of AKI (P = 0.576), a finding validated in another independent dataset (P = 0.734). Conclusion: CAR-T therapies do not directly cause renal damage, but adverse renal risks during or after treatment, such as cytokine release syndrome, need to be controlled. Future research efforts should rigorously optimize these aspects to better cater to nephrologists’ requirements.
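The abstract does not name the 4 algorithms it used. As an illustration of disproportionality analysis in general, the sketch below computes the reporting odds ratio (ROR), one widely used measure, from a 2×2 table of spontaneous reports. Choosing the ROR is an assumption, and the report counts are hypothetical, not taken from the study.

```python
import math


def reporting_odds_ratio(a, b, c, d):
    """Return (ROR, lower, upper) with a 95% confidence interval.

    a: reports with the drug of interest and the event of interest
    b: reports with the drug of interest and any other event
    c: reports with all other drugs and the event of interest
    d: reports with all other drugs and any other event
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    ror = (a * d) / (b * c)
    # Standard error of ln(ROR) from the four cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - 1.96 * se)
    upper = math.exp(math.log(ror) + 1.96 * se)
    return ror, lower, upper


# Hypothetical counts for one drug-event pair.
ror, lower, upper = reporting_odds_ratio(a=20, b=80, c=100, d=1900)
print(round(ror, 2), round(lower, 2), round(upper, 2))
```

A common convention flags a signal when the lower bound of the interval exceeds 1. Such a signal indicates disproportionate reporting rather than causation, which is why a separate causal step such as MR is still needed.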