Articles published on Machine-learning Random Forest Model
27 Search results
- Research Article
- 10.1007/s11121-026-01888-1
- Mar 3, 2026
- Prevention science : the official journal of the Society for Prevention Research
- Treena Becker + 1 more
Given the conceptual issues involved in defining and measuring recovery and accordingly substance use disorder (SUD) treatment outcomes, the role of each state's treatment system and social factors, the objective is to examine underlying and interrelated patterns within SUD treatment, outcomes, and recovery. Using a recovery-oriented framework, a Machine Learning Random Forest model was developed to analyze publicly funded SUD treatment services across the United States. The aim was to predict the 10 most important features that increase the likelihood of positive treatment outcomes, defined as less substance use (SU) or abstinence. Over 78% of SUD treatment services were provided to individuals either with Medicaid coverage or were uninsured. The most important feature identified was the number of days in treatment, regardless of setting. The second most important feature was the state and whether various treatment services were available. The third and fourth ranked features were the type of treatment at discharge and at admission, respectively. Housing status, SU self-help group participation, and employment were lower ranked. Referral source was the tenth ranked feature. The length of time in SUD treatment is consistent with the clinical perspective of the individual seeking treatment and continuing in care and recovery support. Individuals in Medicaid-funded treatment live in poverty, with peer support and community who have the least resources to support their recovery journey. States that prioritize behavioral health should coordinate to increase the availability of higher-cost, longer-duration treatment services across state lines, to states with low availability.
- Research Article
- 10.1002/hed.70066
- Oct 7, 2025
- Head & neck
- Abdullah A Memon + 19 more
Head and neck squamous cell carcinoma (HNSCC) is an aggressive malignancy, with 50% of patients recurring. A subset of patients experience rapid recurrence (RR) postoperatively but prior to adjuvant therapy. This study identifies factors associated with RR and additional recurrence intervals: short-interval recurrence (SIR) and standard recurrence (SR). Retrospective 10-year review of 246 HNSCC patients undergoing surgery with adjuvant therapy. Recurrence was categorized as RR (prior to initiation of adjuvant therapy), SIR (≤ 6 months post-adjuvant therapy), and SR (> 6 months post-adjuvant therapy). Univariate analysis (UVA), multivariate analysis (MVA), and machine learning Random Forest models were employed to identify predictors of each recurrence interval. Of the 246 patients, 89 recurred (45 SR, 27 SIR, 17 RR). On MVA, skin invasion (OR = 3.492, p = 0.039) was a unique predictor of RR. Random Forest feature importance also revealed skin invasion, along with nodal status, tobacco pack-years, and tumor size as predictors with strong performance (accuracy 93%, AUC 0.96, F1 0.93). Skin invasion is a unique independent predictor of RR, confirmed by two statistical models. These patients warrant further study.
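The accuracy and F1 figures quoted in this abstract are both derived from a confusion matrix. The sketch below shows that arithmetic on made-up binary labels; the label vectors and function names are illustrative assumptions and are not taken from the study.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Share of all cases that were classified correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels: 1 = recurrence, 0 = no recurrence.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

print(accuracy(y_true, y_pred), f1_score(y_true, y_pred))
```

With imbalanced classes, as in a cohort where only a minority recur, accuracy alone can look high while F1 stays modest, which is one reason abstracts often report both.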
- Research Article
- 10.1016/j.athoracsur.2025.09.009
- Oct 1, 2025
- The Annals of thoracic surgery
- Jonathan Afoke + 13 more
Shifting Paradigms: Exercise Testing as a Metric of Long-Term Success in Surgery for Ebstein Anomaly.
- Research Article
- 10.3390/info16080650
- Jul 30, 2025
- Information
- Paolo Fantozzi + 4 more
Electronic voting allows people to participate more easily in their country’s electoral events. Nevertheless, its adoption is still far from widespread. In this paper, we provide a detailed survey of the state of adoption worldwide and investigate which socio-economic factors may influence such an adoption. Its usage is wider in North and South America, while remaining considerably lower in Europe and Asia and practically absent in Africa. We distinguish between e-voting, which maintains the traditional polling station structure while adding technological components, and i-voting, which enables remote participation from any location using personal devices. Five factors (country’s surface and population, Gross Domestic Product, Internet Usage, and Democracy Index) are investigated to predict adoption, and an accuracy of over 79% is achieved through a machine learning random forest model. Larger, wealthier, and more democratic countries are typically associated with a larger adoption of internet voting.
- Research Article
- 10.3389/fphar.2024.1486346
- Dec 12, 2024
- Frontiers in Pharmacology
- Jiahui Zhang + 3 more
Objective: Intra-abdominal candidiasis (IAC) is difficult to predict in elderly septic patients with intra-abdominal infection (IAI). This study aimed to develop and validate a nomogram based on lymphocyte subtyping and clinical factors for the early and rapid prediction of IAC in elderly septic patients. Methods: A prospective cohort study of 284 consecutive elderly patients diagnosed with sepsis and IAI was performed. We assessed the clinical characteristics and parameters of lymphocyte subtyping at the onset of IAI. A machine-learning random forest model was used to select important variables, and multivariate logistic regression was used to analyze the factors influencing IAC. A nomogram model was constructed, and the discrimination, calibration, and clinical effectiveness of the model were verified. Results: According to the results of the random forest and multivariate analyses, gastrointestinal perforation, renal replacement therapy (RRT), T-cell count, CD28+CD8+ T-cell count and CD38+CD8+ T-cell count were independent predictors of IAC. Using the above parameters to establish a nomogram, the area under the curve (AUC) values of the nomogram in the training and testing cohorts were 0.840 (95% CI 0.778-0.902) and 0.783 (95% CI 0.682-0.883), respectively. The AUC in the training cohort was greater than the Candida score [0.840 (95% CI 0.778-0.902) vs. 0.539 (95% CI 0.464-0.615), p < 0.001]. The calibration curve showed good predictive values and observed values of the nomogram; the DCA results showed that the nomogram had high clinical value. Conclusion: We established a nomogram based on the T-cell count, CD28+CD8+ T-cell count, CD38+CD8+ T-cell count and clinical risk factors that can help clinical physicians quickly rule out IAC or identify elderly patients at greater risk for IAC at the onset of infection. Clinical Trial Registration: [chictr.org.cn], identifier [ChiCTR2300069020].
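The AUC values reported for the nomogram can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the scores and labels are invented for illustration and the function name is an assumption, not the authors' code.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (positive, negative) pairs in which the positive
    case has the higher score; tied scores count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks and outcomes (1 = event, 0 = no event).
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 0, 1, 0, 0]

print(auc(scores, labels))
```

The double loop is quadratic in cohort size, which is acceptable for a few hundred patients; rank-based formulas give the same value more efficiently on large data sets.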
- Research Article
1
- 10.5993/ajhb.48.5.12
- Oct 30, 2024
- American Journal of Health Behavior
- Ofra Walter + 2 more
Objectives: We investigated the impact of temporal focus, and emotional and spiritual intelligence on the well-being of emerging adults in Israel's Palestinian minority population in a time of war. Methods: There were 194 Palestinian students enrolled in higher education in Israel who participated in the study. A machine-learning random forest model was employed to explore the interaction of predictors using traditional linear regression and a regression tree. Results: High emotional intelligence and present temporal focus were linked with elevated satisfaction with life. High past temporal focus and older variables were associated with low levels of satisfaction with life. We found no statistically significant differences by time of questionnaire completion (before or after the outbreak of war). Conclusions: For the Palestinian minority in Israel, personal indicators of agency were predictive of life satisfaction, but advent of war had no significant effect on any of these indicators.
- Research Article
4
- 10.1016/j.heliyon.2024.e31643
- May 28, 2024
- Heliyon
- Sang-Hyeon Jin + 9 more
This study analyzed spatiotemporal variation and long-term trends in water quality indicators and trophic state conditions in an Asian temperate reservoir, Juam Reservoir (JR), and developed models that forecast algal chlorophyll (CHL-a) over a period of 30 years, 1993–2022. The analysis revealed that there were longitudinal gradients in water quality indicators along the reservoir, with notable influences from tributaries and seasonal variations in nutrient regimes and suspended solids. The empirical model showed that phosphorus was the key determinant of algal biomass, while suspended solids played a significant role in regulating water transparency. The trophic state indices indicated varying levels of trophic status, ranging from mesotrophic to eutrophic. Eutrophic states were particularly observed in zones after the summer monsoons, indicating a heightened risk of algal blooms, which were more prevalent in flood years. The analysis of trophic state index deviation suggested that phosphorus availability strongly influences the reservoir trophic status, with several episodes of non-algal turbidity at each site during the monsoon. Increases in non-algal turbidity were more prevalent during the monsoon in flood years. This study also highlighted overall long-term trends in certain water quality parameters, albeit with indications of shifting pollution sources towards non-biodegradable organic matter. According to the machine learning tests, a random forest (RF) model strongly predicted CHL-a (R² = 0.72, p < 0.01), except for algal biomass peaks (>60 μg/L), compared to all other models. Overall, our research suggests that CHL-a and trophic variation are primarily regulated by the monsoon intensity and predicted well by the machine learning RF model.
- Research Article
3
- 10.1002/ehf2.14816
- May 9, 2024
- ESC Heart Failure
- Song Li + 8 more
Aims: It is unclear whether activated partial thromboplastin time (aPTT) or anti‐Xa is more accurate for monitoring heparin anticoagulation in mechanical circulatory support (MCS) patients. This study investigates the relationship between aPTT and anti‐Xa in MCS patients and identifies predictors of discordance. Methods and results: aPTT and anti‐Xa were simultaneously measured in a prospective cohort of MCS patients receiving unfractionated heparin at a tertiary academic medical centre. Therapeutic aPTT and anti‐Xa levels were 60–100 s and 0.3–0.7 IU/mL, respectively, and concordance was defined as both levels being subtherapeutic, therapeutic, or supratherapeutic. To identify predictors of discordance, both a machine learning random forest model and a multivariate regression model were applied to patient demographics, device type, and 14 laboratory variables; 23 001 pairs of simultaneously measured aPTT/anti‐Xa were collected from 699 MCS patients. aPTT and anti‐Xa were concordant in 35.5% of paired observations and discordant in 64.5% (aPTT > antiXa 61.5%; aPTT < antiXa 3.0%). Discordance with a high aPTT relative to anti‐Xa (aPTT > antiXa) was associated with high INR, eGFR, and total bilirubin, as well as low platelets, haemoglobin, pre‐albumin, white blood cell count, and haptoglobin. Total artificial heart and durable ventricular assist devices were more likely to be associated with aPTT > anti‐Xa than temporary MCS devices. Conclusions: aPTT and anti‐Xa were frequently discordant in MCS patients receiving heparin anticoagulation. Clinical conditions common in MCS patients such as concurrent warfarin use, malnutrition, haemolysis, and thrombocytopenia, as well as durable type of MCS devices were associated with a high aPTT relative to anti‐Xa.
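The concordance definition in this abstract lends itself to a small worked example. The sketch below places each measurement in a band using the therapeutic ranges stated above and then compares the two bands. Treating the range endpoints as therapeutic, and reading "aPTT > anti-Xa" as "aPTT falls in a higher band than anti-Xa", are assumptions made for illustration; the function and constant names are hypothetical.

```python
APTT_RANGE = (60.0, 100.0)   # seconds, therapeutic range from the abstract
ANTI_XA_RANGE = (0.3, 0.7)   # IU/mL, therapeutic range from the abstract

# Ordering of the three bands, lowest to highest.
ORDER = {"subtherapeutic": 0, "therapeutic": 1, "supratherapeutic": 2}


def band(value, low, high):
    """Classify a measurement relative to its therapeutic range.

    Endpoints are counted as therapeutic (an assumption; the abstract
    does not say how boundary values were handled).
    """
    if value < low:
        return "subtherapeutic"
    if value > high:
        return "supratherapeutic"
    return "therapeutic"


def compare_pair(aptt, anti_xa):
    """Label a simultaneous aPTT / anti-Xa pair as concordant or discordant."""
    a = ORDER[band(aptt, *APTT_RANGE)]
    x = ORDER[band(anti_xa, *ANTI_XA_RANGE)]
    if a == x:
        return "concordant"
    return "aPTT > anti-Xa" if a > x else "aPTT < anti-Xa"


print(compare_pair(80, 0.5))    # both therapeutic
print(compare_pair(120, 0.5))   # aPTT supratherapeutic, anti-Xa therapeutic
```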
- Research Article
- 10.1158/1538-7445.advbc23-a066
- Feb 1, 2024
- Cancer Research
- Rosalyn W Sayaman + 13 more
Abstract Background: Machine learning (ML) in translational medicine has led to prediction of clinical outcomes and identification of new biomarkers. We employ ML in prediction of pathologic complete response (pCR) in high-risk breast cancer patients in the neoadjuvant I-SPY2 TRIAL where not all novel agents have strong predictive biomarkers. Leveraging a ML approach using progressively expanded candidate genes, we explore the limitations of using only known mechanisms of action in predicting pCR, and the extent to which biology outside known drug action improves response prediction in the first 10 arms of the trial. Methods: ML random forest models were developed in I-SPY2 patients (n=982) with pre-treatment gene expression and pCR data across 10 treatment arms (PMID: 35623341), including inhibitors of HER2: neratinib (N), pertuzumab (P), TDM1/P; AKT (MK-2206); IGF1R (ganitumab); HSP90 (ganetespib); PARP/DNA repair (veliparib/carboplatin, VC); ANG1/2 (trebananib, T); immune checkpoints (PD1-inh); and Control (Ctr). Each HR/HER2 receptor/treatment arm subset (m=27) was evaluated independently. We employed a three-pronged feature-selection approach using (1) genes restricted to known mechanism of action of individual I-SPY2 agents (k=10 to 88 genes); (2) genes expanded to include targeted pathways for all 10 agents/combinations (k=282); and (3) an unbiased whole genome approach (k=17,990). Samples were partitioned with 75% used for training and cross-validation, and 25% held out as test sets. Predictive ML models were defined as those with performance ≥ 0.90 based on different performance metrics (e.g., AUC, sensitivity, specificity). Results: For each of the 27 subtype-treatment subsets, at least one high performing model was identified. 
In 6 subtype-treatment subsets, mechanism of action genes were sufficient to predict pCR: AKT/PI3K/HER genes in HR+HER2- N and HR-HER2+ P; DNA repair genes in HR+HER2- VC; angiogenesis-associated genes in HR+HER2+ T; and immune-associated genes in both HR+HER2- and HR-HER2- PD1-inh subsets. Expanded targeted pathway models were required to identify predictive models in 8 additional subtype-treatment pairs from the N, T-DM1/P, MK-2206, VC, T, and HER2+ Ctr arms, with significant contribution of DNA repair, immune, and HSP90 genes for multiple arms. A genome-wide approach was required for the remaining 13 subtype-treatment pairs with no previous models from the N, P, MK-2206, ganitumab, ganetespib, T, and HER2- Ctr arms. Even for subtype-treatment pairs where mechanism of action gene sets was sufficient for reasonable models, expanded gene sets resulted in improved performance. For instance, metabolism genes improved model performance for HR-HER2+ in N and Ctr, and for HR+HER2- in the PD1-inh arm; and mitochondrial and protein folding dysfunction genes improved response prediction in HR-HER2- in the ganetespib arm. Conclusion: Our study identifies mechanism of action biomarkers associated with response to each drug and elucidates possible off-target effects contributing to observed drug sensitivity and resistance. Citation Format: Rosalyn W. Sayaman, Denise M. Wolf, Christina Yau, Julia Wulfkhule, Emanuel F. Petricoin, Lamorna Brown-Swigart, Tam Binh Bui, Gillian L. Hirst, Diane Heditsian, W. Fraser Symmans, Angela DeMichele, Mark LaBarge, Laura J. Esserman, Laura van ‘t Veer. Machine learning elucidates biology of response within and outside the mechanisms of action of therapeutic agents in the I-SPY2 breast cancer TRIAL [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Advances in Breast Cancer Research; 2023 Oct 19-22; San Diego, California. Philadelphia (PA): AACR; Cancer Res 2024;84(3 Suppl_1):Abstract nr A066.
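The 75%/25% partition described in the Methods can be sketched with the standard library alone. The abstract does not say whether the split was stratified by outcome or how it was seeded, so the plain shuffled split, the seed, and the function name below are all assumptions; the abstract also states that each subtype-treatment subset was evaluated independently, which would mean applying such a split per subset rather than once to all 982 patients.

```python
import random


def train_test_partition(sample_ids, train_fraction=0.75, seed=0):
    """Shuffle sample IDs reproducibly and split them into train/test lists.

    The training part would then be used for model fitting and
    cross-validation; the held-out part is only touched for final testing.
    """
    ids = list(sample_ids)
    rng = random.Random(seed)      # local RNG so global state is untouched
    rng.shuffle(ids)
    cut = int(train_fraction * len(ids))
    return ids[:cut], ids[cut:]


# Illustration on 982 placeholder IDs (the cohort size from the abstract).
train, test = train_test_partition(range(982))
print(len(train), len(test))
```

Keeping the held-out quarter completely separate from feature selection matters here, because with up to 17,990 candidate genes any selection step that sees the test samples would inflate the reported performance.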
- Research Article
13
- 10.1016/j.uclim.2023.101790
- Dec 21, 2023
- Urban Climate
- Mohammad Saleh Ali-Taleshi + 2 more
Meteorologically normalized spatial and temporal variations investigation using a machine learning-random forest model in criteria pollutants across Tehran, Iran
- Research Article
23
- 10.1021/acs.est.3c08076
- Dec 20, 2023
- Environmental Science & Technology
- Yingyu Bao + 5 more
Assessing the impacts of cumulative anthropogenic disturbances on estuarine ecosystem health is challenging. Using spatially distributed sediments from the Pearl River Estuary (PRE) in southern China, which are significantly influenced by anthropogenic activities, we demonstrated that metagenomics-based surveillance of benthic microbial communities is a robust approach to assess anthropogenic impacts on estuarine benthic ecosystems. Correlational and threshold analyses between microbial compositions and environmental conditions indicated that anthropogenic disturbances in the PRE sediments drove the taxonomic and functional variations in the benthic microbial communities. An ecological community threshold of anthropogenic disturbances was identified, which delineated the PRE sediments into two groups (H and L) with distinct taxa and functional traits. Group H, located nearshore and subjected to a higher level of anthropogenic disturbances, was enriched with pollutant degraders, putative human pathogens, fecal pollution indicators, and functional traits related to stress tolerance. In contrast, Group L, located offshore and subjected to a lower level of anthropogenic disturbances, was enriched with halotolerant and oligotrophic taxa and functional traits related to growth and resource acquisition. The machine learning random forest model identified a number of taxonomic and functional indicators that could differentiate PRE sediments between Groups H and L. The identified ecological community threshold and microbial indicators highlight the utility of metagenomics-based microbial surveillance in assessing the adverse impacts of anthropogenic disturbances in estuarine sediments, which can assist environmental management to better protect ecosystem health.
- Research Article
40
- 10.1016/j.cgh.2023.09.023
- Oct 5, 2023
- Clinical gastroenterology and hepatology : the official clinical practice journal of the American Gastroenterological Association
- Sushrut Jangi + 6 more
Dynamics of the Gut Mycobiome in Patients With Ulcerative Colitis
- Research Article
3
- 10.3389/fmed.2023.1158005
- May 22, 2023
- Frontiers in Medicine
- Caoyang Fang + 5 more
This study aimed to investigate the predictive value of a clinical nomogram model based on serum YKL-40 for major adverse cardiovascular events (MACE) during hospitalization in patients with acute ST-segment elevation myocardial infarction (STEMI). In this study, 295 STEMI patients from October 2020 to March 2023 in the Second People's Hospital of Hefei were randomly divided into a training group (n = 206) and a validation group (n = 89). A machine learning random forest model was used to select important variables, and multivariate logistic regression was used to analyze the influencing factors of in-hospital MACE in STEMI patients; a nomogram model was constructed and the discrimination, calibration, and clinical effectiveness of the model were verified. According to the results of random forest and multivariate analysis, we identified serum YKL-40, albumin, blood glucose, hemoglobin, LVEF, and uric acid as independent predictors of in-hospital MACE in STEMI patients. Using the above parameters to establish a nomogram, the model C-index was 0.843 (95% CI: 0.79-0.897) in the training group; the model C-index was 0.863 (95% CI: 0.789-0.936) in the validation group, with good predictive power; the AUC (0.843) in the training group was greater than the TIMI risk score (0.648), p < 0.05; and the AUC (0.863) in the validation group was greater than the TIMI risk score (0.795). The calibration curve showed good predictive values and observed values of the nomogram; the DCA results showed that the graph had a high clinical application value. In conclusion, we constructed and validated a nomogram based on serum YKL-40 to predict the risk of in-hospital MACE in STEMI patients. This model can provide a scientific reference for predicting the occurrence of in-hospital MACE and improving the prognosis of STEMI patients.
- Research Article
4
- 10.7150/jca.79593
- Jan 1, 2023
- Journal of Cancer
- Victor Chun-Lam Wong + 5 more
Purpose: This study aims to develop liquid biopsy assays for early HCC diagnosis and prognosis. Methods: Twenty-three microRNAs were first consolidated as a panel (HCCseek-23 panel) based on their reported functions in HCC development. Serum samples were collected from 103 early-stage HCC patients before and after hepatectomy. Quantitative PCR and machine learning random forest models were applied to develop diagnostic and prognostic models. Results: For HCC diagnosis, the HCCseek-23 panel demonstrated 81% sensitivity and 83% specificity for identifying early-stage HCC; it showed 93% sensitivity for identifying alpha-fetoprotein (AFP)-negative HCC. For HCC prognosis, the differential expressions of 8 microRNAs (HCCseek-8 panel: miR-145, miR-148a, miR-150, miR-221, miR-223, miR-23a, miR-374a, and miR-424) were significantly associated with disease-free survival (DFS) (Log-rank test p-value = 0.001). Further model improvement using the HCCseek-8 panel in combination with serum biomarkers (i.e. AFP, ALT, and AST) demonstrated a significant association with DFS (Log-rank p-value = 0.011 and Cox proportional hazards analyses p-value = 0.002). Conclusion: To the best of our knowledge, this is the first report to integrate circulating miRNAs, AST, ALT, AFP, and machine learning for predicting DFS in early HCC patients undergoing hepatectomy. In this setting, the HCCseek-23 panel is a promising circulating microRNA assay for diagnosis, while the HCCseek-8 panel is promising for prognosis to identify early HCC recurrence.
- Research Article
3
- 10.1007/s00374-022-01671-8
- Oct 11, 2022
- Biology and Fertility of Soils
- Loretta G Garrett + 3 more
The fertiliser growth response of planted forests can vary due to differences in site-specific factors like climate and soil fertility. We identified when forest stands responded to a standard, single application of nitrogen (N) fertiliser and employed a machine learning random forest model to test the use of natural abundance stable isotopic N (δ15N) to predict site response. Pinus radiata growth response was calculated as the change in periodic annual increment of basal area (PAI BA) from replicated control and treatment (~ 200 kg N ha−1) plots within trials across New Zealand. Variables in the analysis were climate, silviculture, soil, and foliage chemical properties, including natural abundance δ15N values as integrators of historical patterns in N cycling. Our Random Forest model explained 78% of the variation in growth with tree age and the δ15N enrichment factor (δ15Nfoliage − δ15Nsoil) showing more than 50% relative importance to the model. Tree growth rates generally decreased with more negative δ15N enrichment factors. Growth response to N fertiliser was highly variable. If a response was going to occur, it was most likely within 1–3 years after fertiliser addition. The Random Forest model predicts that younger stands (< 15 years old) with the freedom to grow and sites with more negative δ15N isotopic enrichment factors will exhibit the biggest growth response to N fertiliser. Supporting the challenge of forest nutrient management, these findings provide a novel decision-support tool to guide the intensification of nutrient additions.
- Research Article
3
- 10.1038/s41598-022-16451-5
- Jul 21, 2022
- Scientific Reports
- Toyoshi Inoguchi + 15 more
This study aimed to develop a simplified model for predicting end-stage kidney disease (ESKD) in patients with diabetes. The cohort included 2549 individuals who were followed up at Kyushu University Hospital (Japan) between January 1, 2008 and December 31, 2018. The outcome was a composite of ESKD, defined as an eGFR < 15 mL/min/1.73 m², dialysis, or renal transplantation. The mean follow-up was 5.6 ± 3.7 years, and ESKD occurred in 176 (6.2%) individuals. Both a machine learning random forest model and a Cox proportional hazard model selected eGFR, proteinuria, hemoglobin A1c, serum albumin levels, and serum bilirubin levels, in descending order, as the most important predictors among 20 baseline variables. A model using eGFR, proteinuria and hemoglobin A1c showed relatively good performance in discrimination (C-statistic: 0.842) and calibration (Nam and D’Agostino χ² statistic: 22.4). Adding serum albumin and bilirubin levels to the model further improved it, and a model using 5 variables showed the best predictive performance (C-statistic: 0.895, χ² statistic: 7.7). The accuracy of this model was validated in an external cohort (n = 5153). This novel simplified prediction model may be clinically useful for predicting ESKD in patients with diabetes.
- Research Article
8
- 10.1186/s12890-022-01972-6
- May 4, 2022
- BMC Pulmonary Medicine
- Xiao-Hui Yang + 6 more
Background: Altered metabolic pathways have recently been considered as potential drivers of idiopathic pulmonary fibrosis (IPF) for the study of drug therapeutic targets. However, our understanding of the metabolite profile during IPF formation is lacking. Methods: To comprehensively characterize the metabolic disorders of IPF, a mouse IPF model was constructed by intratracheal injection of bleomycin into C57BL/6J male mice, and lung tissues from IPF mice at 7 days, 14 days, and controls were analyzed by pathology, immunohistochemistry, and Western Blots. Meanwhile, serum metabolite detections were conducted in IPF mice using LC–ESI–MS/MS, KEGG metabolic pathway analysis was applied to the differential metabolites, and biomarkers were screened using machine learning algorithms. Results: We analyzed the levels of 1465 metabolites and found that more than one-third of the metabolites were altered during IPF formation. There were 504 and 565 metabolites that differed between M7 and M14 and controls, respectively, while 201 differential metabolites were found between M7 and M14. In IPF mouse sera, about 80% of differential metabolite expression was downregulated. Lipids accounted for more than 80% of the differential metabolite species with down-regulated expression. The KEGG pathway enrichment analysis of differential metabolites was mainly enriched to pathways such as the metabolism of glycerolipids and glycerophospholipids. Eight metabolites were screened by a machine learning random forest model, and receiver operating characteristic curves (ROC) assessed them as ideal diagnostic tools. Conclusions: In conclusion, we have identified disturbances in serum lipid metabolism associated with the formation of pulmonary fibrosis, contributing to the understanding of the pathogenesis of pulmonary fibrosis.
- Research Article
14
- 10.3390/d14030207
- Mar 11, 2022
- Diversity
- Gian Maria Niccolò Benucci + 4 more
The process of fermenting tofu extends back thousands of years and is an indispensable part of Chinese culture. Despite a cultural resurgence in fermented foods and interest in microbiomes, there is little knowledge on the microbial diversity represented in fermented ‘hairy’ tofu, known locally in China as Mao tofu. High-throughput metagenomic sequencing of the ITS, LSU and 16S rDNA was used to determine Mao tofu’s fungal and bacterial community diversity across four wet markets in Yunnan, China. The results show that hairy tofu in this region consists of around 170 fungal and 365 bacterial taxa, and that microbial taxa differ between markets. Diversity also differed based on the specific niche of the tofu block, comparing the outside rind-like niche to that of the inside of the tofu block. Machine learning random forest models were able to accurately classify both the market and niche of sample origin. An over-abundance of yeast and Geotrichum was found, and Mucor (Mucoromycota) was abundant in the outside rind-like niche, which consists of the visible ‘hairy’ mycelium. The majority of the bacterial OTUs belonged to Proteobacteria, Firmicutes, and Bacteroidetes, with Acinetobacter, Lactobacillus, Sphingobacterium and Flavobacterium the most abundant genera. Putative fungal pathogens of plants (Cercospora, Diaporthe, Fusarium) and animals (Metarhizium, Entomomortierella, Pyxidiophora, Candida, Clavispora) were also detected, as were putative bacterial pathogens identified as Legionella. Non-fungal eukaryotic taxa detected by LSU amplicon sequencing included soybean (Glycine max), Protozoa, Metazoa (e.g., Nematoda and Platyhelminthes), Rhizaria and Chromista, indicating that additional biodiversity exists in the hairy tofu microbiome.
- Research Article
6
- 10.3390/rs13173481
- Sep 2, 2021
- Remote Sensing
- Tao Chen + 9 more
The CO2 efflux from forest soil (FCO2) is one of the largest components of the global carbon cycle. Accurate estimation of FCO2 can help us better understand the carbon cycle in forested areas and precisely predict future climate change. However, the scarcity of field-measured FCO2 data in the subtropical forested area greatly limits our understanding of FCO2 dynamics at regional and global scales. This study used an automatic cavity ring-down spectrophotometer (CRDS) analyzer to measure FCO2 in a typical subtropical forest of southern China in the dry season. We found that the measured FCO2 at two experimental areas experienced similar temporal trends in the dry season and reached the minima around December, whereas the mean FCO2 differed apparently across the two areas (9.05 vs. 5.03 g C m−2 day−1) during the dry season. Moreover, we found that both abiotic (soil temperature and moisture) and biotic (vegetation productivity) factors are significantly and positively correlated, respectively, with the FCO2 variation during the study period. Furthermore, a machine-learning random forest model (RF model) that incorporates remote sensing data is developed and used to predict the FCO2 pattern in the subtropical forest, and the topographic effects on spatiotemporal patterns of FCO2 were further investigated. The model evaluation indicated that the proposed model illustrated high prediction accuracy for the training and testing dataset. Based on the proposed model, the spatiotemporal patterns of FCO2 in the forested watershed that encloses the two monitoring sites were mapped. Results showed that the spatial distribution of FCO2 is obviously affected by topography: the high FCO2 values mainly occur in relatively high altitudinal areas, in slopes of 10–25°, and in sunny slopes. The results emphasized that future studies should consider topographical effects when simulating FCO2 in subtropical forests. 
Overall, our study unraveled the spatiotemporal variations of FCO2 and their driving factors in a subtropical forest of southern China in the dry season, and demonstrated that the proposed RF model in combination with remote sensing data can be a useful tool for predicting FCO2 in forested areas, particularly in subtropical and tropical forest ecosystems.
- Research Article
4
- 10.1097/qad.0000000000002830
- May 1, 2021
- AIDS (London, England)
- Shi Chen + 5 more
Machine learning has the potential to help researchers better understand and close the gap in HIV care delivery in large metropolitan regions such as Mecklenburg County, North Carolina, USA. We aim to identify important risk factors associated with delayed linkage to care for HIV patients with novel machine learning models and identify high-risk regions of the delay. Deidentified 2013-2017 Mecklenburg County surveillance data in eHARS format were requested. Both univariate analyses and a machine learning random forest model (developed in R 3.5.0) were applied to quantify associations between delayed linkage to care (>30 days after diagnosis) and various risk factors for individual HIV patients. We also aggregated linkage to care by zip codes to identify high-risk communities within the county. Types of HIV-diagnosing facility significantly influenced time to linkage; first diagnosis in hospital was associated with the shortest time to linkage. HIV patients with lower CD4+ cell counts (<200/µl) were twice as likely to link to care within 30 days as those with higher CD4+ cell counts. The random forest model achieved high accuracy (>80% without CD4+ cell count data and >95% with CD4+ cell count data) in predicting risk of delay in linkage to care. We also identified the top high-risk zip codes for delayed linkage. The findings helped public health teams identify communities at high risk of delay in the HIV care continuum across Mecklenburg County. The methodology framework can be applied to other regions facing an HIV epidemic and the challenge of delayed linkage to care.
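The zip-code aggregation mentioned in this abstract can be illustrated in a few lines. The sketch below flags a record as delayed when linkage took more than 30 days after diagnosis, the threshold given above, and computes the delayed share per zip code. The record layout, the placeholder zip labels, and the function name are assumptions for illustration and do not reflect the eHARS format or the authors' R code.

```python
from collections import defaultdict

DELAY_THRESHOLD_DAYS = 30  # linkage later than this counts as delayed


def delayed_share_by_zip(records):
    """Return {zip_code: fraction of patients with delayed linkage}.

    `records` is an iterable of (zip_code, days_from_diagnosis_to_linkage).
    """
    totals = defaultdict(int)
    delayed = defaultdict(int)
    for zip_code, days in records:
        totals[zip_code] += 1
        if days > DELAY_THRESHOLD_DAYS:
            delayed[zip_code] += 1
    return {z: delayed[z] / totals[z] for z in totals}


# Hypothetical records with placeholder zip labels.
records = [
    ("zip_A", 12), ("zip_A", 45),
    ("zip_B", 31), ("zip_B", 90),
    ("zip_C", 30),
]
shares = delayed_share_by_zip(records)

# Rank areas from highest to lowest delayed share.
ranked = sorted(shares, key=shares.get, reverse=True)
print(ranked)
```

In practice, zip codes with very few diagnoses would produce unstable shares, so a minimum-count filter or interval estimate would be a sensible addition before labelling an area high-risk.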