Rare microbial taxa as potential drivers of yield variation in sauce-flavor baijiu fermentation: Insights from microecology and machine learning.
- Research Article
14
- 10.1016/j.cub.2024.11.040
- Jan 1, 2025
- Current biology : CB
Number of global change factors alters the relative roles of abundant and rare microbes in driving soil multifunctionality resistance.
- Research Article
11
- 10.3389/fmicb.2022.783371
- May 23, 2022
- Frontiers in Microbiology
The rhizosphere soil microbial community under ice exhibits higher diversity and community turnover in the ice-covered stage. The mechanisms by which community assembly processes shape those patterns are poorly understood in high-latitude wetlands. Based on the 16S rRNA gene and ITS sequencing data, we determined the diversity patterns for the rhizosphere microbial community of two plant species in a seasonally ice-covered wetland, during the ice-covered and ice-free stages. The ecological processes of the community assembly were inferred using the null model at the phylogenetic bins (taxonomic groups divided according to phylogenetic relationships) level. Different effects of ecological processes on rare and abundant microbial sub-communities (defined by the relative abundance of bins) and bins were further analyzed. We found that bacterial and fungal communities had higher alpha and gamma diversity under the ice. During the ice-free stage, the dissimilarity of fungal communities decreased sharply, and the spatial variation disappeared. For the bacterial community, homogeneous selection, dispersal limitation, and ecological drift (undominated processes) were the main processes, and they remained relatively stable across all stages. For the fungal community, during the ice-covered stage, dispersal limitation was the dominant process. In contrast, during the ice-free stage, ecological drift processes were more important in the Scirpus rhizosphere, and ecological drift and homogeneous selection processes were more important in the Phragmites rhizosphere. Regarding the different effects of community assembly processes on abundant and rare microbes, abundant microbes were controlled more by homogeneous selection. In contrast, rare microbes were controlled more by ecological drift, dispersal limitation, and heterogeneous selection, especially bacteria.
This is potentially caused by the low growth rates or the intermediate niche breadths of rare microbes under the ice. Our findings suggest the high diversity of microbial communities under the ice, which deepens our understanding of various ecological processes of community assembly across stages and reveals the distinct effects of community assembly processes on abundant and rare microbes at the bin level.
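The abstract above defines rare and abundant sub-communities by the relative abundance of bins. A minimal sketch of such a partition follows, assuming a hypothetical count table and illustrative cutoffs; the thresholds, the bin names, and the function `partition_by_abundance` are assumptions for illustration and are not taken from the paper.

```python
def partition_by_abundance(counts, abundant_cutoff=0.01, rare_cutoff=0.001):
    """Split taxa into abundant / intermediate / rare by mean relative abundance.

    counts: dict mapping taxon (or bin) name -> list of read counts, one per sample.
    The default cutoffs are illustrative assumptions, not values from the abstract.
    """
    n_samples = len(next(iter(counts.values())))
    # Total reads per sample, used to convert counts to relative abundances.
    totals = [sum(c[i] for c in counts.values()) for i in range(n_samples)]
    groups = {"abundant": [], "intermediate": [], "rare": []}
    for taxon, c in counts.items():
        # Mean relative abundance of this taxon across all samples.
        mean_rel = sum(c[i] / totals[i] for i in range(n_samples)) / n_samples
        if mean_rel >= abundant_cutoff:
            groups["abundant"].append(taxon)
        elif mean_rel < rare_cutoff:
            groups["rare"].append(taxon)
        else:
            groups["intermediate"].append(taxon)
    return groups


# Hypothetical toy table: three bins across two samples (10,000 reads each).
toy_counts = {
    "bin_A": [9949, 9969],
    "bin_B": [50, 30],
    "bin_C": [1, 1],
}
groups = partition_by_abundance(toy_counts)
print(groups)
```

With real data the cutoffs would need to follow the definition used in the study being reproduced, since published thresholds for "rare" vary between papers.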
- Research Article
14
- 10.1016/j.jenvman.2024.123638
- Jan 1, 2025
- Journal of environmental management
Soil carbon fractions drive microbial community assembly processes during forest succession.
- Research Article
17
- 10.1128/aem.01973-22
- Jan 19, 2023
- Applied and Environmental Microbiology
Viruses are widespread in various ecosystems, and they play important roles in regulating the microbial community via host-virus interactions. Recently, metagenomic studies showed that there are extremely diverse viruses in different environments from the ocean to the human gut, but the influences of viral communities on microbial communities are poorly understood, especially in extreme environments. Here, we used metagenomics to characterize microbial communities and viral communities in acid mine drainage (AMD) and evaluated how viruses shape microbial communities constrained by harsh environments. Our results showed that AMD viral communities are significantly associated with the microbial communities, and viral diversity has positive correlations with microbial diversity. The viral community explained more of the variation in microbial community composition than environmental factors in AMD of a polymetallic mine. Moreover, we found that viruses harboring adaptive genes regulate the relative abundance of hosts under the modulation of environmental factors, such as pH. We also observed that viral diversity has significant correlations with the global properties of microbial co-occurrence networks, such as modularity. In addition, the results of null modeling analyses revealed that viruses significantly affect microbial community phylogeny and play important roles in regulating ecological processes of community assembly, such as dispersal limitation and homogenizing dispersal. Together, these results revealed that AMD viruses are critical forces driving microbial network and community assembly via host-virus interactions. IMPORTANCE Viruses as mobile genetic elements play critical roles in the adaptive evolution of their hosts in extreme environments. However, how viruses further influence microbial community structure and assembly is still unclear.
A recent metagenomic study observed diverse viruses unexplored in acid mine drainage, revealing the associations between the viral community and environmental factors. Here, we showed that viruses together with environmental factors can constrain the relative abundance of host and microbial community assembly in AMD of copper mines and polymetallic mines. Our results highlight the importance of viruses in shaping the microbial community from the individual host level to the community level.
- Research Article
68
- 10.1016/j.compag.2021.106632
- Dec 23, 2021
- Computers and Electronics in Agriculture
Identifying causes of crop yield variability with interpretive machine learning
- Research Article
- 10.3390/microorganisms13081911
- Aug 16, 2025
- Microorganisms
Soil microorganisms play an important role in maintaining the functioning of terrestrial ecosystems. Soil microbial communities usually contain both abundant and rare microorganisms. However, in forest ecosystems, the differences in the functions and assembly processes of abundant and rare microbial taxa in soils between planted pure and mixed forests are currently unknown. In this study, four different forest types in the Zhongtiao Mountains were selected, and the diversity and assembly processes of abundant and rare microbial communities in their soils were quantitatively analyzed. The results show that there are differences in the diversity and assembly processes of abundant and rare microorganisms in the four forests. Significant differences in the α-diversity (Shannon index) of abundant bacteria (p = 0.019) and rare fungi (p = 0.049) were obtained in the four forests. The assembly of abundant bacterial and fungal communities in the four forest types was mainly influenced by stochastic processes, the assembly of rare bacterial communities was mainly influenced by deterministic processes, and the assembly of rare fungal communities was influenced by a combination of deterministic and stochastic processes. Planted mixed forests increase the relative contribution of deterministic processes in the assembly of rare fungal communities compared to planted pure forests. This study determined the relative contributions of deterministic and stochastic processes in the assembly of abundant and rare microbial communities among different forest types, providing a theoretical basis for forest management in mixed forests.
- Research Article
1
- 10.1038/s41598-025-03801-2
- Jun 2, 2025
- Scientific Reports
Accurately predicting the estimated ultimate recovery (EUR) of shale gas from a single well is challenging due to geological, engineering, and production factors. Conventional methods often lack sufficient transparency and clarity in the calculation process. As a result, machine learning (ML) algorithms have proven to be an effective alternative. Still, single algorithms are susceptible to outliers or feature selection in the data, leading to unstable predictions. Based on the concept of ensemble learning, this study proposes an intelligent method utilizing automated feature engineering (AutoFE) and stacking ensemble techniques. The method employs Random Forest (RF), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost) as base learners, with Logistic Regression (LR) as the meta-learner. The model is optimized using the Tree-structured Parzen Estimator (TPE) algorithm. The proposed stacking ensemble learning model was validated using a publicly available dataset comprising 506 groups of EUR data of shale gas. The results demonstrate that the proposed stacking ensemble model outperforms individual machine learning models, achieving an R² of 0.9456, an RMSE of 0.7432, and a MAPE as low as 4.36%. Paired t-test results indicate that the use of AutoFE significantly enhances the predictive performance of the model. To enhance the interpretability of the prediction results, the Shapley Additive Explanations (SHAP) technique was employed to conduct an explainable analysis of the machine learning models. This approach revealed the influence trends and magnitudes of reservoir parameters and base learners on the prediction outcomes. The results further indicate that lateral length is the primary factor affecting EUR, followed by proppant loading.
This study accurately predicts shale gas EUR and identifies key factors influencing the prediction results, providing valuable insights for predicting shale gas reservoir parameters and optimizing development plans.
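The abstract above reports R², RMSE, and MAPE for the stacking model. As a rough illustration of how these three metrics are computed, here is a minimal sketch using standard definitions; the function name `regression_metrics` and the toy values are hypothetical and unrelated to the shale gas dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAPE in percent) for paired observations and predictions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    # MAPE is undefined when a true value is zero; the toy data avoids that case.
    mape = 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mape


# Hypothetical observed and predicted values.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]
r2, rmse, mape = regression_metrics(y_true, y_pred)
print(f"R2={r2:.3f} RMSE={rmse:.3f} MAPE={mape:.3f}%")
```

RMSE is in the units of the target, while R² and MAPE are unitless, which is why abstracts of this kind usually report all three together.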
- Research Article
8
- 10.1177/03000605241239013
- Mar 1, 2024
- Journal of International Medical Research
We identified predictive factors and developed a novel machine learning (ML) model for predicting mortality risk in patients with sepsis-associated encephalopathy (SAE). In this retrospective cohort study, data from the Medical Information Mart for Intensive Care IV (MIMIC-IV) and eICU Collaborative Research Database were used for model development and external validation. The primary outcome was the in-hospital mortality rate among patients with SAE; the observed in-hospital mortality rate was 14.74% (MIMIC IV: 1112, eICU: 594). Using variables selected by the least absolute shrinkage and selection operator (LASSO), we built nine ML models and a stacking ensemble model and determined the optimal model based on the area under the receiver operating characteristic curve (AUC). We used the Shapley additive explanations (SHAP) algorithm to interpret the optimal model. The study included 9943 patients. LASSO identified 15 variables. The stacking ensemble model achieved the highest AUC on the test set (0.807) and an AUC of 0.671 on external validation. SHAP analysis highlighted Glasgow Coma Scale (GCS) and age as key variables. The model (https://sic1.shinyapps.io/SSAAEE/) can predict in-hospital mortality risk for patients with SAE. We developed a stacked ensemble model with enhanced generalization capabilities using novel data to predict mortality risk in patients with SAE.
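The abstract above selects the optimal model by AUC. A minimal sketch of that metric follows, using the rank-based (Mann-Whitney) formulation: AUC is the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Count pairwise wins for the positive class; ties contribute 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical outcome labels (1 = died in hospital) and predicted risks.
y_true = [1, 0, 1, 0, 0]
scores = [0.90, 0.40, 0.35, 0.20, 0.80]
auc = roc_auc(y_true, scores)
print(round(auc, 4))
```

This pairwise form is quadratic in the number of cases, so it suits small illustrations; library implementations compute the same quantity from sorted scores.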
- Research Article
9
- 10.1016/j.compbiolchem.2024.108248
- Oct 15, 2024
- Computational Biology and Chemistry
An efficient interpretable stacking ensemble model for lung cancer prognosis
- Research Article
9
- 10.1016/j.jenvman.2025.125478
- May 1, 2025
- Journal of environmental management
Prediction of gully erosion susceptibility through the lens of the SHapley Additive exPlanations (SHAP) method using a stacking ensemble model.
- Research Article
10
- 10.3389/fmars.2022.856126
- May 4, 2022
- Frontiers in Marine Science
Unraveling the assembly mechanism is a core research topic of microbial ecology. Abundant and rare microbial communities are crucial for diversity, function and host health in a given ecosystem, but few studies have focused on their assembly strategies. Here, we explored the microbial diversity of abundant and rare communities of water, shrimp intestine and sediment habitats in shrimp culture ponds. We found that rare operational taxonomic units (OTUs) (6,003, 4,566 and 8,237 in water, intestine and sediment, respectively) were dozens of times more numerous than abundant ones (only 199, 157 and 122, respectively). The community diversity of abundant and rare microbial taxa was markedly different, as was their taxonomic composition. Despite the different diversity, similar abundance-occupancy relationships and biogeographic patterns were observed for the abundant and rare microbial communities, with much stronger distance-decay relationships for the rare community than for the abundant community. Furthermore, stochastic processes dominated the community assembly of both abundant and rare microbial taxa, and deterministic processes contributed more to community variation in rare taxa than in abundant taxa. These findings advance our understanding of the community assembly strategies of abundant and rare microbial taxa and highlight the contributions of abundant and rare microbial communities to aquatic ecosystems, which may improve aquaculture management strategies.
- Research Article
6
- 10.1002/cmdc.202300151
- Jun 30, 2023
- ChemMedChem
Prediction of IDO1 Inhibitors by a Fingerprint-Based Stacking Ensemble Model Named IDO1Stack.
- Research Article
1
- 10.3390/cancers17121974
- Jun 13, 2025
- Cancers
Background: Radiation therapy is a cornerstone treatment modality for brain metastasis. However, it can result in complications like necrosis, which may lead to significant neurological deficits. This study aims to develop and validate an ensemble model with radiomics to predict radiation necrosis. Method: This study retrospectively collected and analyzed MRI images and clinical information from 209 stereotactic radiosurgery sessions involving 130 patients with brain metastasis. An ensemble model integrating gradient boosting, random forest, decision tree, and support vector machine was developed and validated using selected radiomic features and clinical factors to predict the likelihood of necrosis. The model performance was evaluated and compared with other machine learning algorithms using metrics including the area under the curve (AUC), sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). SHapley Additive exPlanations (SHAP) analysis and local interpretable model-agnostic explanations (LIME) analysis were applied to explain the model's prediction. Results: The ensemble model achieved strong performance in the validation cohort, with the highest AUC. It consistently outperformed individual models and the stacking ensemble model. The model demonstrated superior accuracy, generalizability, and reliability in predicting radiation necrosis. SHAP and LIME were used to interpret a complex predictive model for radiation necrosis. Both analyses highlighted similar significant factors, enhancing our understanding of prediction dynamics. Conclusions: The ensemble model using radiomic features exhibited high accuracy and robustness in predicting the occurrence of radiation necrosis. It could serve as a novel and valuable tool to facilitate radiotherapy for patients with brain metastasis.
- Research Article
- 10.1158/1538-7445.am2025-4676
- Apr 21, 2025
- Cancer Research
This study aims to apply an ensemble model integrating MRI-based radiomics and clinical information as a reliable tool for precisely predicting radiation necrosis, a severe consequence of radiation therapy for brain metastases that requires accurate early prediction to improve patient management. We retrospectively collected and analyzed MRI images and clinical information from 209 stereotactic radiosurgery sessions involving 130 patients with brain metastasis. Radiomic features were extracted from MRI using PyRadiomics and selected via L2 regularization and coefficient analysis. An ensemble model integrating gradient boosting, random forest, decision tree, and support vector machine as base regressors was developed using a soft voting approach to generate the final prediction of the likelihood of necrosis. Performance was assessed and compared with other machine-learning algorithms using metrics including the area under the curve (AUC), sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). SHapley Additive exPlanations (SHAP) and local interpretable model-agnostic explanations (LIME) analyses were applied to explain the model's prediction. The soft-voting ensemble model demonstrated strong performance in the validation cohort, achieving the highest AUC of 0.873 (95%CI: 0.672 - 1.000). It consistently outperformed individual models and the stacking ensemble model, exhibiting superior accuracy, generalizability, and reliability in predicting radiation necrosis. Both univariate and multivariate logistic regression analyses were performed and confirmed the model as the strongest predictor. SHAP and LIME analyses were employed to interpret the predictive model for radiation necrosis, identifying metastasis volume and the radiomic feature, log-sigma-1-mm_glcd_ldmn, as key predictors. Both analyses highlighted similar significant factors, enhancing the understanding of prediction dynamics. 
This MRI-based radiomics ensemble model exhibited high accuracy and robustness in predicting radiation necrosis. It has the potential to serve as a novel and valuable tool to facilitate radiotherapy for patients with brain metastasis. Citation Format: Yijun Chen, Corbin A. Helis, Christina K. Cramer, Michael T. Munley, Fei Xing, Qing Lyu, Christopher T. Whitlow, Jeffrey Willey, Michael D. Chan, Yuming Jiang. MRI-based radiomics ensemble model for predicting radiation necrosis in brain metastasis patients treated with stereotactic radiosurgery and immunotherapy [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2025; Part 1 (Regular Abstracts); 2025 Apr 25-30; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2025;85(8_Suppl_1):Abstract nr 4676.
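The abstract above combines four base models through soft voting. A minimal sketch of the idea follows: each model's predicted probability of the positive class is averaged, and the average is thresholded. The probabilities, the 0.5 threshold, and the function name `soft_vote` are hypothetical; the study's actual models and weights are not reproduced here.

```python
def soft_vote(prob_by_model, threshold=0.5):
    """Average per-model positive-class probabilities, then threshold the mean."""
    n_models = len(prob_by_model)
    n_cases = len(next(iter(prob_by_model.values())))
    # Unweighted mean probability per case across all base models.
    avg = [
        sum(p[i] for p in prob_by_model.values()) / n_models
        for i in range(n_cases)
    ]
    labels = [1 if a >= threshold else 0 for a in avg]
    return avg, labels


# Hypothetical necrosis probabilities from four base models for three sessions.
probs = {
    "gradient_boosting": [0.9, 0.2, 0.6],
    "random_forest": [0.8, 0.3, 0.4],
    "decision_tree": [1.0, 0.0, 0.0],
    "svm": [0.7, 0.1, 0.6],
}
avg, labels = soft_vote(probs)
print(avg, labels)
```

In the third case two models vote above 0.5 and two below, so a hard majority vote would tie, whereas the averaged probability resolves it; this is one reason soft voting is often preferred when base models output calibrated probabilities.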
- Research Article
1
- 10.1002/sae2.70064
- Apr 21, 2025
- Journal of Sustainable Agriculture and Environment
Recent studies have highlighted the significant role of tree species' mycorrhizal traits on forest soil microbial communities and their associated ecosystem functions. However, our understanding of how tree species richness in mono‐mycorrhizal (arbuscular mycorrhiza [AM] or ectomycorrhiza [EcM]) or mixed‐mycorrhizal (AM and EcM = AE) stands affects the rooting zone microbial community assembly processes remains limited. We investigated this knowledge gap using the MyDiv tree diversity experiment, which comprises plantings of AM and EcM tree species and their mixture in one‐, two‐, and four‐species plots. Soil microbiomes in the target tree rooting zone were analyzed using meta‐barcoding of the fungal ITS2 and bacterial 16S V4 rRNA regions. We examined the effects of plot mycorrhizal type, tree species identity and richness on microbial diversity, community composition, and microbial community assembly processes. We found that AM plots exhibited higher fungal richness compared to EcM and mixed mycorrhizal type (AE) plots, whereas tree species identity and diversity showed no significant impact on fungal and bacterial alpha diversity within mono and mixed mycorrhizal type plots. The soil fungal community composition was shaped by tree species identity, tree diversity, and plot mycorrhizal type, while bacterial community composition was only affected by tree species identity. EcM tree species significantly impacted both soil fungal and bacterial community compositions. Plot mycorrhizal type and tree species richness displayed interactive effects on the fungal and bacterial community composition, with AM and EcM plots displaying contrasting patterns as tree diversity increased. Our results suggest that both stochastic and deterministic processes shape microbial community assemblage in mono and mixed mycorrhizal type tree communities.
The importance of deterministic processes decreases from AM to EcM plots primarily due to homogeneous selection, while stochastic processes increase, mainly due to dispersal limitation. Stochastic processes affected fungal and bacterial community assembly differently, through dispersal limitation and homogenizing dispersal, respectively. In fungi, the core, intermediate and rare abundance taxa were controlled by both stochastic and deterministic processes, whereas bacterial communities were predominantly shaped by stochastic processes. These findings provide valuable insights into the role of tree species identity, diversity and mycorrhizal type mixture on the soil microbiome community composition and assembly processes, highlighting the differential impacts on core and rare microbial taxa. Understanding the balance between deterministic and stochastic processes can help forest ecosystem management by predicting microbial community responses to land‐use and environmental changes and influencing ecosystem functions critical for ecosystem health and productivity.