Abstract

Machine learning (ML) is a potent tool for data mining and predictive analytics in genomic research. However, its application to identifying stress-responsive genes remains underexplored. This study identified distinct variations in the expression patterns of one-to-one homologous genes responding to cold stress in three cotton species: Gossypium hirsutum, Gossypium barbadense, and Gossypium arboreum. To better understand cold-responsive genes, we developed ML predictive models (LightGBM, XGBoost, and Random Forest) using 121 biochemical features. Incorporating these features significantly improved model accuracy. Adding evolutionary information further refined the models, which reached 80.80% accuracy in predicting cold-stress-responsive genes. Notably, models trained on sequence features from G. hirsutum transferred to the closely related species G. barbadense, with accuracies ranging from 78.65% to 83.04%. This research presents a promising workflow for identifying candidate genes for experimental exploration of cold stress responses and establishes a systematic framework for predicting cold-stress-related genes using ML methodologies.
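
The cross-species evaluation described above (fit on one species, score on a related one) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the five numeric features stand in for the 121 biochemical features, the `shift` parameter is a hypothetical proxy for divergence between species, and a simple nearest-centroid classifier is swapped in for LightGBM, XGBoost, and Random Forest so that the example runs with the Python standard library alone. It does not reproduce the study's models or its reported accuracies.

```python
import random


def make_species(n_genes, shift, seed, n_features=5):
    """Build a synthetic per-gene table of (features, label) pairs.

    label 1 marks a hypothetical cold-responsive gene; `shift` offsets every
    feature to mimic divergence between species (an assumption, not real data).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_genes):
        label = rng.randint(0, 1)
        # Responsive genes are drawn around a higher mean in every feature.
        feats = [rng.gauss(2.0 * label + shift, 1.0) for _ in range(n_features)]
        data.append((feats, label))
    return data


def fit_centroids(data):
    """Return the mean feature vector of each class (nearest-centroid model)."""
    sums, counts = {}, {}
    for feats, label in data:
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, feats):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def sqdist(center):
        return sum((a - b) ** 2 for a, b in zip(feats, center))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


def accuracy(centroids, data):
    """Fraction of genes whose predicted label matches the true label."""
    correct = sum(predict(centroids, feats) == label for feats, label in data)
    return correct / len(data)


# "Source" species used for fitting, "target" species used only for scoring.
source = make_species(400, shift=0.0, seed=1)
target = make_species(400, shift=0.2, seed=2)

model = fit_centroids(source)
within = accuracy(model, source)
transfer = accuracy(model, target)
print(f"within-species accuracy: {within:.2%}")
print(f"cross-species accuracy:  {transfer:.2%}")
```

Keeping the target species entirely out of fitting is the point of the design: the second number then measures transferability rather than fit to the training data.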
