Barley (Hordeum vulgare L.) is exposed to various biotic and abiotic stresses, so understanding the gene signatures that respond to stress is crucial. This study uses machine learning to analyze transcriptomic data from 515 RNA-seq profiles across 18 independent studies, covering eleven abiotic and three biotic stress types. After data preprocessing, including quality assessment and batch effect correction, we identified 4,311 genes for further analysis. Feature selection with five weighting algorithms prioritized 400 core genes. Two machine learning models, Random Forest and C4.5, were optimized and evaluated using 10-fold cross-validation. The C4.5 algorithm showed superior accuracy in predicting stress-responsive signatures. Key genes, such as bHLH119 and E3 ubiquitin protein ligase DRIP2, were identified as potential biomarkers. Functional enrichment analysis, conducted through protein-protein interaction networks and Gene Ontology/KEGG pathway analysis, revealed significant involvement in lipid biosynthesis, signal transduction, and defense response processes. These findings highlight the roles of the identified biomarker genes in barley's resilience to stress and provide potential targets for genetic improvement. Future research should validate these biomarkers in different barley cultivars and under field conditions to enhance crop resilience against stressors.