Cyanobacteria have developed acclimation strategies to adapt to harsh environments, making them useful model organisms for studying stress responses. Understanding the molecular mechanisms of tolerance to abiotic stresses can help elucidate how cells change their gene expression patterns in response to stress. Recent advances in sequencing techniques and bioinformatics analysis methods have led to the discovery of many stress-responsive genes in diverse organisms. Synechocystis sp. PCC 6803 is a suitable microorganism for studying the transcriptome response to environmental stress. Therefore, for the first time, we employed two effective feature selection techniques, namely support vector machine recursive feature elimination (SVM-RFE) and the least absolute shrinkage and selection operator (LASSO), to pinpoint the crucial genes responsive to environmental stresses in Synechocystis sp. PCC 6803. We applied these machine learning algorithms to transcriptomic data of Synechocystis sp. PCC 6803 collected under distinct conditions, encompassing light, salt, and iron stress. Seven candidate genes, namely sll1862, slr0650, sll0760, slr0091, ssl3044, slr1285, and slr1687, were selected by both the LASSO and SVM-RFE algorithms. RNA-seq analysis was then performed to validate how effectively our feature selection approach identifies the most important genes. The RNA-seq analysis revealed significantly high expression of five genes, namely sll1862, slr1687, ssl3044, slr1285, and slr0650, under ion stress conditions. Among these five genes, ssl3044 and slr0650 could be proposed as new potential candidate genes for further confirmatory genetic studies to determine their roles in the response to abiotic stresses.
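The consensus step described above, keeping only genes chosen by both LASSO and SVM-RFE, can be illustrated with a minimal standard-library sketch. It is not the authors' pipeline. The toy expression matrix, the binary stress/control labels, the gene indices, and all hyperparameters (`alpha`, `lam`, `lr`, `keep`) are assumptions made for illustration, and the linear SVM is a simplified subgradient-descent variant without a bias term.

```python
import random

def make_toy_expression(n=60, p=8, seed=0):
    """Hypothetical data: genes 0 and 1 respond to stress, the rest are noise."""
    rng = random.Random(seed)
    y = [1 if i < n // 2 else -1 for i in range(n)]  # +1 stress, -1 control
    X = []
    for label in y:
        row = [rng.gauss(0, 1) for _ in range(p)]
        row[0] += 2.0 * label   # up-regulated under stress
        row[1] -= 2.0 * label   # down-regulated under stress
        X.append(row)
    return X, y

def standardize(X):
    """Scale each column (gene) to zero mean and unit variance."""
    n, p = len(X), len(X[0])
    cols = []
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
        cols.append([(v - mean) / sd for v in col])
    return [[cols[j][i] for j in range(p)] for i in range(n)]

def lasso_select(X, y, alpha=0.05, sweeps=200):
    """Coordinate-descent LASSO; returns indices with non-zero coefficients."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw, starting from w = 0
    for _ in range(sweeps):
        for j in range(p):
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            if z == 0:
                continue  # constant gene carries no information
            # Covariance of gene j with the residual excluding its own term
            rho = sum(X[i][j] * (resid[i] + w[j] * X[i][j]) for i in range(n)) / n
            # Soft-thresholding shrinks small coefficients exactly to zero
            new_w = max(abs(rho) - alpha, 0.0) * (1 if rho > 0 else -1) / z
            delta = new_w - w[j]
            if delta:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                w[j] = new_w
    return {j for j in range(p) if abs(w[j]) > 1e-8}

def linear_svm_weights(X, y, cols, lam=0.1, lr=0.1, epochs=300):
    """Full-batch subgradient descent on the L2-regularised hinge loss."""
    n = len(X)
    w = {j: 0.0 for j in cols}
    for _ in range(epochs):
        grad = {j: lam * w[j] for j in cols}
        for i in range(n):
            margin = y[i] * sum(w[j] * X[i][j] for j in cols)
            if margin < 1:  # only margin violators contribute to the hinge term
                for j in cols:
                    grad[j] -= y[i] * X[i][j] / n
        for j in cols:
            w[j] -= lr * grad[j]
    return w

def svm_rfe_select(X, y, keep=3):
    """Recursive feature elimination: drop the gene with the smallest w^2."""
    cols = list(range(len(X[0])))
    while len(cols) > keep:
        w = linear_svm_weights(X, y, cols)
        cols.remove(min(cols, key=lambda j: w[j] ** 2))
    return set(cols)

X_raw, y = make_toy_expression()
X = standardize(X_raw)
lasso_genes = lasso_select(X, y)
svm_genes = svm_rfe_select(X, y)
consensus = sorted(lasso_genes & svm_genes)  # genes chosen by both methods
print("LASSO:", sorted(lasso_genes))
print("SVM-RFE:", sorted(svm_genes))
print("Consensus:", consensus)
```

Intersecting the two selections is a conservative choice: a gene must survive both the sparsity penalty of LASSO and the weight-based ranking of SVM-RFE, which tends to filter out features that only one method picks up by chance. On real transcriptomic data, an established library implementation with cross-validated hyperparameters would normally replace these hand-written routines.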