Expansins are a family of proteins present in the plant cell wall that play important roles in plant cell growth, emergence of root hairs, meristem function and other developmental processes. A major constraint to rice production is submergence of rice by flash flooding. In our earlier study, we identified 21 novel sequences related to expansin gene families in the genome of indica rice using genome-wide analysis. A computational tool for predicting these expansin genes might significantly enhance rice gene annotation. Two large families of expansin genes have been discovered in plants, namely α-expansins (EXPA) and β-expansins (EXPB). This work presents ExpansinPred, a novel computational method based on radial basis function (RBF) and support vector machine (SVM) classifiers for the prediction of EXPA and EXPB. The experimental data were curated from NCBI and, after redundancy elimination, comprise 24 EXPA and 20 EXPB sequences of indica rice. For the RBF classifier, the optimal window length for a potential expansin was 4 for both EXPA and EXPB, with a prediction accuracy of 100 % for each. For the SVM classifier, the optimal window length was 3 for EXPA and 4 for EXPB, with prediction accuracies of 90 and 100 %, respectively. To evaluate the prediction performance of ExpansinPred, cross-validation, independent dataset validation and jackknife validation were carried out. ExpansinPred was also compared with four other algorithms, namely Naive Bayes, sequential minimal optimization, J48 and random forest. To further demonstrate that a species-specific predictor performs better than a general tool, ExpansinPred was compared with an all-plant tool and was also tested on plants other than rice. The statistical analyses demonstrate that the proposed algorithm is a useful computational tool for rice genome annotation, specifically for predicting the expansin gene family, and can benefit the rice research community.
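As an illustration of the jackknife (leave-one-out) validation mentioned above, the following minimal Python sketch holds out each sequence once, trains on the rest, and reports the fraction predicted correctly. It is not the ExpansinPred implementation. It assumes that "window length" means overlapping subsequences of a fixed length, it substitutes a simple nearest-centroid classifier for the RBF and SVM classifiers, and the sequences in `TOY` are made-up strings rather than real expansin data. All function names are hypothetical.

```python
from collections import Counter


def window_counts(seq, w):
    """Count every overlapping window (substring) of length w in seq."""
    return Counter(seq[i:i + w] for i in range(len(seq) - w + 1))


def centroid(profiles):
    """Average a list of window-count profiles into one mean profile."""
    total = Counter()
    for profile in profiles:
        total.update(profile)
    return {key: count / len(profiles) for key, count in total.items()}


def sq_distance(profile, cen):
    """Squared Euclidean distance between a profile and a centroid."""
    keys = set(profile) | set(cen)
    return sum((profile.get(k, 0) - cen.get(k, 0)) ** 2 for k in keys)


def jackknife_accuracy(samples, w):
    """Leave-one-out: hold out each sample once and train on the rest."""
    correct = 0
    for i, (seq, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        by_label = {}
        for train_seq, train_label in train:
            by_label.setdefault(train_label, []).append(window_counts(train_seq, w))
        centroids = {lab: centroid(ps) for lab, ps in by_label.items()}
        query = window_counts(seq, w)
        # Stand-in classifier (assumption): pick the nearest class centroid.
        predicted = min(centroids, key=lambda lab: sq_distance(query, centroids[lab]))
        correct += predicted == label
    return correct / len(samples)


# Made-up toy sequences for illustration only (not real expansin data).
TOY = [
    ("ACACACACAC", "EXPA"),
    ("CACACACACA", "EXPA"),
    ("ACACACACACAC", "EXPA"),
    ("GTGTGTGTGT", "EXPB"),
    ("TGTGTGTGTG", "EXPB"),
    ("GTGTGTGTGTGT", "EXPB"),
]

if __name__ == "__main__":
    print(jackknife_accuracy(TOY, 4))
```

Running `jackknife_accuracy` over a range of `w` values and keeping the best-scoring one is one plausible way to picture how a window length could be optimized, though the abstract does not state the procedure actually used.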