Abstract
Knowledge of subnuclear localization in eukaryotic cells is indispensable for understanding the biological function of the nucleus, genome regulation and drug discovery. In this study, a new feature representation was proposed by combining the position specific scoring matrix (PSSM) and auto covariance (AC). The AC variables describe the neighboring effect between two amino acids, so they incorporate sequence-order information; the PSSM describes the evolutionary information of proteins. Based on this new descriptor, a support vector machine (SVM) classifier was built to predict subnuclear localization. To evaluate the power of our predictor, a benchmark dataset containing 714 proteins localized in nine subnuclear compartments was used. The total jackknife cross-validation accuracy of our method is 76.5%, which is higher than those of Nuc-PLoc (67.4%), OET-KNN (55.6%), AAC-based SVM (48.9%) and ProtLoc (36.6%). The prediction software used in this article and the details of the SVM parameters are freely available at http://chemlab.scu.edu.cn/predict_SubNL/index.htm, and the dataset used in our study is from Shen and Chou's work and can be downloaded at http://chou.med.harvard.edu/bioinf/Nuc-PLoc/Data.htm.
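To make the AC step concrete, the following is a minimal sketch of how auto covariance variables could be computed from a PSSM. It assumes one common definition of AC (the mean product of column-centered scores at two positions separated by a given lag); the abstract does not state the exact formula or normalization used, so the function name `pssm_auto_covariance` and the `max_lag` parameter are illustrative assumptions rather than the authors' implementation.

```python
def pssm_auto_covariance(pssm, max_lag):
    """Turn an L x C PSSM (list of rows) into a fixed-length AC vector.

    Assumed definition, for column j and lag g:
        AC(j, g) = 1/(L-g) * sum_i (P[i][j] - mean_j) * (P[i+g][j] - mean_j)
    The result has C * max_lag values, independent of sequence length L.
    """
    length = len(pssm)
    if length == 0 or not 1 <= max_lag < length:
        raise ValueError("max_lag must satisfy 1 <= max_lag < sequence length")
    n_cols = len(pssm[0])

    # Per-column mean over all sequence positions.
    means = [sum(row[j] for row in pssm) / length for j in range(n_cols)]

    features = []
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            # Sum of centered products for residues `lag` positions apart.
            total = sum(
                (pssm[i][j] - means[j]) * (pssm[i + lag][j] - means[j])
                for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features
```

For a real protein the PSSM would have 20 columns (one per amino acid), so a maximum lag of 10, for example, would give a 200-dimensional vector regardless of sequence length. That fixed length is what allows proteins of different sizes to be passed to a classifier such as an SVM.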
Highlights
The cell nucleus is a complex and important subcellular organelle in eukaryotic cells.
As elucidated in [14] and demonstrated by Eq.50 of [7], among the three cross-validation methods, the jackknife test is deemed the most objective because it always yields a unique result for a given benchmark dataset, and it has been increasingly used by investigators to examine the accuracy of various predictors.
Compared to the existing methods, our classifier, which combines the position specific scoring matrix (PSSM) and auto covariance (AC), further improves the prediction accuracy of protein subnuclear localization.
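The jackknife test mentioned above can be sketched as a leave-one-out loop: each protein is held out in turn, the predictor is trained on the remaining ones, and the held-out protein is then classified. The sketch below is illustrative only. The names `jackknife_accuracy` and `nearest_neighbor` are hypothetical, and a simple 1-nearest-neighbor rule stands in for the SVM used in the study so that the example needs no external library.

```python
def jackknife_accuracy(features, labels, predict_one):
    """Leave-one-out accuracy over parallel lists of feature vectors and labels.

    predict_one(train_x, train_y, query) must return a predicted label.
    Every sample is left out exactly once, so no random split is involved
    and the result is unique for a given dataset.
    """
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict_one(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


def nearest_neighbor(train_x, train_y, query):
    """Stand-in classifier (NOT the paper's SVM): label of the closest sample."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    best = min(range(len(train_x)),
               key=lambda k: squared_distance(train_x[k], query))
    return train_y[best]


# Toy usage with made-up one-dimensional features and labels.
toy_features = [[0.0], [0.1], [1.0], [1.1]]
toy_labels = ["a", "a", "b", "b"]
toy_accuracy = jackknife_accuracy(toy_features, toy_labels, nearest_neighbor)
```

Because the procedure has no random component, two investigators applying the same predictor to the same benchmark dataset obtain the same accuracy, which is the property the highlight refers to.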
Summary
The cell nucleus is a complex and important subcellular organelle in eukaryotic cells. It organizes the comprehensive assembly of our genes and their corresponding regulatory factors [1]. Knowledge of protein subnuclear localization is desirable for an in-depth understanding of cellular biological processes and genomic regulation. It is therefore of great practical significance to develop computational approaches for identifying protein subnuclear localization in the cell nucleus. PsePSSM was proposed by Shen and Chou in order to incorporate the evolutionary information of proteins [44]. They built a web server called Nuc-PLoc for predicting protein subnuclear localization by fusing PseAA composition and PsePSSM, with promising prediction results. The results indicate that our method achieves higher accuracy than the existing methods for predicting protein subnuclear localization.