Abstract
This review systematically introduces recent advances in predicting protein subcellular localization in multi-label systems, where the constituent proteins may simultaneously occur at, or move between, two or more location sites and hence have exceptional biological functions worthy of special notice. Each predictor included in this review has a user-friendly web-server, through which most experimental scientists can easily obtain their desired data without working through the complicated mathematics involved.
Highlights
As elucidated in two recent comprehensive review papers [1, 2], developing a truly useful bioinformatics tool requires following Chou’s 5-steps rule [2–36], which comprises five steps: 1) select or construct a valid benchmark dataset to train and test the predictor; 2) represent the samples with an effective formulation that truly reflects their intrinsic correlation with the target to be predicted; 3) introduce or develop a powerful algorithm to conduct the prediction; 4) properly perform cross-validation tests to objectively evaluate the anticipated prediction accuracy; 5) establish a user-friendly web-server for the predictor that is accessible to the public.
The protein samples in the iLoc- series [49–55] were formulated by incorporating Gene Ontology (GO) information and position-specific scoring matrix (PSSM) information into the general pseudo amino acid composition (PseAAC).
The development of protein subcellular location prediction can be separated into two stages.
Summary
As elucidated in two recent comprehensive review papers [1, 2], developing a truly useful bioinformatics tool requires following Chou’s 5-steps rule [2–36], which comprises five steps: 1) select or construct a valid benchmark dataset to train and test the predictor; 2) represent the samples with an effective formulation that truly reflects their intrinsic correlation with the target to be predicted; 3) introduce or develop a powerful algorithm to conduct the prediction; 4) properly perform cross-validation tests to objectively evaluate the anticipated prediction accuracy; 5) establish a user-friendly web-server for the predictor that is accessible to the public. In this respect the rule resembles many machine-learning algorithms, which can be used in most areas of statistical analysis.
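Steps 1 to 4 of the rule can be illustrated with a minimal Python sketch. It is not the method of any predictor covered in this review. The toy sequences, the location labels, the plain 20-component amino acid composition (a simple stand-in for richer formulations such as PseAAC), the k-nearest-neighbour voting scheme, and all function names are assumptions made purely for illustration. Step 5, the web-server, is omitted.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Step 1: a toy benchmark dataset (hypothetical sequences and location labels).
# A protein may carry more than one label, which makes this a multi-label system.
BENCHMARK = [
    ("MKKLLLAAGG", {"cytoplasm"}),
    ("MKKLLVAAGG", {"cytoplasm", "nucleus"}),
    ("MKRLLLAAGG", {"cytoplasm"}),
    ("WWYYFFHHPP", {"membrane"}),
    ("WWYYFFHHPC", {"membrane", "nucleus"}),
    ("WWYFFFHHPP", {"membrane"}),
]


def composition(seq):
    """Step 2: represent a sequence as its 20-D amino acid composition."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def distance(u, v):
    """Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def predict(query, training, k=3):
    """Step 3: multi-label k-NN; keep labels carried by at least half the neighbours."""
    q = composition(query)
    neighbours = sorted(training, key=lambda s: distance(q, composition(s[0])))[:k]
    votes = Counter(label for _, labels in neighbours for label in labels)
    chosen = {label for label, n in votes.items() if 2 * n >= len(neighbours)}
    # Fall back to the top-voted label so every protein gets at least one location.
    return chosen or {votes.most_common(1)[0][0]}


def leave_one_out(dataset, k=3):
    """Step 4: leave-one-out cross-validation.

    Returns (mean overlap score, exact-match rate), where the overlap score is
    |predicted & true| / |predicted | true| for each protein.
    """
    overlap = exact = 0.0
    for i, (seq, truth) in enumerate(dataset):
        training = dataset[:i] + dataset[i + 1:]  # hold out sample i
        pred = predict(seq, training, k)
        overlap += len(pred & truth) / len(pred | truth)
        exact += pred == truth
    n = len(dataset)
    return overlap / n, exact / n


if __name__ == "__main__":
    mean_overlap, exact_rate = leave_one_out(BENCHMARK)
    print(f"mean overlap: {mean_overlap:.2f}, exact match: {exact_rate:.2f}")
```

Two scores are reported because a multi-label prediction can be partly right: the overlap score gives credit for each correctly predicted site, whereas the exact-match rate counts a protein only when its whole set of locations is predicted correctly.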