An Insightful Recollection for Predicting Protein Subcellular Locations in Multi-Label Systems

Kuo-Chen Chou

doi:10.4236/ns.2020.127036

Abstract

This review systematically introduces recent advances in predicting protein subcellular localization in multi-label systems, where the constituent proteins may simultaneously occur at, or move between, two or more location sites and hence have exceptional biological functions worthy of special notice. Each predictor covered in this review has a user-friendly web server, through which most experimental scientists can easily obtain their desired data without working through the complicated mathematics involved.

Highlights

The protein samples in the iLoc- series [49-55] were formulated by incorporating GO (Gene Ontology) information and PSSM (position-specific scoring matrix) information into the general PseAAC (pseudo amino acid composition).
The development of protein subcellular location prediction can be separated into two stages.
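
To make the idea of a PseAAC-style sample formulation more concrete, the following is a minimal sketch of the classic sequence-order form of PseAAC, not the GO- and PSSM-based general form used by the iLoc- series. It rests on several assumptions that do not come from this review: the function name `pseaac`, the parameters `lam` (number of sequence-order tiers) and `w` (weight factor), and the use of a single property scale, here the Kyte-Doolittle hydropathy index, which is swapped in purely for illustration.

```python
from statistics import mean, pstdev
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index (an illustrative choice of property scale).
KYTE_DOOLITTLE: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(scale: Dict[str, float]) -> Dict[str, float]:
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = list(scale.values())
    mu, sd = mean(values), pstdev(values)
    return {aa: (v - mu) / sd for aa, v in scale.items()}


def pseaac(seq: str, lam: int = 2, w: float = 0.05,
           scale: Dict[str, float] = KYTE_DOOLITTLE) -> List[float]:
    """Return a (20 + lam)-dimensional vector for a sequence of standard residues."""
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")
    h = standardize(scale)
    # First 20 components: normalized amino acid occurrence frequencies.
    freqs = [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]
    # Tier-k correlation factor: mean squared property difference
    # between residues that are k positions apart.
    thetas = []
    for k in range(1, lam + 1):
        diffs = [(h[seq[i]] - h[seq[i + k]]) ** 2 for i in range(len(seq) - k)]
        thetas.append(mean(diffs))
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


vector = pseaac("MKTAYIAKQR", lam=3)
```

The toy sequence `"MKTAYIAKQR"` is made up. The first 20 components carry the conventional amino acid composition, and the last `lam` components carry sequence-order information that a plain composition vector would lose.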

Summary

INTRODUCTION

As elucidated in two recent comprehensive review papers [1, 2], developing a truly useful bioinformatics tool requires following Chou’s 5-steps rule [2-36], which consists of five steps: 1) select or construct a valid benchmark dataset to train and test the predictor; 2) represent the samples with an effective formulation that truly reflects their intrinsic correlation with the target to be predicted; 3) introduce or develop a powerful algorithm to conduct the prediction; 4) properly perform cross-validation tests to objectively evaluate the anticipated prediction accuracy; 5) establish a user-friendly web server for the predictor that is accessible to the public. In this respect the rule resembles many machine-learning algorithms, which can be applied in most areas of statistical analysis.
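
Step 4 can be illustrated with a jackknife (leave-one-out) test on a multi-label dataset. The sketch below is only an illustration under stated assumptions: the toy samples, the location names, the nearest-neighbor stand-in for the prediction engine, and the exact-match scoring function are all hypothetical and are not taken from any predictor covered in this review.

```python
from typing import Callable, List, Sequence, Set, Tuple

# A sample is (feature vector, set of location labels); a protein in a
# multi-label system may carry more than one label.
Sample = Tuple[Sequence[float], Set[str]]


def nearest_neighbor_predict(train: List[Sample],
                             query: Sequence[float]) -> Set[str]:
    """Toy prediction engine: copy the label set of the closest training sample."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, labels = min(train, key=lambda s: sq_dist(s[0], query))
    return set(labels)


def jackknife(dataset: List[Sample],
              predict: Callable[[List[Sample], Sequence[float]], Set[str]]
              = nearest_neighbor_predict) -> List[Set[str]]:
    """Predict each sample with a model built from all the other samples."""
    predictions = []
    for i, (features, _) in enumerate(dataset):
        train = dataset[:i] + dataset[i + 1:]  # leave sample i out
        predictions.append(predict(train, features))
    return predictions


def exact_match_rate(dataset: List[Sample],
                     predictions: List[Set[str]]) -> float:
    """Fraction of samples whose predicted label set equals the true set exactly."""
    hits = sum(1 for (_, true), pred in zip(dataset, predictions) if true == pred)
    return hits / len(dataset)


# Hypothetical toy benchmark: two-dimensional features, made-up labels.
toy = [
    ((0.0, 0.0), {"nucleus"}),
    ((0.1, 0.0), {"nucleus"}),
    ((1.0, 1.0), {"cytoplasm", "nucleus"}),
    ((1.1, 1.0), {"cytoplasm", "nucleus"}),
    ((5.0, 5.0), {"membrane"}),
]
preds = jackknife(toy)
rate = exact_match_rate(toy, preds)
```

Because every sample is left out exactly once, the jackknife test gives a single, reproducible result for a given dataset, unlike a random split into subsets. In the toy data the lone `"membrane"` sample has no neighbor with the same label set, so it is the one the stand-in predictor gets wrong.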

PREDICTING SUBCELLULAR LOCALIZATION OF PROTEINS

FOUR SERIES OF PREDICTORS

Benchmark Dataset

Sample Formulation

Operation Engine

Metrics and Cross-Validation

Cross-Validation and Jackknife Test

Web Servers

CONCLUSIONS AND PERSPECTIVE


Journal: Natural Science
Publication Date: Jan 1, 2020
Citations: 1
License type: CC BY 4.0


