Hum-PLoc: A novel ensemble classifier for predicting human protein subcellular localization

Kuo-Chen Chou, Hong-Bin Shen

doi:10.1016/j.bbrc.2006.06.059

Abstract

Predicting the subcellular localization of human proteins is challenging, especially when a query protein has no significant homology to proteins of known subcellular location and when many locations must be covered. To tackle this challenge, protein samples are represented by hybridizing the gene ontology (GO) database with amphiphilic pseudo amino acid composition (PseAA). On the basis of this representation, a novel ensemble classifier called “Hum-PLoc” was developed by fusing many basic individual classifiers through a voting system. The “engine” of each basic classifier is the K-nearest neighbor (KNN) rule. As a demonstration, the ensemble classifier was tested on human proteins from the following 12 locations: (1) centriole; (2) cytoplasm; (3) cytoskeleton; (4) endoplasmic reticulum; (5) extracell; (6) Golgi apparatus; (7) lysosome; (8) microsome; (9) mitochondrion; (10) nucleus; (11) peroxisome; (12) plasma membrane. To remove redundancy and homology bias, none of the proteins investigated had ≥25% sequence identity to any other protein in the same subcellular location. The overall success rates obtained by the jackknife cross-validation test and the independent dataset test were 81.1% and 85.0%, respectively, which are more than 50% higher than those obtained by the other existing methods on the same stringent datasets. Furthermore, an analysis is given to show that the high success rate of the new predictor is by no means due to a trivial use of the GO annotations: for proteins annotated “subcellular location unknown” in the Swiss-Prot database, most (more than 99%) of the corresponding GO numbers in the GO database are likewise annotated “cellular component unknown”.
The information and clues for predicting the subcellular locations of proteins are buried in a series of tedious GO numbers, just as they are buried in a pile of complicated amino acid sequences, although in a different manner and at a different “depth”. Digging out the knowledge about their locations requires a sophisticated operation engine. The current predictor is one such engine and has proved to be a very powerful one. The Hum-PLoc classifier is available as a web server at http://202.120.37.186/bioinf/hum.
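
The voting scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the basic classifiers differ, so this sketch assumes they differ only in K, and it uses made-up two-dimensional feature vectors in place of the GO and PseAA representation. The names `knn_predict`, `ensemble_predict`, `jackknife_success_rate`, and `toy_data` are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Label `query` by majority vote among its k nearest training samples."""
    # Rank training samples by Euclidean distance to the query and keep k.
    nearest = sorted(train, key=lambda sample: math.dist(sample[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def ensemble_predict(train, query, ks=(1, 3, 5)):
    """Fuse one KNN classifier per value of k by simple majority vote.

    Assumption: the basic classifiers differ only in k.
    """
    votes = Counter(knn_predict(train, query, k) for k in ks)
    return votes.most_common(1)[0][0]


def jackknife_success_rate(data, ks=(1, 3, 5)):
    """Jackknife (leave-one-out) test: predict each sample from all the others."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # every sample except the i-th
        if ensemble_predict(rest, features, ks) == label:
            correct += 1
    return correct / len(data)


# Toy data (assumption): 2-D vectors standing in for real protein features.
toy_data = [
    ((0.0, 0.1), "nucleus"),
    ((0.2, 0.0), "nucleus"),
    ((0.1, 0.3), "nucleus"),
    ((0.3, 0.2), "nucleus"),
    ((5.0, 5.1), "cytoplasm"),
    ((5.2, 4.9), "cytoplasm"),
    ((4.8, 5.3), "cytoplasm"),
    ((5.1, 5.2), "cytoplasm"),
]

if __name__ == "__main__":
    print(ensemble_predict(toy_data, (0.1, 0.1)))
    print(jackknife_success_rate(toy_data))
```

On real data each sample would carry a high-dimensional GO or PseAA vector and one of the 12 location labels. The jackknife loop stays the same, since it only requires that each protein be predicted from a training set that excludes it.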

Journal: Biochemical and Biophysical Research Communications
Publication Date: Jun 21, 2006
Citations: 282

Similar Papers

Predicting Eukaryotic Protein Subcellular Location by Fusing Optimized Evidence-Theoretic K-Nearest Neighbor Classifiers
Kuo-Chen Chou ... Hong-Bin Shen
Journal of Proteome Research | VOL. 5
14 Jul 2006

Euk-PLoc: an ensemble classifier for large-scale eukaryotic protein subcellular location prediction
H.-B Shen ... J Yang
Amino Acids | VOL. 33
19 Jan 2007

Prediction of Protein Subcellular Multi-Localization Based on the General form of Chou’s Pseudo Amino Acid Composition
Li-Qi Li ... Yuan Zhang
Protein & Peptide Letters | VOL. 19
01 Apr 2012

ProLoc-GO: Utilizing informative Gene Ontology terms for sequence-based prediction of protein subcellular localization
Wen-Lin Huang ... Shinn-Ying Ho
BMC Bioinformatics | VOL. 9
01 Feb 2008
