Multi-label Learning for Protein Subcellular Location Prediction

Xiao Wang, Jia-Ming Liu, Rui-Wei Zhao, Guo-Zheng Li

doi:10.1109/bibm.2011.36

Abstract

Protein subcellular localization prediction aims to determine the location of a protein within a cell using computational methods. Knowing where a protein resides provides clues to its function and helps in identifying drug targets. Predicting protein subcellular localization is an important but challenging problem, particularly when proteins may simultaneously exist at, or move between, two or more subcellular location sites. Most existing protein subcellular localization methods handle only single-location proteins. To better reflect the characteristics of multiplex proteins, we formulate the prediction of subcellular localization of multiplex proteins as a multi-label learning problem. We present and compare two multi-label learning approaches, one exploiting correlations between labels and the other leveraging label-specific features, to induce a high-quality prediction model. Experimental results on six protein data sets covering various organisms show that the described methods achieve significantly higher performance than existing methods. Among the multi-label learning methods, we find that those exploiting label correlations perform better than those leveraging label-specific features.
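To make the multi-label formulation concrete, the following minimal sketch shows how a multiplex protein's set of locations can be encoded as a binary indicator vector, and how pairwise label co-occurrence, one simple signal of label correlation, can be counted. The protein identifiers, location names, and helper functions are hypothetical and chosen only for illustration; this is not the paper's data or method.

```python
from itertools import combinations

# Hypothetical toy annotations: each protein maps to a SET of locations,
# which is what distinguishes multi-label from single-label prediction.
proteins = {
    "P1": {"nucleus", "cytoplasm"},
    "P2": {"cytoplasm"},
    "P3": {"nucleus", "cytoplasm"},
    "P4": {"mitochondrion"},
}

# Fixed, sorted label vocabulary shared by all proteins.
labels = sorted(set().union(*proteins.values()))


def to_indicator(locations, labels):
    """Encode a set of locations as a 0/1 vector aligned with `labels`."""
    return [1 if label in locations else 0 for label in labels]


def cooccurrence(proteins, labels):
    """Count how often each pair of labels is assigned to the same protein."""
    counts = {pair: 0 for pair in combinations(labels, 2)}
    for locations in proteins.values():
        for pair in combinations(sorted(locations), 2):
            counts[pair] += 1
    return counts


# Multi-label target matrix: one indicator vector per protein.
Y = {pid: to_indicator(locs, labels) for pid, locs in proteins.items()}
co = cooccurrence(proteins, labels)

print(labels)
print(Y)
print(co)
```

In this toy data the nucleus and cytoplasm labels co-occur on two proteins, the kind of dependency that a correlation-aware method could exploit and that independent per-label classifiers would ignore.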
