Abstract

In this paper, we propose a strategy for predicting the subcellular locations of proteins by combining several feature selection methods. First, proteins are encoded by their amino acid composition and physicochemical properties. These features are then ranked by the Minimum Redundancy Maximum Relevance method and further filtered by a feature selection procedure. The Nearest Neighbor Algorithm is used as the prediction model and achieves a correct prediction rate of 70.63%, as evaluated by jackknife cross-validation. The feature selection results also enable us to identify the most important protein properties. The prediction software is publicly available at http://chemdata.shu.edu.cn/sub22/ and may play an important complementary role to the series of web-server predictors summarized recently in a review by Chou and Shen (Chou, K.C., Shen, H.B. Natural Science, 2009, 2, 63-92, http://www.scirp.org/journal/NS/).
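The evaluation loop summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the features have already been ranked (for example by Minimum Redundancy Maximum Relevance, which is not computed here), uses Euclidean distance as an assumed distance measure, and treats the "feature selection procedure" as a simple search over the top-k ranked features. The function names and the toy data are hypothetical.

```python
import math


def euclidean(a, b, features):
    """Distance between two samples using only the given feature indices.
    Euclidean distance is an assumption; the abstract does not name one."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in features))


def jackknife_accuracy(X, y, features):
    """Jackknife (leave-one-out) accuracy of a 1-nearest-neighbor classifier:
    each sample is predicted from all the other samples."""
    correct = 0
    for i in range(len(X)):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: euclidean(X[i], X[j], features),
        )
        if y[nearest] == y[i]:
            correct += 1
    return correct / len(X)


def select_top_k(X, y, ranked_features):
    """Try the top-1, top-2, ... ranked features and keep the smallest
    prefix with the highest jackknife accuracy (hypothetical procedure)."""
    best_k, best_acc = 0, -1.0
    for k in range(1, len(ranked_features) + 1):
        acc = jackknife_accuracy(X, y, ranked_features[:k])
        if acc > best_acc:
            best_k, best_acc = k, acc
    return ranked_features[:best_k], best_acc


# Toy data: feature 0 separates the two classes, feature 1 is misleading.
X = [[0.0, 5.0], [0.1, 1.0], [1.0, 4.9], [1.1, 1.1]]
y = ["a", "a", "b", "b"]
ranked = [0, 1]  # assumed to come from a prior ranking step such as mRMR

selected, accuracy = select_top_k(X, y, ranked)
print(selected, accuracy)
```

On this toy set the search keeps only feature 0, because adding the second feature pulls each sample toward a neighbor of the other class. The same pattern shows how the selected subset points to the most informative protein properties.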
