Abstract

Predicting protein subcellular location has attracted much attention in both technical research and practical application over the past decades, since a large body of earlier work established the close relationship between a protein's function and its location, and since the Human Genome Project was successfully completed. With the rapid growth in computing speed, computational intelligence methods now dominate the prediction of protein subcellular location. In our study, we chose the pseudo amino acid (PseAA) model to extract features from the protein primary sequence as the classifier input. Based on the evolutionary fuzzy k-nearest neighbor algorithm (EFKNN), we trained six base classifiers, each with a different k value, since k plays an important role in both training and classification. We then propose a novel ensemble approach, named accumulative vote quantity (AVQ), to integrate the outputs of the six base classifiers. To verify the effectiveness of the proposed method, we used as the training set the benchmark dataset constructed by Jennifer L. Gardy and Fiona S.L. Brinkman in 2006, whose five subcellular locations were taken from Gram-negative bacteria. The jackknife test result on this dataset was 80.0%, which indicates that the proposed method can be considered a powerful prediction tool or, to some extent, a complement to existing prediction methods.
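
The ensemble idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define AVQ precisely, so the sketch assumes that each base classifier contributes its fuzzy class memberships as votes and that the class with the largest accumulated total wins. The base classifier is a plain fuzzy k-nearest neighbor rule with crisp training labels; the evolutionary parameter search of EFKNN and the PseAA feature extraction are omitted. The function names, the fuzzifier `m`, and the toy feature vectors are all hypothetical.

```python
import math
from collections import defaultdict


def fuzzy_knn_memberships(train, query, k, m=2.0):
    """Fuzzy k-NN memberships of `query` for each class.

    `train` is a list of (feature_vector, label) pairs with crisp labels.
    Each of the k nearest neighbors votes for its own label with weight
    1 / distance**(2 / (m - 1)); the weights are normalized to sum to 1.
    """
    neighbors = sorted(
        (math.dist(features, query), label) for features, label in train
    )[:k]
    exponent = 2.0 / (m - 1.0)
    # Guard against a zero distance when the query equals a training point.
    weights = [(1.0 / max(d, 1e-12) ** exponent, label) for d, label in neighbors]
    total = sum(w for w, _ in weights)
    memberships = defaultdict(float)
    for w, label in weights:
        memberships[label] += w / total
    return dict(memberships)


def accumulated_vote_predict(train, query, k_values, m=2.0):
    """Assumed AVQ-style rule: sum the memberships from one fuzzy k-NN
    classifier per k value and return the label with the largest total."""
    accumulated = defaultdict(float)
    for k in k_values:
        for label, u in fuzzy_knn_memberships(train, query, k, m).items():
            accumulated[label] += u
    return max(accumulated, key=accumulated.get)


# Toy two-dimensional feature vectors standing in for PseAA features.
toy_train = [
    ((0.0, 0.0), "cytoplasm"),
    ((0.1, 0.0), "cytoplasm"),
    ((1.0, 1.0), "periplasm"),
    ((1.1, 1.0), "periplasm"),
]
toy_prediction = accumulated_vote_predict(toy_train, (0.2, 0.1), k_values=[1, 2, 3])
```

In the setting of the abstract, `k_values` would hold six distinct values, one per base classifier, and the feature vectors would come from the PseAA model.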
