Abstract

In traditional wet-lab experiments, fluorescent proteins are generally used to detect the subcellular localization of proteins. However, this approach is time-consuming and expensive when applied to large-scale biological data. Many computational methods have therefore been developed to identify the subcellular localizations of proteins, and over the last ten years machine learning methods have been widely applied to research problems in bioinformatics. In this work, a Fuzzy Support Vector Machine based on Kernelized Neighborhood Representation (FSVM-KNR) is proposed to predict the subcellular localization of proteins. Proteins are represented by six types of features (PsePSSM, PSSM-DWT, PSSM-AB, PsePP, PP-DWT and PP-AB). A kernel is constructed from each feature type, and the kernels are combined using Kernel Target Alignment-based Multiple Kernel Learning (KTA-MKL). Then, a Kernelized Neighborhood Representation (KNR) algorithm is proposed to filter outliers via fuzzy membership scores. Finally, the membership scores (from KNR) and the integrated kernel (from KTA-MKL) are used to build the FSVM-KNR model. To evaluate the performance of the FSVM-KNR model, we test it on two benchmark datasets of protein subcellular localization. Our method achieves better performance on the two datasets, with average precisions of 0.7108 and 0.6916, respectively. In addition, our model is compared with other FSVM models on eight UCI datasets, where the performance of FSVM-KNR is better or comparable.
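
As a rough illustration of the kernel-combination step, the sketch below computes the kernel-target alignment of each kernel matrix against the ideal kernel y yᵀ and weights the kernels in proportion to their alignment. This proportional weighting is a common heuristic and an assumption here, not necessarily the KTA-MKL formulation used in this work; the function names and the toy matrices are hypothetical, and binary labels in {+1, -1} are assumed.

```python
import math


def frobenius_inner(a, b):
    """Frobenius inner product of two equally sized matrices (lists of rows)."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def alignment(kernel, labels):
    """Kernel-target alignment between `kernel` and the ideal kernel y y^T."""
    target = [[yi * yj for yj in labels] for yi in labels]
    numerator = frobenius_inner(kernel, target)
    denominator = math.sqrt(
        frobenius_inner(kernel, kernel) * frobenius_inner(target, target)
    )
    return numerator / denominator


def combine_kernels(kernels, labels):
    """Weight kernels by normalized alignment (assumed heuristic) and sum them."""
    # Negative alignments are clipped to zero so that all weights stay non-negative.
    scores = [max(alignment(k, labels), 0.0) for k in kernels]
    total = sum(scores)
    if total == 0.0:
        # Fall back to uniform weights if no kernel aligns with the labels.
        weights = [1.0 / len(kernels)] * len(kernels)
    else:
        weights = [s / total for s in scores]
    n = len(labels)
    combined = [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]
    return weights, combined


# Toy data: four samples, two classes, two candidate kernels.
labels = [1, 1, -1, -1]
k_block = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]  # groups by class
k_identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # no class info

weights, combined = combine_kernels([k_block, k_identity], labels)
print(weights)
```

In this toy case the block kernel, which groups samples by class, receives the larger weight. A combined matrix of this kind could then be passed to any SVM implementation that accepts a precomputed kernel, with per-sample fuzzy membership scores supplied as sample weights.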
