Abstract

X-ray crystallography is the most popular approach for analyzing protein 3D structure. However, the success rate of protein crystallization is very low (2-10 percent). To reduce the cost of time and resources, lots of computation-based methods are developed to detect the protein crystallization. Improving the accuracy of predicting protein crystallization is very important for the determination of protein structure by X-ray crystallography. At present, many machine learning methods are used to predict protein crystallization. In this article, we propose a Fuzzy Support Vector Machine based on Linear Neighborhood Representation (FSVM-LNR) to predict the crystallization propensity of proteins. Proteins are represented by three types of features (PsePSSM, PSSM-DWT, MMI-PS), and these features are serially combined and fed into FSVM-LNR. FSVM-LNR can filter outliers by membership score, which is calculated via reconstruction residuals of k nearest samples. To evaluate the performance of our predictive model, we test FSVM-LNR on the datasets of TRAIN3587, TEST3585 and TEST500. Our method achieves better Mathew's correlation coefficient (MCC) on TRAIN3587 (MCC: 0.56) and TEST3585 (MCC: 0.58). Although the performance of independent test is not the best on TEST500, FSVM-LNR also has a certain predictability (MCC: 0.70) in the identification of protein crystallization. The good performance on the datasets proves the effectiveness of our method and the better performance on large datasets further demonstrates the stability and superiority of our method.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.