Abstract

Accurately predicting the connectivity pattern of disulfide bridges can significantly reduce the search space and thus help to solve the protein-folding problem. An effective means of predicting disulfide connectivity patterns therefore facilitates estimation of the three-dimensional structure of a protein and its function. To our knowledge, given prior knowledge of the bonding states of cysteines, the highest accuracy rate reported in the literature for predicting the overall disulfide connectivity pattern (Qp) is 74.4% on dataset SP39, which is conventionally adopted for disulfide connectivity prediction. This work presents a novel classifier based on the support vector machine (SVM) that incorporates features from the position-specific scoring matrix (PSSM), normalized bond lengths, the predicted secondary structure of the protein, and indices for the physicochemical properties of amino acids. The SVM is trained to derive the connectivity probabilities of cysteine pairs. Additionally, an evolutionary algorithm called multiple trajectory search (MTS) is integrated with the SVM model to tune the SVM parameters and the window sizes for the above features. The disulfide connectivity pattern is then identified using the maximum weight perfect matching algorithm. Experimental results indicate that the accuracy rate for predicting the overall disulfide connectivity pattern (Qp) reaches 79.8% when tested on the same dataset SP39.
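
The final step described above, choosing a connectivity pattern from pairwise probabilities, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the bonded cysteines are indexed 0..n-1, that the pair probabilities are already available (the values below are made up for illustration), and it replaces a polynomial-time matching algorithm with exhaustive enumeration, which is only practical for small numbers of cysteines. The names `perfect_matchings` and `best_connectivity` are hypothetical.

```python
from typing import Dict, Iterator, List, Tuple

Pair = Tuple[int, int]


def perfect_matchings(nodes: List[int]) -> Iterator[List[Pair]]:
    """Yield every way to split `nodes` into unordered pairs."""
    if not nodes:
        yield []
        return
    first, rest = nodes[0], nodes[1:]
    for i, partner in enumerate(rest):
        # Pair the first node with each possible partner, then recurse
        # on whatever nodes are left.
        remaining = rest[:i] + rest[i + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching


def best_connectivity(prob: Dict[Pair, float], n_cys: int) -> List[Pair]:
    """Return the perfect matching with the largest total pair probability.

    `prob` maps (i, j) with i < j to a connectivity score, for example
    the probability output of a pairwise classifier (an assumption here).
    """
    if n_cys % 2 != 0:
        raise ValueError("a perfect matching needs an even number of cysteines")
    nodes = list(range(n_cys))
    return max(perfect_matchings(nodes),
               key=lambda m: sum(prob[pair] for pair in m))


# Illustrative, made-up probabilities for a protein with four bonded cysteines.
example_prob = {
    (0, 1): 0.2, (0, 2): 0.9, (0, 3): 0.1,
    (1, 2): 0.3, (1, 3): 0.8, (2, 3): 0.4,
}
pattern = best_connectivity(example_prob, 4)
print(pattern)  # [(0, 2), (1, 3)]
```

With four cysteines there are only three candidate patterns, and the sketch picks the one whose pair scores sum highest (0.9 + 0.8). The number of patterns grows quickly with the number of cysteines, so a dedicated maximum weight matching routine would be the natural choice for larger inputs.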
