Abstract
Background: We present a novel method of protein fold decoy discrimination using machine learning, specifically neural networks. Decoy discrimination is framed as a machine learning problem in which neural networks learn the native-like features of protein structures from a set of positive and negative training examples. A set of native protein structures provides the positive training examples, while the negative training examples are simulated decoy structures obtained by reversing the sequences of native structures. Various features are extracted from this training dataset and used as inputs to the neural networks.

Results: The best-performing neural network uses input information comprising PSI-BLAST [1] profiles of residue pairs, pairwise distances, and the relative solvent accessibilities of the residues. This neural network is the best among all methods tested at discriminating the native structure from a set of decoys, for all decoy datasets tested.

Conclusion: The method is demonstrated to be viable; furthermore, evolutionary information is successfully used in the neural networks to improve decoy discrimination.
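The negative-example construction described in the abstract can be illustrated with a minimal sketch. It assumes that "reversing the sequence" means keeping the native backbone coordinates fixed and assigning the residues in reverse order; the function name, the one-letter sequence string, and the coordinate tuples are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Coord = Tuple[float, float, float]


def reversed_sequence_decoy(sequence: str, coords: List[Coord]) -> Tuple[str, List[Coord]]:
    """Return a simulated decoy: the same coordinates with the sequence reversed.

    Assumption: one coordinate (e.g. a C-alpha position) per residue.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coordinates must have the same length")
    # Coordinates are copied unchanged; only the residue identities are reversed.
    return sequence[::-1], list(coords)


def training_examples(sequence: str, coords: List[Coord]):
    """Pair the native structure (label 1) with its reversed-sequence decoy (label 0)."""
    native = (sequence, list(coords), 1)
    decoy_seq, decoy_coords = reversed_sequence_decoy(sequence, coords)
    return [native, (decoy_seq, decoy_coords, 0)]
```

Under this reading, every native structure yields exactly one positive and one negative example, so the training set is balanced by construction.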
Highlights
We present a novel method of protein fold decoy discrimination using machine learning, specifically neural networks
The NN-solvpairndist method performs slightly better than the NN-dist method, while the K Nearest Neighbours method (K = 10) has an overall Z score which is slightly higher than the NN-solvpairndist method
We have demonstrated the viability of using machine learning, specifically neural networks, to perform decoy discrimination
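The highlights compare methods by Z score but this excerpt does not define it. The sketch below uses one common convention, offered as an assumption and not as the paper's definition: the native structure's score minus the mean decoy score, divided by the population standard deviation of the decoy scores, with higher scores meaning more native-like. The function name is hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence


def native_z_score(native_score: float, decoy_scores: Sequence[float]) -> float:
    """Z score of the native structure relative to a set of decoy scores.

    Assumption: higher scores are more native-like, so a large positive
    Z score means the native structure is well separated from the decoys.
    """
    sigma = pstdev(decoy_scores)  # population standard deviation of the decoys
    if sigma == 0:
        raise ValueError("decoy scores have zero variance; Z score is undefined")
    return (native_score - mean(decoy_scores)) / sigma
```

For example, `native_z_score(5.0, [1.0, 3.0])` evaluates to 3.0, since the decoys have mean 2.0 and standard deviation 1.0.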
Summary
We present a novel method of protein fold decoy discrimination using machine learning, specifically neural networks. Decoy discrimination is framed as a machine learning problem in which neural networks learn the native-like features of protein structures from a set of positive and negative training examples. Various features are extracted from this training dataset and used as inputs to the neural networks. Protein structure prediction aims to bridge the gap between the number of known protein sequences and the number of sequences with experimentally determined structures. One advantage of computational protein structure prediction is that accurate in silico protein modelling can help guide the more expensive experimental efforts in protein structure determination. If a target sequence has templates in the structure databases, comparative modelling and fold recognition methods are used to select the templates. Large numbers of