Abstract
Prediction of protein residue contacts, even at the coarse-grain level, can help in finding solutions to the protein structure prediction problem. Unlike α-helices, which are locally stabilized, β-sheets result from pairwise hydrogen bonding of two or more disjoint regions of the protein backbone. The problem of predicting contacts among β-strands in proteins has been addressed by several supervised computational approaches. Recently, prediction of residue contacts based on correlated mutations has improved greatly and now allows the prediction of protein 3D structures. In this article, we describe BCov, the first unsupervised method to predict β-sheet topology starting from the protein sequence and its secondary structure. BCov uses sparse inverse covariance estimation to define β-strand partner scores. An optimization based on integer programming is then carried out to predict the β-sheet connectivity. When tested on the prediction of β-strand pairing, BCov achieves average Matthews Correlation Coefficient (MCC) and F1 values of 0.56 and 0.61, respectively, on a non-redundant dataset of 916 protein chains with structures known at atomic resolution. Our approach compares well with the state-of-the-art methods trained so far for this specific task. The method is freely available under the General Public License at http://biocomp.unibo.it/savojard/bcov/bcov-1.0.tar.gz. The new dataset BetaSheet1452 can be downloaded at http://biocomp.unibo.it/savojard/bcov/BetaSheet1452.dat.
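For readers unfamiliar with the two evaluation measures quoted above, the following is a minimal sketch of how MCC and F1 can be computed for strand-pairing predictions on a single chain. It assumes that every unordered pair of distinct strands counts as one binary case; the function name `pairing_scores` and the toy strand pairs are hypothetical and are not taken from the BCov distribution, whose exact evaluation protocol may differ.

```python
from math import sqrt


def pairing_scores(true_pairs, predicted_pairs, n_strands):
    """Return (MCC, F1) for predicted strand pairs against the true pairs.

    Assumption: each unordered pair of distinct strands is one binary
    classification case (paired / not paired).
    """
    true_set = {frozenset(p) for p in true_pairs}
    pred_set = {frozenset(p) for p in predicted_pairs}

    total = n_strands * (n_strands - 1) // 2   # all possible strand pairs
    tp = len(true_set & pred_set)              # correctly predicted pairs
    fp = len(pred_set - true_set)              # predicted but not observed
    fn = len(true_set - pred_set)              # observed but not predicted
    tn = total - tp - fp - fn                  # correctly left unpaired

    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return mcc, f1


# Toy example: four strands forming the sheet 0-1-2-3, one wrong prediction.
mcc, f1 = pairing_scores(
    true_pairs=[(0, 1), (1, 2), (2, 3)],
    predicted_pairs=[(0, 1), (1, 2), (0, 3)],
    n_strands=4,
)
print(round(mcc, 3), round(f1, 3))  # 0.333 0.667
```

The averages reported in the abstract would then correspond to the mean of such per-chain values over the dataset, under the assumption stated above.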
Highlights
In this article, we describe BCov, which is the first unsupervised method to predict the β-sheet topology starting from the protein sequence and its secondary structure
β-Sheets are widespread motifs of local structure found in over 80% of the protein structures presently available in the Protein Data Bank. β-Sheets are generated by the pairing of two or more β-strands held together by characteristic patterns of hydrogen bonds running in a parallel or antiparallel fashion (Zhang and Kim, 2000)
We describe BCov, a new approach for β-sheet topology prediction based on sparse inverse covariance estimation and integer programming
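To make the optimization step concrete, here is a toy sketch of choosing strand pairings from partner scores under a connectivity constraint. It is not the BCov formulation: exhaustive search stands in for an integer programming solver, the cap of two partners per strand is an assumed constraint, and the function name `best_pairing` and the score values are hypothetical.

```python
from itertools import combinations


def best_pairing(scores, max_partners=2):
    """Exhaustively pick the set of strand pairs with the highest total score.

    scores: dict mapping (i, j) strand-index tuples to a partner score.
    max_partners: assumed cap on the number of partners per strand.
    Brute force is only feasible for a handful of strands; an integer
    programming solver would replace this loop in practice.
    """
    candidates = sorted(scores)
    best_set, best_total = (), 0.0
    for k in range(1, len(candidates) + 1):
        for subset in combinations(candidates, k):
            # Count how many partners each strand receives in this subset.
            degree = {}
            for i, j in subset:
                degree[i] = degree.get(i, 0) + 1
                degree[j] = degree.get(j, 0) + 1
            if max(degree.values()) > max_partners:
                continue  # violates the partner cap
            total = sum(scores[p] for p in subset)
            if total > best_total:
                best_set, best_total = subset, total
    return set(best_set), best_total


# Hypothetical partner scores for four strands.
toy_scores = {
    (0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.6,
    (1, 3): 0.7, (0, 2): 0.05, (0, 3): -0.2,
}
pairs, total = best_pairing(toy_scores)
print(sorted(pairs))  # [(0, 1), (1, 2), (2, 3)]
```

In this toy case the pair (1, 3) has a high score but is rejected, because accepting it would give strand 1 three partners; the selected pairs form the linear sheet 0-1-2-3.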
Summary
β-Sheets are widespread motifs of local structure found in over 80% of the protein structures presently available in the Protein Data Bank (http://www.rcsb.org/pdb/home/home.do). β-Sheets are generated by the pairing of two or more β-strands held together by characteristic patterns of hydrogen bonds running in a parallel or antiparallel fashion (Zhang and Kim, 2000). Cheng and Baldi (2005) pioneered the idea of predicting β-sheet topologies when the protein secondary structure is known and set the standard for this type of task. Their method, BetaPro, is based on a 2D-recursive neural network (Baldi and Pollastri, 2003) trained to predict pairing probabilities of interstrand β-residue pairs. Powerful methods based on the extraction of direct coupling information from multiple sequence alignments (MSAs) have been introduced to predict protein contacts in both globular (Cocco et al., 2013; Ekeberg et al., 2013; Jones et al., 2012; Marks et al., 2011; Morcos et al., 2011; Weigt et al., 2009) and membrane proteins (Hopf et al., 2012; Nugent and Jones, 2012).