Abstract
An accurate molecular structure of the protein dimer, the elementary building block of intermediate filaments (IFs), is essential for understanding filament assembly, rationalizing the filaments' mechanical properties and explaining the effects of disease-related IF mutations. The dimer contains a ∼300-residue long α-helical coiled coil which cannot be assessed either by direct experimental structure determination or by modelling with standard approaches. At the same time, coiled coils are well represented in structural databases. Here we present CCFold, a generally applicable threading-based algorithm which produces coiled-coil models from the protein sequence alone. The algorithm is based on a statistical analysis of experimentally determined structures and can handle any hydrophobic repeat pattern in addition to the most common heptads. We demonstrate that CCFold outperforms general-purpose computational folding in terms of accuracy, while being faster by orders of magnitude. By combining the CCFold algorithm with Rosetta folding, we generate representative dimer models for all IF protein classes. The source code is freely available at https://github.com/biocryst/IF; a web server to run the program is at http://pharm.kuleuven.be/Biocrystallography/cc. Supplementary data are available at Bioinformatics online.
Highlights
Intermediate filaments (IFs) are an important example of a protein assembly based on α-helical coiled coils (CCs)
Following our divide-and-conquer strategy (Strelkov et al., 2001), multiple short fragments of the IF rod could be resolved in atomic detail using X-ray crystallography (Guzenko et al., 2017), while a full-length IF dimer is clearly unsuitable for crystallisation (Chernyatina et al., 2016)
We have found that such an approach yields more accurate models than a fixed set of fragments with constant overlap, especially when non-canonical repeats are present or the hydrophobic core assignment is ambiguous
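For orientation, the baseline mentioned in the last highlight, a fixed set of fragments with constant overlap, can be sketched as a simple sliding window over the sequence. This is only an illustration of that baseline, not the CCFold procedure; the function name `overlapping_segments` and the default `length` and `overlap` values are assumptions made for this sketch and do not come from the paper.

```python
def overlapping_segments(sequence, length=14, overlap=7):
    """Split a sequence into fixed-length windows that share `overlap` residues.

    Hypothetical helper: the window length and overlap are illustrative
    defaults, not values used by CCFold.
    Returns a list of (start_index, segment) tuples.
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    if not sequence:
        return []

    step = length - overlap  # constant shift between consecutive windows
    segments = []
    start = 0
    while True:
        end = min(start + length, len(sequence))
        segments.append((start, sequence[start:end]))
        if end == len(sequence):  # the last window reached the C-terminus
            break
        start += step
    return segments


if __name__ == "__main__":
    for start, segment in overlapping_segments("ABCDEFGHIJK", length=4, overlap=2):
        print(start, segment)
```

With a constant overlap every window boundary is fixed in advance, regardless of where non-canonical repeats fall in the sequence, which is the limitation the highlight refers to.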
Summary
Intermediate filaments (IFs) are an important example of a protein assembly based on α-helical coiled coils (CCs). The heptad repeat pattern, with residue positions traditionally labelled abcdefg, results in a 'canonical' left-handed CC. Another possibility is a hendecad (11-residue repeat) HxxHxxxHxxx, which corresponds to the addition of a four-residue block (stutter) to a heptad (Lupas and Gruber, 2005). Experimental data reveal that such transitions can often be accommodated within a continuously α-helical structure, causing adaptation of the local CC parameters so that the hydrophobic core packing is preserved (Strelkov and Burkhard, 2002). For such less regular CC structures, accurate modelling remains a challenging task. We describe a novel threading-based algorithm, CCFold, designed for the prediction of CC structure. It is based on picking multiple CC fragments for short overlapping segments of the input sequence. The availability of X-ray structures of multiple IF rod fragments allows us to further evaluate the performance of the algorithm.
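The relation between the heptad and the hendecad can be made concrete with a few lines of Python. The sketch below assumes the usual H/x notation, where H marks a hydrophobic core position and the heptad is written HxxHxxx (positions a and d); the helper names `core_positions` and `core_spacings` are hypothetical and introduced only for this illustration.

```python
HEPTAD = "HxxHxxx"        # positions a-g; H marks core positions a and d
STUTTER = "Hxxx"          # four-residue block added to a heptad
HENDECAD = "HxxHxxxHxxx"  # 11-residue repeat


def core_positions(repeats):
    """Return 0-based indices of hydrophobic (H) positions in the concatenated repeats."""
    pattern = "".join(repeats)
    return [i for i, symbol in enumerate(pattern) if symbol == "H"]


def core_spacings(repeats):
    """Return the distances between consecutive hydrophobic core positions."""
    positions = core_positions(repeats)
    return [b - a for a, b in zip(positions, positions[1:])]


if __name__ == "__main__":
    # A hendecad is a heptad extended by the four-residue stutter block.
    print(HEPTAD + STUTTER == HENDECAD)
    # Regular heptads alternate spacings of 3 and 4 residues.
    print(core_spacings([HEPTAD, HEPTAD, HEPTAD]))
    # A hendecad between heptads produces two consecutive 4-residue spacings.
    print(core_spacings([HEPTAD, HENDECAD, HEPTAD]))
```

In the regular heptad case the core spacings strictly alternate between 3 and 4 residues, while the stutter breaks this alternation with two consecutive spacings of 4, which is the kind of irregularity the local CC geometry has to absorb.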