Abstract

An accurate molecular structure of the protein dimer, the elementary building block of intermediate filaments (IFs), is essential for understanding filament assembly, rationalizing the mechanical properties of the filaments and explaining the effects of disease-related IF mutations. The dimer contains a ∼300-residue long α-helical coiled coil which is amenable neither to direct experimental structure determination nor to modelling by standard approaches. At the same time, coiled coils are well represented in structural databases. Here we present CCFold, a generally applicable threading-based algorithm which produces coiled-coil models from the protein sequence alone. The algorithm is based on a statistical analysis of experimentally determined structures and can handle any hydrophobic repeat pattern in addition to the most common heptads. We demonstrate that CCFold outperforms general-purpose computational folding in terms of accuracy, while being faster by orders of magnitude. By combining the CCFold algorithm with Rosetta folding, we generate representative dimer models for all IF protein classes. The source code is freely available at https://github.com/biocryst/IF; a web server to run the program is at http://pharm.kuleuven.be/Biocrystallography/cc. Supplementary data are available at Bioinformatics online.

Highlights

  • Intermediate filaments (IFs) are an important example of a protein assembly based on α-helical coiled coils (CCs)

  • Following our divide-and-conquer strategy (Strelkov et al, 2001), multiple short fragments of the IF rod could be resolved at atomic detail using X-ray crystallography (Guzenko et al, 2017), while a full-length IF dimer is clearly unsuitable for crystallisation (Chernyatina et al, 2016)

  • We have found that such an approach yields more accurate models compared to just using a fixed set of overlapping fragments with constant overlap, especially in the cases when non-canonical repeats are present or the hydrophobic core assignment is ambiguous


Summary

Introduction

Intermediate filaments (IFs) are an important example of a protein assembly based on α-helical coiled coils (CCs). The most common hydrophobic repeat in CCs is the heptad. This pattern, with residue positions traditionally labelled abcdefg, results in a 'canonical' left-handed CC. Another possibility is a hendecad (11-residue repeat) HxxHxxxHxxx, which corresponds to the addition of a four-residue block (stutter) to a heptad (Lupas and Gruber, 2005). Experimental data reveal that such transitions can often be accommodated within a continuously α-helical structure, causing adaptation of the local CC parameters so that the hydrophobic core packing is preserved (Strelkov and Burkhard, 2002). For such less regular CC structures, accurate modelling remains a challenging task. We describe a novel threading-based algorithm, CCFold, designed for the prediction of CC structure. It is based on picking multiple CC fragments for short overlapping segments of the input sequence. The availability of X-ray structures of multiple IF rod fragments allows us to further evaluate the performance of the algorithm.
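To make the repeat patterns and the idea of overlapping segments concrete, the following is a minimal Python sketch. It is not the CCFold implementation: the hydrophobic residue set, the function names (`pattern_score`, `overlapping_segments`) and the fixed window length and step are all assumptions chosen for illustration. It scores how well a sequence fits a repeated H/x pattern and cuts a sequence into fixed-length windows with constant overlap, which is the simple baseline scheme rather than an adaptive segmentation.

```python
from typing import List, Tuple

# Repeat patterns as written in the text: H = hydrophobic core position.
HEPTAD = "HxxHxxx"        # positions a and d of abcdefg
HENDECAD = "HxxHxxxHxxx"  # heptad plus a four-residue stutter

# Assumption: a simple illustrative set of hydrophobic residues.
HYDROPHOBIC = set("ILMFVWY")


def pattern_score(segment: str, pattern: str, offset: int = 0) -> float:
    """Fraction of 'H' positions of the repeated pattern that are
    occupied by hydrophobic residues, for a given register offset."""
    hits = 0
    total = 0
    for i, residue in enumerate(segment):
        if pattern[(i + offset) % len(pattern)] == "H":
            total += 1
            hits += residue in HYDROPHOBIC
    return hits / total if total else 0.0


def overlapping_segments(sequence: str, length: int, step: int) -> List[Tuple[int, str]]:
    """Cut the sequence into fixed-length windows starting every `step`
    residues; a final window is added so the C-terminus is covered."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(sequence) <= length:
        return [(0, sequence)]
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s, sequence[s:s + length]) for s in starts]


if __name__ == "__main__":
    toy = "LKELEKK" * 3  # hypothetical sequence with a perfect heptad register
    print(pattern_score(toy, HEPTAD))
    for start, seg in overlapping_segments(toy, length=14, step=7):
        print(start, seg)
```

In this sketch each window would then be matched against a fragment library; that step, and any scheme for choosing window boundaries from the repeat pattern, is deliberately left out because the excerpt does not specify it.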

CC fragment library
Delineation of CC domains
The CCFold algorithm
Sequence segmentation
Optimal set of overlapping structural fragments
Output model generation
Implementation
Modelling of the IF rod domain
Linkers L1 and L12
Assembly of the complete rod domain
CCFold algorithm validation
IF dimer structure
Application to molecular replacement
Discussion
