Abstract

Cryo-electron microscopy of protein complexes often yields moderate-resolution maps (4-8 Å), with visible secondary-structure elements but poorly resolved loops, making model building challenging. In the absence of high-resolution structures of homologues, only coarse-grained structural features are typically inferred from these maps, and it is often impossible to assign specific regions of density to individual protein subunits. This paper describes a new method for overcoming these difficulties that integrates residue distance distributions predicted by a deep convolutional neural network, computational protein folding using Rosetta, and automated EM-map-guided complex assembly. We apply this method to a 4.6 Å resolution cryoEM map of the Fanconi Anemia core complex (FAcc), an E3 ubiquitin ligase required for DNA interstrand crosslink repair, which was previously challenging to interpret because it comprises 6557 residues, only 1897 of which are covered by homology models. In the published model built from this map, only 387 residues could be assigned to specific subunits with confidence. By building and placing into density 42 deep-learning-guided models containing 4795 residues not included in the previously published structure, we are able to determine an almost-complete atomic model of FAcc, in which 5182 of the 6557 residues were placed. The resulting model is consistent with previously published biochemical data and facilitates interpretation of disease-related mutational data. We anticipate that our approach will be broadly useful for cryoEM structure determination of large complexes containing many subunits for which there are no homologues of known structure.

Highlights

  • With the advent of direct electron detectors and advances in image-processing software, there has been an influx of large protein complex structures determined by cryoelectron microscopy

  • Co-evolution information can provide valuable structural information (Kim et al, 2014; Nugent & Jones, 2012; Ovchinnikov et al, 2014) and has been used in the interpretation of some cryoelectron microscopy (cryoEM) structures (Klink et al, 2020; Park, Lacourse et al, 2018; Schoebel et al, 2017), but the limited availability of large numbers of sequences restricts the applicability of the method

  • We combine all docked results with cross-linking mass-spectrometry (XL-MS) data and use Monte Carlo sampling of domain assignments in density to find the arrangement of domain models that maximizes agreement with the cryoEM and XL-MS data
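
The Monte Carlo assignment step described in the last highlight can be illustrated with a small toy sketch. This is not the authors' implementation: the scoring terms (a per-placement density-fit value, a bonus for each satisfied crosslink, a penalty for overlapping placements), the distance cutoffs, and all names (`score`, `monte_carlo`, `placements`, `crosslinks`) are assumptions made for illustration only. Each domain has a few candidate docked placements, and a Metropolis search picks one placement per domain.

```python
import math
import random

def score(assign, placements, crosslinks, max_xl=30.0, min_sep=10.0):
    """Toy score (higher is better); all terms are illustrative assumptions."""
    # Sum of density-fit values for the chosen placement of each domain.
    total = sum(placements[d][i]["fit"] for d, i in assign.items())
    pos = {d: placements[d][i]["xyz"] for d, i in assign.items()}
    # Bonus for each crosslink whose two domains sit within max_xl of each other.
    for a, b in crosslinks:
        if math.dist(pos[a], pos[b]) <= max_xl:
            total += 1.0
    # Penalty for any two domains placed closer than min_sep (overlap).
    names = sorted(pos)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(pos[a], pos[b]) < min_sep:
                total -= 5.0
    return total

def monte_carlo(placements, crosslinks, steps=2000, kT=0.5, seed=0):
    """Metropolis sampling over one-placement-per-domain assignments."""
    rng = random.Random(seed)
    current = {d: rng.randrange(len(c)) for d, c in placements.items()}
    cur_s = score(current, placements, crosslinks)
    best, best_s = dict(current), cur_s
    domains = sorted(placements)
    for _ in range(steps):
        # Move: re-assign one randomly chosen domain to a random candidate.
        d = rng.choice(domains)
        trial = dict(current)
        trial[d] = rng.randrange(len(placements[d]))
        s = score(trial, placements, crosslinks)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if s >= cur_s or rng.random() < math.exp((s - cur_s) / kT):
            current, cur_s = trial, s
            if s > best_s:
                best, best_s = dict(trial), s
    return best, best_s

# Hypothetical toy input: three domains, two candidate placements each.
placements = {
    "A": [{"fit": 0.9, "xyz": (0, 0, 0)}, {"fit": 0.8, "xyz": (100, 0, 0)}],
    "B": [{"fit": 0.7, "xyz": (20, 0, 0)}, {"fit": 0.9, "xyz": (0, 0, 0)}],
    "C": [{"fit": 0.6, "xyz": (40, 0, 0)}, {"fit": 0.65, "xyz": (200, 0, 0)}],
}
crosslinks = [("A", "B"), ("B", "C")]

best, best_s = monte_carlo(placements, crosslinks)
print(best, round(best_s, 2))
```

In this toy input, the best-fitting placement of domain B overlaps domain A, so a search that maximized density fit alone would pick a clashing arrangement; the combined score instead favors placements that also satisfy both crosslinks. A real application would use many more candidates per domain and scores derived from the actual map and crosslink data.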


Summary

Introduction

With the advent of direct electron detectors and advances in image-processing software, there has been an influx of large protein complex structures determined by cryoelectron microscopy (cryoEM). CryoEM data are noisy, and structure determination requires a large number of particle images to be averaged together. This averaging, when combined with complications such as image misclassification, highly heterogeneous samples, or a limited number of sample views, typically limits the resolutions that can be attained (Lyumkis, 2019). Co-evolution information can provide valuable structural information (Kim et al, 2014; Nugent & Jones, 2012; Ovchinnikov et al, 2014) and has been used in the interpretation of some cryoEM structures (Klink et al, 2020; Park, Lacourse et al, 2018; Schoebel et al, 2017), but the limited availability of large numbers of sequences restricts the applicability of the method.

