Abstract
Although cryo-electron microscopy (cryo-EM) has been successfully used to derive atomic structures for many proteins, it is still challenging to derive atomic structures when the density maps are in the medium resolution range, such as 5–10 Å. Detection of protein secondary structures, such as helices and β-sheets, from cryo-EM density maps provides constraints for deriving atomic structures from such maps. As more deep learning methodologies are being developed for solving various molecular problems, effective tools are needed for users to access them. We have developed an effective software bundle, DeepSSETracer, for the detection of protein secondary structure from cryo-EM component maps at medium resolution. The bundle contains the network architecture and a U-Net model trained with a curriculum and gradient of episodic memory (GEM). The bundle integrates the deep neural network with the visualization capacity provided in ChimeraX. On a Linux server accessed remotely by Windows users, the trained deep neural network takes about 6 s on one CPU and one GPU to detect secondary structures in a cryo-EM component map containing 446 amino acids. A test using 28 chain components of cryo-EM maps shows overall residue-level F1 scores of 0.72 and 0.65 for detecting helices and β-sheets, respectively. Although deep learning applications are built on software frameworks, such as PyTorch and TensorFlow, our pioneering work here shows that integration of deep learning applications with ChimeraX is a promising and effective approach. Our experiments show that the F1 score measured at the residue level is an effective evaluation of secondary structure detection for individual classes. The test using 28 cryo-EM component maps shows that DeepSSETracer detects β-sheets more accurately than Emap2sec+, with weighted average residue-level F1 scores of 0.65 and 0.42, respectively. It also shows that Emap2sec+ detects helices more accurately than DeepSSETracer, with weighted average residue-level F1 scores of 0.77 and 0.72, respectively.
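The residue-level F1 score used above can be illustrated with a minimal sketch. It assumes that each residue carries one true and one detected label, written here as `H` (helix), `E` (β-sheet), and `C` (other); these label letters, both function names, and the choice to weight each chain by its number of true residues in the class are assumptions made for illustration. The abstract does not specify how detected voxels are assigned to residues or how the weighted average is formed.

```python
from typing import Sequence, Tuple


def residue_f1(true_labels: Sequence[str], pred_labels: Sequence[str], cls: str) -> float:
    """Residue-level F1 score for a single class (e.g. 'H' or 'E')."""
    pairs = list(zip(true_labels, pred_labels))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)  # correctly detected
    fp = sum(1 for t, p in pairs if t != cls and p == cls)  # detected but wrong
    fn = sum(1 for t, p in pairs if t == cls and p != cls)  # missed
    if tp == 0:
        return 0.0  # avoids division by zero when nothing is correctly detected
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def weighted_average_f1(chains: Sequence[Tuple[str, str]], cls: str) -> float:
    """Average per-chain F1, weighted by the number of true residues of `cls`.

    The weighting scheme is an assumption, not taken from the paper.
    """
    weights = [sum(1 for t in true if t == cls) for true, _ in chains]
    total = sum(weights)
    if total == 0:
        return 0.0
    scores = [residue_f1(true, pred, cls) for true, pred in chains]
    return sum(w * s for w, s in zip(weights, scores)) / total


# Hypothetical toy chains: (true labels, detected labels), one letter per residue.
chains = [("HHHHCCEEEE", "HHHCCCEEEH"), ("HH", "HH")]
helix_f1 = weighted_average_f1(chains, "H")
sheet_f1 = weighted_average_f1(chains, "E")
```

Reporting F1 separately for helices and β-sheets, as the abstract does, keeps a strong result on one class from masking a weak result on the other.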
Highlights
Although many atomic structures have been resolved from cryo-EM density maps with a resolution of 4 Å or better, deriving atomic structures from cryo-electron microscopy maps at medium resolution (5–10 Å) is challenging due to the quality of density maps in this resolution range
We propose a tool, DeepSSETracer, for secondary structure detection from cryo-electron microscopy (cryo-EM) density component maps using a convolutional neural network
All cryo-EM density maps were downloaded from the Electron Microscopy Data Bank (EMDB), with the requirements of a resolution between 5 and 10 Å and a corresponding atomic structure available in the Protein Data Bank (PDB)
Summary
Although many atomic structures have been resolved from cryo-EM density maps with a resolution of 4 Å or better, deriving atomic structures from cryo-electron microscopy (cryo-EM) maps at medium resolution (5–10 Å) is challenging due to the quality of density maps in this resolution range. To establish an initial trace of a backbone, a critical step is to map the secondary structures of a protein sequence to their locations in the cryo-EM density map, a step referred to as finding the topology of secondary structures (Abeysinghe et al, 2008; Al Nasr et al, 2014; Biswas et al, 2016). Many methods, such as JPred and SSpro, are available to predict sequence segments of protein secondary structures (Cole et al, 2008; Magnan and Baldi, 2014). Since secondary structures, such as α-helices and β-sheets, have density characteristics, they are distinguishable in density maps at medium resolution. The locations of α-helices and β-sheets in a density map provide constraints on the atomic structure of the protein.