Abstract

Although experimental protein-structure determination usually targets known proteins, chains of unknown sequence are often encountered. Such chains may be purified from natural sources, appear as unexpected fragments of well-characterized proteins, or be present as contaminants. Regardless of the source of the problem, the unknown protein always requires characterization. Here, an automated pipeline is presented for the identification of protein sequences from cryo-EM reconstructions and crystallographic data. Its application to characterizing the crystal structure of an unknown protein purified from snake venom is presented. It is also shown that the approach can be successfully applied to the identification of protein sequences and the validation of sequence assignments in cryo-EM protein structures.

Highlights

  • Recent years have witnessed an unprecedented advancement of protein-structure-prediction approaches

  • We have presented a complete pipeline for the identification of unknown proteins in crystallography and cryo-EM

  • We have shown that our approach can successfully identify proteins, based on non-curated models automatically built into cryo-EM maps, at local resolutions up to 4.5 Å, where models are usually highly fragmented and prone to tracing errors

Summary

Introduction

Recent years have witnessed an unprecedented advancement of protein-structure-prediction approaches. There are macromolecular targets which are not yet amenable to in silico structure-prediction approaches, most notably structurally heterogeneous large macromolecular complexes containing protein, RNA and small-molecule components. Interesting in this context are recent advances in cryo-EM that have enabled detailed studies of macromolecular complexes in their natural cellular environment (Tegunov et al., 2021). The problem of unknown-protein identity is not unique to cryo-EM studies. It is surprisingly common for macromolecular crystallographers to crystallize and solve previously uncharacterized protein structures. These can be proteins purified from natural sources (as described in this work) or contaminants, either native to an
