GISA: using Gauss Integrals to identify rare conformations in protein structures.

Christian Grønbæk, Thomas Hamelryck, Peter Røgen

doi:10.7717/peerj.9159


Open Access


Journal: PeerJ
Publication Date: Jun 11, 2020
Citations: 10
License type: CC BY 4.0

Affiliation: University of Copenhagen, Technical University of Denmark

Abstract

The native structure of a protein is important for its function, and methods for exploring protein structures have therefore attracted much research. However, rather few methods are sensitive to topologic-geometric features such as knots, slipknots, lassos, links, and pokes, and each of these methods is aimed only at a specific set of such configurations. We here propose a general method which transforms a structure into a “fingerprint of topological-geometric values” consisting of a series of real-valued descriptors from mathematical knot theory. The extent to which a structure contains unusual configurations can then be judged from this fingerprint. The method is not confined to a particular pre-defined topology or geometry (like a knot or a poke), and so, unlike existing methods, it is general. To achieve this, our new algorithm, GISA, as a key novelty produces the descriptors, so-called Gauss integrals, not only for the full chains of a protein but for all its sub-chains. This allows fingerprinting on any scale from local to global. The Gauss integrals are known to be effective descriptors of global protein folds. Applying GISA to sets of several thousand high-resolution structures, we first show how the most basic Gauss integral, the writhe, enables swift identification of pre-defined geometries such as pokes and links. We then apply GISA with no restrictions on geometry, to show how it allows identifying rare conformations by finding rare invariant values only. In this unrestricted search, pokes and links are still found, but so are knotted conformations, as well as more highly entangled configurations not previously described. Thus, an application of the basic scan method in GISA’s toolbox revealed 10 known cases of knots as the top positive writhe cases, while placing at the top of the negative writhe cases 14 cases in cis-trans isomerases that share a spatial motif of little secondary structure content, which has possibly gone unnoticed.
Possible general applications of GISA are fold classification and structural alignment based on local Gauss integrals. Others include finding errors in protein models and identifying unusual conformations that might be important for protein folding and function. Given its broad potential, we believe that GISA will be of general benefit to the structural bioinformatics community. GISA is coded in C and comes as a command line tool. Source and compiled code for GISA plus read-me and examples are publicly available at GitHub (https://github.com).

Highlights

Røgen & Bohr (2003) introduced a set of quantitative protein fold descriptors consisting of 29 knot-theoretic Gauss Integral (GI) based invariants, shown shortly after in Røgen & Fain (2003) to be able to automatically recover the classification of the CATH database. Automated local scrutiny of folds is desired for various purposes, including identifying odd shapes in predictions, improving classification, and structure alignment.
The primary focus is on the proof of concept, which involves only the lowest-order GI, the writhe. First, we show in a “restricted search” how GISA can be exploited to provide an algorithm for identifying particular geometries in folds, such as a chain forming an almost closed loop through which it passes, or two such loops interlinking (termed “pokes” and “co-pokes”, respectively, by Khatib, Rohl & Karplus (2009)).
We have shown that with the help of GISA it is possible to find cases of rare geometries in proteins, such as those studied in Khatib, Rohl & Karplus (2009) and knots as identified with KnotProt (Dabrowski-Tumanski et al., 2018; Jamroz et al., 2015; Sulkowska et al., 2012).
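
For orientation, the writhe mentioned in these highlights can be written in its standard knot-theoretic form. The notation below (the curve \gamma, the pairwise terms W(i,j), and the sub-chain writhe Wr(a,b)) is chosen here for illustration and is not taken from the paper; for a protein, the curve would typically be the polygonal chain of backbone segments.

```latex
% Writhe of a space curve \gamma as a Gauss double integral
\[
\mathrm{Wr}(\gamma) \;=\; \frac{1}{4\pi}\int\!\!\int
\frac{\bigl(\gamma'(s)\times\gamma'(t)\bigr)\cdot\bigl(\gamma(s)-\gamma(t)\bigr)}
     {\lvert \gamma(s)-\gamma(t)\rvert^{3}}\; ds\, dt
\]

% For a polygonal chain the integral splits into contributions W(i,j)
% from pairs of segments i < j (each unordered pair counted once),
% and restricting the sum to segments a..b gives the sub-chain writhe
\[
\mathrm{Wr}(a,b) \;=\; \sum_{a \le i < j \le b} W(i,j),
\qquad
W(i,j) \;=\; \frac{1}{2\pi}\int_{s \in \mathrm{seg}\, i}\int_{t \in \mathrm{seg}\, j}
\frac{\bigl(\gamma'(s)\times\gamma'(t)\bigr)\cdot\bigl(\gamma(s)-\gamma(t)\bigr)}
     {\lvert \gamma(s)-\gamma(t)\rvert^{3}}\; ds\, dt
\]
```

Taking a and b as the first and last segment recovers the writhe of the full chain, which is the sense in which a sub-chain fingerprint spans all scales from local to global.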

Summary

Introduction

Røgen & Bohr (2003) introduced a set of quantitative protein fold descriptors consisting of 29 knot-theoretic Gauss Integral (GI) based invariants, shown shortly after in Røgen & Fain (2003) to be able to automatically recover the classification of the CATH database. Automated local scrutiny of folds is desired for various purposes, including identifying odd shapes in predictions, improving classification, and structure alignment. While the GI invariants work very well as global fold descriptors, an efficient method for computing them locally has been lacking, and local applications have been few. Due to its recursive nature, our new algorithm, GISA, computes the GI invariants not only of an entire chain but, at the same time, of all its sub-chains, allowing structural analyses on any scale from local to global (we assume chains and sub-chains to be connected). By the knot-theoretic nature of the GIs, GISA is sensitive to topologic-geometric differences, while retaining the fundamental invariance under translation and rotation. This distinguishes GISA from distance-based approaches. A general method for structural analysis with such topologic-geometric sensitivity still seems to be lacking (Jarmolińska et al., 2018; Marks et al., 2011). We believe that GISA can fill this gap.
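
The idea of obtaining the writhe of every sub-chain in one recursive pass can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not GISA's C implementation: the function names `segment_pair_writhe` and `subchain_writhes` are hypothetical, the pairwise term uses the closed-form solid-angle evaluation of Klenin and Langowski rather than whatever GISA uses internally, the sign convention may differ from GISA's, and the inclusion-exclusion recursion is one natural way to fill the table.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _unit(v):
    n = math.sqrt(_dot(v, v))
    return None if n < 1e-12 else (v[0] / n, v[1] / n, v[2] / n)

def segment_pair_writhe(p1, p2, p3, p4):
    """Writhe contribution of segments p1->p2 and p3->p4 (hypothetical helper).

    Signed solid angle of crossing directions divided by 2*pi, so that
    summing over pairs i < j gives the writhe of the polygonal chain.
    """
    r12, r34 = _sub(p2, p1), _sub(p4, p3)
    r13, r14 = _sub(p3, p1), _sub(p4, p1)
    r23, r24 = _sub(p3, p2), _sub(p4, p2)
    normals = [_unit(_cross(r13, r14)), _unit(_cross(r14, r24)),
               _unit(_cross(r24, r23)), _unit(_cross(r23, r13))]
    if any(n is None for n in normals):
        return 0.0  # degenerate case, e.g. adjacent segments sharing a point
    triple = _dot(_cross(r34, r12), r13)
    if abs(triple) < 1e-12:
        return 0.0  # coplanar segments contribute nothing
    omega = sum(
        math.asin(max(-1.0, min(1.0, _dot(normals[k], normals[(k + 1) % 4]))))
        for k in range(4)
    )
    return math.copysign(omega, triple) / (2.0 * math.pi)

def subchain_writhes(points):
    """Return table[a][b] = writhe of the sub-chain made of segments a..b."""
    m = len(points) - 1  # number of segments
    w = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 2, m):  # adjacent segments are skipped (zero)
            w[i][j] = segment_pair_writhe(points[i], points[i + 1],
                                          points[j], points[j + 1])
    table = [[0.0] * m for _ in range(m)]
    for length in range(1, m):  # fill by increasing sub-chain length b - a
        for a in range(m - length):
            b = a + length
            # Inclusion-exclusion: the two shorter sub-chains overlap in
            # a+1..b-1, and only the pair (a, b) is missing from their union.
            table[a][b] = (table[a + 1][b] + table[a][b - 1]
                           - table[a + 1][b - 1] + w[a][b])
    return table
```

For a chain given as a list of 3D points (for example C-alpha coordinates), `subchain_writhes(points)[a][b]` holds the writhe of the sub-chain spanning segments `a` to `b`, and the entry for the first and last segment is the writhe of the whole chain. Filling the table costs one pairwise evaluation per segment pair, so all sub-chain values come at the same quadratic cost as the single global value.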

