Abstract

Developing methods for comparing protein 3-D structures is important for understanding protein functions, and it is also very challenging. In the 40 years since the first automated structural method was developed, ~200 papers have been published using different representations of structures. The existing methods can be divided into five categories: sequence-, distance-, secondary structure-, geometry-, and network-based structural comparisons. Each has its strengths, but also limitations. We have developed a novel method in which the 3-D structure of a protein is modeled using the concept of Triangular Spatial Relationship (TSR), where triangles are constructed with the Cα atoms of a protein as vertices. Every triangle is represented by an integer, which we denote as a “key.” A key is computed from the length, angle, and vertex labels by a rule-based formula, which ensures that identical TSRs are assigned the same key across proteins. A structure is thereby represented by a vector of integers. Our method accurately quantifies the similarity of structures or substructures by counting the identical keys shared by two proteins. The uniqueness of our method includes: (i) a representation of structures that avoids structural superimposition; (ii) the use of triangles to represent substructures, because the triangle is the simplest primitive that captures shape; (iii) comparison of complex structures by matching the integers corresponding to multiple TSRs. Every substructure of one protein is compared to every substructure of a different protein. We applied the method to proteases and kinases because they play essential roles in cell signaling and a majority of them are drug targets. The new motifs or substructures we identified specifically for proteases and kinases provide deeper insight into their structural relations. Furthermore, the method provides a unique way to study protein conformational changes. In addition, the results from CATH and SCOP data sets clearly demonstrate that our method can distinguish alpha helices from beta pleated sheets. Our method has the potential to be developed into a powerful tool for efficient structure-BLAST search and comparison, just as BLAST is for sequence search and alignment.
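
The key idea described above can be illustrated with a minimal Python sketch. It is not the authors' rule-based formula, which this abstract does not state. The angle definition (largest internal angle), the equal-width bins, the parameter values (`n_labels`, `theta_bins`, `dist_bins`, `max_dist`), the similarity score (a generalized Jaccard index), and all function names are assumptions chosen for illustration only.

```python
from collections import Counter
from itertools import combinations
import math


def tsr_key(points, labels, n_labels=20, theta_bins=30, dist_bins=35, max_dist=70.0):
    """Pack one triangle into an integer (hypothetical encoding, not the paper's).

    points: three (x, y, z) tuples of distinct C-alpha positions.
    labels: three integer residue labels in [0, n_labels).
    """
    # Sorting labels and side lengths makes the key independent of vertex order.
    l1, l2, l3 = sorted(labels)
    a, b, c = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    # Largest internal angle (opposite the longest side c), via the law of cosines.
    cos_t = (a * a + b * b - c * c) / (2 * a * b)
    theta = math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))
    # Assumed equal-width discretization of the angle and the longest side.
    t_bin = min(int(theta / 180.0 * theta_bins), theta_bins - 1)
    d_bin = min(int(c / max_dist * dist_bins), dist_bins - 1)
    # Mixed-radix packing: distinct (labels, bins) tuples give distinct integers.
    return (((l1 * n_labels + l2) * n_labels + l3) * theta_bins + t_bin) * dist_bins + d_bin


def protein_keys(coords, labels):
    """Count the keys of every triangle formed by the C-alpha atoms."""
    keys = Counter()
    for i, j, k in combinations(range(len(coords)), 3):
        keys[tsr_key((coords[i], coords[j], coords[k]),
                     (labels[i], labels[j], labels[k]))] += 1
    return keys


def key_similarity(keys_a, keys_b):
    """Generalized Jaccard index over key counts (an assumed similarity score)."""
    shared = sum((keys_a & keys_b).values())  # per-key minimum counts
    total = sum((keys_a | keys_b).values())   # per-key maximum counts
    return shared / total if total else 0.0
```

Because each key depends only on sorted labels and sorted side lengths, reordering the atoms of a structure leaves its key counts unchanged, and no superimposition step is needed before comparing two proteins. Enumerating all triangles grows cubically with the number of residues, so this sketch is only suitable for small inputs.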

Highlights

  • Availability of protein sequences and structures has been rapidly increasing

  • We learned that an equal-width binning method ends up with a different number of triangles in each bin, depending on whether the specified interval of values is for Theta or for MaxDist

  • To maximize the possibility of the same or similar number of triangles in each bin and to ensure that all occurrences of the same value are placed in the same bin, we used the Adaptive Unsupervised Iterative Discretization algorithm to calculate the bin boundaries (Liu et al, 2002; Witten et al, 2016)
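
The binning issue in the highlights above can be sketched in Python. The equal-frequency routine below is a simplified stand-in and not the Adaptive Unsupervised Iterative Discretization algorithm cited in the text; the function names and the convention that an edge value belongs to the bin on its right are assumptions made for illustration.

```python
from bisect import bisect_right
from collections import Counter


def equal_width_edges(values, n_bins):
    """Interior bin edges that split [min, max] into n_bins equal intervals."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + step * i for i in range(1, n_bins)]


def equal_frequency_edges(values, n_bins):
    """Interior edges giving roughly equal counts per bin.

    Edges are only placed at distinct observed values, so all occurrences
    of the same value always land in the same bin.
    """
    counts = sorted(Counter(values).items())
    target = len(values) / n_bins
    edges, seen = [], 0
    # Walk the distinct values in order and cut after each filled quota.
    for (_, count), (next_value, _) in zip(counts, counts[1:]):
        seen += count
        if len(edges) < n_bins - 1 and seen >= target * (len(edges) + 1):
            edges.append(next_value)
    return edges


def bin_counts(values, edges):
    """Number of values per bin; a value equal to an edge goes to the right bin."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    return counts
```

On skewed data such as `[1, 1, 1, 1, 2, 2, 3, 4, 5, 6, 8, 40]` with three bins, the equal-width edges leave almost every value in the first bin, while the equal-frequency edges spread the values evenly without splitting the repeated `1`s or `2`s.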


Summary

Introduction

Availability of protein sequences and structures has been rapidly increasing. Both play fundamental roles in understanding protein functions. It is well accepted that protein structures are more conserved than sequences. Understanding the 3-D structure, rather than pure 1-D relationships, provides deeper insights into protein functions. To accelerate discovery in all areas of biological and chemical sciences, efforts have been made in three directions: (1) constructing 3-D structure databases, e.g., PDB (Berman et al, 2000), and structure-based protein classification databases, e.g., CATH (Greene et al, 2007), FSSP (Holm and Sander, 1996), SCOP (Murzin et al, 1995); (2) developing computational methods, e.g., MD simulations and QM/MM calculations, that go beyond the resources of experimental data to generate and optimize theoretical structures; (3) developing algorithms for 3-D structural comparison or alignment. Efforts have also been made to combine structure comparison algorithms with sequence analysis to achieve a better understanding of sequence, structure, and function relationships.
