Abstract

A novel 3-D graphical representation of protein sequences is introduced. A right cone with unit base radius and unit height is used to represent protein sequences on its surface. The twenty amino acids are represented by 20 circles, and the n residues of a protein are represented by n lines on the cone's surface. All the spots that represent the protein's residues are shown in the cone's top view. The spatial median of these spots is used as a new descriptor of a protein sequence. The approach was applied to two short protein segments of the yeast Saccharomyces cerevisiae. An examination of the similarities and dissimilarities among the eight ND5 proteins and the six β-globin proteins illustrates the utility of our approach. Linear correlation and significance analyses are provided to compare our results with the percentage sequence alignment identity.
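As a rough illustration of the spatial median descriptor mentioned in the abstract, the sketch below approximates the spatial (geometric) median of a set of 2-D spots using Weiszfeld's iteration. This is a generic textbook procedure, not necessarily the computation used in the paper; the function names `spatial_median` and `total_distance` and the sample points are hypothetical.

```python
import math


def total_distance(center, points):
    """Sum of Euclidean distances from `center` to every point."""
    cx, cy = center
    return sum(math.hypot(px - cx, py - cy) for px, py in points)


def spatial_median(points, tol=1e-9, max_iter=1000):
    """Approximate the spatial median of 2-D points (Weiszfeld's iteration).

    Assumption: a simple variant that skips any data point coinciding
    with the current estimate to avoid division by zero.
    """
    n = len(points)
    # Start from the centroid of the points.
    x = sum(p[0] for p in points) / n
    y = sum(p[1] for p in points) / n
    for _ in range(max_iter):
        wx = wy = wsum = 0.0
        for px, py in points:
            d = math.hypot(px - x, py - y)
            if d < 1e-12:
                continue  # estimate sits on a data point; skip it
            w = 1.0 / d
            wx += w * px
            wy += w * py
            wsum += w
        if wsum == 0.0:
            break  # every point coincides with the estimate
        nx, ny = wx / wsum, wy / wsum
        step = math.hypot(nx - x, ny - y)
        x, y = nx, ny
        if step < tol:
            break
    return x, y


# Hypothetical top-view spots, for illustration only.
spots = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
median = spatial_median(spots)
```

Unlike the centroid, the spatial median minimizes the sum of distances to the spots, which makes it less sensitive to a few outlying points.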

Highlights

  • There is a huge gap between the growth of protein sequence databases and that of structure databases

  • Many mathematical approaches have been proposed to translate protein sequences from letters into 2D or 3D graphical representations, accompanied by mathematical objects such as vectors or matrices that serve as sequence descriptors and can be compared with one another

  • We propose the cone’s top view to obtain a clear visualization of our 3-D graphical representation
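The geometry behind the top view can be sketched in a few lines. On a right cone with unit base radius and unit height, a surface point at height z lies at radius 1 - z from the axis, and the top view simply drops the z coordinate. How the paper assigns amino acids to circles and residues to lines is not stated in this summary, so the even spacing used in `residue_spot` below is purely an assumption for illustration, and all function names are hypothetical.

```python
import math


def cone_point(height, angle):
    """Point on a right cone with unit base radius and unit height.

    The base circle is at z = 0 and the apex at z = 1, so the radius
    of the cone at height z is 1 - z.
    """
    r = 1.0 - height
    return (r * math.cos(angle), r * math.sin(angle), height)


def top_view(point3d):
    """Project a 3-D point onto the base plane (view from above)."""
    x, y, _ = point3d
    return (x, y)


def residue_spot(aa_index, position, n):
    """Top-view spot for one residue.

    ASSUMPTION (not from the paper): amino acid k (1..20) sits on the
    circle at height k / 21, and residue i (0..n-1) sits on the line
    from the apex to the base at angle 2*pi*i / n.
    """
    height = aa_index / 21.0
    angle = 2.0 * math.pi * position / n
    return top_view(cone_point(height, angle))


# Example: the 4th residue (position 3) of a 10-residue segment,
# with a hypothetical amino acid index of 7.
spot = residue_spot(7, 3, 10)
```

In this sketch each of the 20 circles appears in the top view as a concentric circle, and each of the n lines appears as a radial segment, so every residue maps to one spot inside the unit disk.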


Summary

INTRODUCTION

There is a huge gap between the growth of protein sequence databases and that of structure databases. LSR represents any protein sequence by letters corresponding to the 20 amino acids. A graphical representation of protein sequences may rely either on selecting a geometrical object to represent residues or on assigning vectors to residues. One example of assigning vectors to residues in 3D selects three physicochemical properties of amino acid side chains: the hydropathy index, the side chain charge, and the mean accessible surface area (ASA) of side chains [15]. Another 3D graphical representation of proteins is based on a five-letter model of amino acids, which reduces the twenty amino acid letters to only five [13]. A right cone with a unit base radius and unit height has been chosen to represent any protein sequence on its surface. By substituting in Eq., our proposed approach is expressed as follows:

RESULTS AND DISCUSSION
CONCLUSION