Abstract
Background
Protein structures are better conserved than protein sequences, and consequently more functional information is available in structures than in sequences. However, proteins generally interact with other proteins and molecules via their surface regions, and a backbone-only analysis of protein structures may miss many of the functional and evolutionary features. Surface information can help better elucidate proteins' functions and their interactions with other proteins. Computational analysis and comparison of protein surfaces is an important challenge to overcome to enable efficient and accurate functional characterization of proteins.
Methods
In this study we present a new method for representation and comparison of protein surface features. Our method is based on mapping the 3-D protein surfaces onto 2-D maps using various dimension reduction methods. We have proposed area and neighbor based metrics in order to evaluate the accuracy of this surface representation. In order to capture functionally relevant information, we encode geometric and biochemical features of the protein, such as hydrophobicity, electrostatic potential, and curvature, into separate color channels in the 2-D map. The resulting images can then be compared using efficient 2-D image registration methods to identify surface regions and features shared by proteins.
Results
We demonstrate the utility of our method and characterize its performance using both synthetic and real data. Among the dimension reduction methods investigated, SNE, LandmarkIsomap, Isomap, and Sammon's mapping provide the best performance in preserving the area and neighborhood properties of the original 3-D surface. The enriched 2-D representation is shown to be useful in characterizing the functional site of chymotrypsin and able to detect structural similarities in heat shock proteins. A texture mapping using the 2-D representation is also proposed as an interesting application to structure visualization.
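The two ideas in the Methods paragraph, scoring how well a 2-D map preserves 3-D neighborhoods and packing surface features into color channels, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the abstract does not define the exact neighbor metric, so `neighbor_preservation` uses a simple k-nearest-neighbor overlap as one plausible stand-in, and the channel order (hydrophobicity to red, electrostatic potential to green, curvature to blue) is arbitrary. The 2-D coordinates are taken as given; they would come from whichever dimension reduction method is being evaluated.

```python
import math


def k_nearest(points, i, k):
    """Indices of the k points closest to points[i] (Euclidean), excluding i."""
    others = (j for j in range(len(points)) if j != i)
    order = sorted(others, key=lambda j: math.dist(points[i], points[j]))
    return set(order[:k])


def neighbor_preservation(points_3d, points_2d, k=3):
    """Mean fraction of each point's k nearest 3-D neighbors that are also
    among its k nearest neighbors in the 2-D map (hypothetical metric)."""
    n = len(points_3d)
    total = 0.0
    for i in range(n):
        kept = k_nearest(points_3d, i, k) & k_nearest(points_2d, i, k)
        total += len(kept) / k
    return total / n


def normalize(values):
    """Linearly rescale a list of numbers to integers in 0..255."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    return [round(255 * (v - lo) / (hi - lo)) for v in values]


def rasterize(points_2d, features, size=8):
    """Paint each 2-D point into a size x size RGB grid.

    `features` maps the assumed keys 'hydrophobicity', 'electrostatics'
    and 'curvature' to per-point value lists; each becomes one channel.
    Cells that receive no point stay black.
    """
    names = ("hydrophobicity", "electrostatics", "curvature")
    channels = [normalize(features[name]) for name in names]
    xs = [p[0] for p in points_2d]
    ys = [p[1] for p in points_2d]
    x_span = (max(xs) - min(xs)) or 1.0
    y_span = (max(ys) - min(ys)) or 1.0
    image = [[(0, 0, 0)] * size for _ in range(size)]
    for idx, (x, y) in enumerate(points_2d):
        col = min(size - 1, int((x - min(xs)) / x_span * size))
        row = min(size - 1, int((y - min(ys)) / y_span * size))
        image[row][col] = tuple(ch[idx] for ch in channels)
    return image


# Toy data: a flat 4 x 4 surface patch; dropping z stands in for a real
# dimension reduction method purely to exercise the functions.
patch_3d = [(float(x), float(y), 0.0) for x in range(4) for y in range(4)]
patch_2d = [(p[0], p[1]) for p in patch_3d]
toy_features = {
    "hydrophobicity": [p[0] for p in patch_3d],
    "electrostatics": [p[1] for p in patch_3d],
    "curvature": [p[0] + p[1] for p in patch_3d],
}
score = neighbor_preservation(patch_3d, patch_2d, k=3)
image = rasterize(patch_2d, toy_features, size=4)
```

A score near 1 means local neighborhoods survive the mapping, while lower scores indicate that the map tears or folds the surface. The resulting grid is an ordinary RGB image, which is what makes standard 2-D image registration applicable for comparing proteins.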
Highlights
Protein structures are better conserved than protein sequences, and more functional information is available in structures than in sequences
Protein function is largely dependent on surface features, especially the functional sites
Surface features are reducible to protein structure, and to sequence information, but convergent evolution has produced proteins with dissimilar sequences and/or structures which have similar surface properties and functions
Summary
Protein structures are better conserved than protein sequences, and more functional information is available in structures than in sequences. Surface information can help better elucidate proteins' functions and their interactions with other proteins. The advent of new technologies has resulted in a massive expansion of the protein sequence and structure databases. This expansion enables the characterization of similarities among sequences and structures and the identification of the location of functional sites. Analysis of high-throughput sequencing data has opened up new applications and facilitated the study of proteins. Alignment of protein sequences and structures has made it possible to investigate convergent and divergent protein relationships, helped by the exponentially increasing size of the available data. Heuristic approaches have been proposed to speed up alignment against large sequence databases [3,4].