Abstract

In this article, we describe a novel methodology that uses linear algebra to extract semantic characteristics from protein structures and compose structural signature vectors, which can be used to compare protein structures efficiently and classify them into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Treating proteins as documents and contacts as terms, we have built a retrieval system that is able to find conserved contacts in samples of the myoglobin fold family and to retrieve these proteins from among proteins of varied folds with a precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for chains similar to those of a specific PDB entry, view and compare their contact maps, and browse their structures using a Jmol plug-in.
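The "proteins as documents, contacts as terms" idea can be illustrated with a generic LSI sketch. This is not the authors' implementation: the toy matrix, the function names (`lsi_signatures`, `cosine_rank`), the binary weighting, and the choice of rank are all assumptions made for illustration, and the abstract does not say how contacts are encoded or weighted.

```python
import numpy as np


def lsi_signatures(term_doc, rank):
    """Return one rank-k signature vector per document (protein).

    term_doc: 2-D array, rows = contact "terms", columns = proteins.
    """
    # Thin SVD: term_doc = U @ diag(s) @ Vt
    _, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    k = min(rank, len(s))
    # Columns of diag(s_k) @ Vt_k are the documents in the latent space;
    # transpose so that each row is one protein's signature.
    return (s[:k, None] * vt[:k, :]).T


def cosine_rank(signatures, query_index):
    """Rank all proteins by cosine similarity to the query protein."""
    norms = np.linalg.norm(signatures, axis=1)
    norms[norms == 0] = 1.0  # avoid division by zero for empty signatures
    unit = signatures / norms[:, None]
    scores = unit @ unit[query_index]
    order = np.argsort(-scores)  # most similar first
    return order, scores


# Hypothetical toy data: 5 contacts (rows) x 4 proteins (columns).
# Proteins 0 and 1 share contacts A and B; proteins 2 and 3 share C and D.
toy = np.array([
    [1, 1, 0, 0],  # contact A
    [1, 1, 0, 0],  # contact B
    [0, 0, 1, 1],  # contact C
    [0, 0, 1, 1],  # contact D
    [0, 1, 0, 0],  # contact E, present only in protein 1
], dtype=float)

signatures = lsi_signatures(toy, rank=2)
order, scores = cosine_rank(signatures, query_index=0)
print(order, np.round(scores, 3))
```

In this toy case the rank-2 truncation discards the contact unique to protein 1, so proteins 0 and 1 end up with nearly parallel signatures while proteins 2 and 3 are orthogonal to them. A real contact matrix would be far larger and sparser, and the rank would have to be chosen empirically.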

Highlights

  • Proteins are an essential part of organisms and participate in every process within the cell

  • The sequence of amino acids in a protein is defined by the genetic code, and each residue is drawn from the set of 20 standard amino acids most commonly found in living organisms

  • In an effort to contribute to advances in solving this problem, we propose a novel methodology to classify a huge dataset of proteins into protein families using conserved characteristics in structures of known function

Introduction

Proteins are an essential part of organisms and participate in every process within the cell. They catalyze biochemical reactions that are vital to metabolism, have structural and mechanical functions, play a crucial role in cell signaling and adhesion, and are involved in immune responses (Nelson and Cox, 2005; Branden and Tooze, 1999). These macromolecules are organic compounds made of amino acids arranged in linear chains and joined by peptide bonds between the carboxyl and amino groups of adjacent amino acid residues. A number of genes responsible for diseases have been identified, but their specific functions are unknown.
