Abstract

BackgroundThe study of protein-small molecule interactions is vital for understanding protein function and for practical applications in drug discovery. To benefit from the rapidly increasing structural data, it is essential to improve the tools that enable large scale binding site prediction with greater emphasis on their biological validity.ResultsWe have developed a new method for the annotation of protein-small molecule binding sites, using inference by homology, which allows us to extend annotation onto protein sequences without experimental data available. To ensure biological relevance of binding sites, our method clusters similar binding sites found in homologous protein structures based on their sequence and structure conservation. Binding sites which appear evolutionarily conserved among non-redundant sets of homologous proteins are given higher priority. After binding sites are clustered, position specific score matrices (PSSMs) are constructed from the corresponding binding site alignments. Together with other measures, the PSSMs are subsequently used to rank binding sites to assess how well they match the query and to better gauge their biological relevance. The method also facilitates a succinct and informative representation of observed and inferred binding sites from homologs with known three-dimensional structures, thereby providing the means to analyze conservation and diversity of binding modes. Furthermore, the chemical properties of small molecules bound to the inferred binding sites can be used as a starting point in small molecule virtual screening. The method was validated by comparison to other binding site prediction methods and to a collection of manually curated binding site annotations. We show that our method achieves a sensitivity of 72% at predicting biologically relevant binding sites and can accurately discriminate those sites that bind biological small molecules from non-biological ones.ConclusionsA new algorithm has been developed to predict binding sites with high accuracy in terms of their biological validity. It also provides a common platform for function prediction, knowledge-based docking and for small molecule virtual screening. The method can be applied even for a query sequence without structure. The method is available at http://www.ncbi.nlm.nih.gov/Structure/ibis/ibis.cgi.

Highlights

  • The study of protein-small molecule interactions is vital for understanding protein function and for practical applications in drug discovery

  • We have developed a new database and method called "IBIS" (Inferred Biomolecular Interaction Server [53], http://www.ncbi.nlm.nih.gov/Structure/ibis/ ibis.html) which enables researchers to conveniently study biomolecular interactions that have been observed in protein structures and through inference by homology to formulate predictions/hypotheses for biomolecular interactions, even if the data for specific biomolecules is not available

  • Overall we found that a total of 31 proteins (70%) from the test set have at least one of their IBIS predicted sites overlap with Conserved Domain Database (CDD) binding site annotation

Read more

Summary

Introduction

The study of protein-small molecule interactions is vital for understanding protein function and for practical applications in drug discovery. The physical interactions between proteins and other molecules in protein crystal structures provide crucial insights into protein function. It is precisely these structures that enable researchers to study interactions in atomic detail, and find out, for example, how a specific mutation in a protein affects its function, or how a few atom modifications in a small molecule might lead to a more effective drug. Protein Data Bank), it is of great importance to improve the tools available for study of these interactions. It is possible to infer protein-small molecule interactions even if there are no crystal structures available for a particular protein of interest, as long as there are structures of sufficiently close homologs. Recent estimates suggest that the majority of Entrez Protein sequences have homologs with a known structure [1,2], thereby providing a reasonable chance to find relevant interactions via structures for protein sequences

Objectives
Methods
Results
Discussion
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.