Abstract

Genome sequencing projects reveal the sequences of all the proteins encoded in a genome. As a first step, homology detection is employed to obtain clues to the structure and function of these proteins. However, high evolutionary divergence between homologous proteins challenges our ability to detect distant relationships. In the past, an approach involving multiple Position Specific Scoring Matrices (PSSMs) was found to be more effective than traditional single PSSMs. Cascaded search is another successful approach, in which the hits of a search are themselves used as queries to detect more homologues. We propose a protocol, ‘Master Blaster’, which combines the principles adopted in these two approaches to further enhance our ability to detect remote homologues. The approach was assessed using known relationships available in the SCOP70 database, and the results were compared against those of PSI-BLAST and HHblits, a hidden Markov model-based method. Compared to PSI-BLAST, Master Blaster resulted in a 10% improvement in the detection of cross-superfamily connections, nearly 35% improvement in cross-family connections and more than 80% improvement in intra-family connections. The results show that HHblits is more sensitive than Master Blaster in detecting remote homologues. However, there are true hits from 46 folds for which Master Blaster reported homologues that HHblits did not report even with optimal parameters, indicating that using multiple methods that employ different approaches can be more effective in detecting remote homologues. Master Blaster stand-alone code is available for download in the supplementary archive.

Highlights

  • As a first step, homology detection is employed to obtain clues to the structure and function of the proteins encoded in a genome

  • This step represents the cascaded nature of the search process. Hits from this new set of searches are combined with the hits from the previous generation, and a new multiple sequence alignment and the corresponding multiple Position Specific Scoring Matrices (PSSMs) are generated using PSI-BLAST
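The cascade idea described above can be illustrated with a minimal sketch. It is not the Master Blaster implementation: `search_fn` is a hypothetical stand-in for a real homology search (for example, one PSI-BLAST run per query), the toy hit table is invented for illustration, and the step of rebuilding the multiple sequence alignment and PSSMs after each generation is omitted.

```python
from typing import Callable, Dict, Iterable, Set


def cascaded_search(
    query: str,
    search_fn: Callable[[str], Iterable[str]],
    max_generations: int = 3,
) -> Dict[str, int]:
    """Map every sequence id reached from `query` to the generation in
    which it was first found (the query itself is generation 0)."""
    found: Dict[str, int] = {query: 0}
    frontier: Set[str] = {query}
    for generation in range(1, max_generations + 1):
        next_frontier: Set[str] = set()
        # Each hit from the previous generation becomes a new query.
        for seq_id in sorted(frontier):
            for hit in search_fn(seq_id):
                if hit not in found:
                    found[hit] = generation
                    next_frontier.add(hit)
        if not next_frontier:
            break  # no new hits, so the cascade has converged
        frontier = next_frontier
    return found


# Hypothetical toy data: each search only "sees" close relatives,
# so D is reachable from A only through intermediate hits.
TOY_HITS = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}


def toy_search(seq_id: str) -> Iterable[str]:
    """Stand-in for a real search tool; returns the ids of direct hits."""
    return TOY_HITS.get(seq_id, [])


result = cascaded_search("A", toy_search, max_generations=2)
print(result)
```

With two generations the toy query `A` reaches `B` and `C` but not `D`; allowing a third generation brings in `D`, which mirrors how intermediate sequences can link distantly related proteins. In practice a cascade also needs score or E-value thresholds to limit the spread of false positives, which this sketch leaves out.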

Summary

Introduction

Homology detection is employed to obtain clues to the structure and function of the proteins encoded in a genome. Master Blaster was assessed using known relationships available in the SCOP70 database, and the results were compared against those of PSI-BLAST and HHblits, a hidden Markov model-based method. The results show that HHblits is more sensitive than Master Blaster in detecting remote homologues. Nevertheless, Master Blaster reported homologues that HHblits did not report even with optimal parameters, indicating that using multiple methods that employ different approaches can be more effective in detecting remote homologues. While multiple experimental studies are required for a detailed and complete understanding of the molecular and mechanistic basis of protein action and regulation, computational approaches can help to arrive at reasonable initial ideas on the functions, structures and other features of p
