Abstract

Background

Genome-wide and cross-species comparison of amino acid repeats is an intriguing problem in biology, mainly because of the highly polymorphic nature and diverse functions of these repeats. Innate protein repeats constitute vital functional and structural regions in proteins. Repeats are of great consequence in the evolution of proteins, as is evident from analyses of repeats in different organisms. In the post-genomic era, the availability of protein sequences encoded in different genomes provides a unique opportunity to perform large-scale comparative studies of amino acid repeats. ProtRepeatsDB is a relational database of perfect and mismatch repeats, designed as a resource and collection of tools for the detection and cross-species comparison of different types of amino acid repeats.

Description

ProtRepeatsDB (v1.2) consists of perfect as well as mismatch amino acid repeats in the protein sequences of 141 organisms whose genomes are now available. The web interface of ProtRepeatsDB provides tools to perform repeat searches based on protein IDs, organism name, repeat sequences, keywords as in FASTA headers, size, frequency, gene ontology (GO) annotation IDs and regular expressions (REGEXP) describing repeats. These tools also allow the formulation of a variety of simple, complex and logical queries to facilitate mining and large-scale cross-species comparison of amino acid repeats. In addition, the database contains sequence analysis tools to determine repeats in user-supplied sequences.

Conclusion

ProtRepeatsDB is a multi-organism database of different types of amino acid repeats present in proteins. It integrates useful tools to perform genome-wide queries for rapid screening and identification of amino acid repeats, and it facilitates comparative and evolutionary studies of the repeats. The database is useful for the identification of species- or organism-specific repeat markers, interspecies variations and polymorphism.
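To make the idea of describing repeats with regular expressions concrete, the following is a minimal sketch of how perfect tandem repeats (homopeptides such as QQQQQ and heteropeptides such as SPSPSP) might be located in a user-supplied sequence. It is not the algorithm used by ProtRepeatsDB; the function name `find_perfect_repeats` and the thresholds `max_unit` and `min_copies` are assumptions chosen for illustration, and mismatch repeats are not handled.

```python
import re


def find_perfect_repeats(sequence, max_unit=4, min_copies=3):
    """Return (start, unit, copies) for each perfect tandem repeat.

    Hypothetical helper: max_unit and min_copies are illustrative
    thresholds, not values taken from ProtRepeatsDB.
    """
    # The unit is matched lazily so the shortest repeating unit is
    # reported (e.g. "Q" rather than "QQ"); the backreference then
    # requires at least min_copies - 1 further exact copies.
    pattern = re.compile(r"([A-Z]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for match in pattern.finditer(sequence):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    # Made-up sequence containing one homopeptide and one heteropeptide.
    for start, unit, copies in find_perfect_repeats("MAQQQQQKLSPSPSPGT"):
        print(start, unit, copies)
```

Start positions are zero-based. A backreference can only express exact copies, so a search that tolerates substitutions would need a different approach, such as alignment-based scoring.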

Highlights

  • Genome-wide and cross-species comparison of amino acid repeats is an intriguing problem in biology, mainly because of the highly polymorphic nature and diverse functions of these repeats

  • Links are provided to retrieve the sequence in FASTA format, perform repeat analysis using PROSPERO [25], DOTMATCHER [27], PfScan and BLAST [28] against ProtRepeatsDB

  • Using the tools to search the PROSITE repeats section, we investigated bacterial proteins with the tetratricopeptide repeat (TPR), a structural repeat motif present in a wide range of proteins [34,35] and believed to be involved in protein-protein interactions and the assembly of multiprotein complexes [36,37]

Summary

Conclusion

ProtRepeatsDB is a multi-organism database of protein repeats and the first database of its kind to incorporate different kinds of repeats, viz. perfect repeats (homopeptides and heteropeptides), mismatch repeats and profile patterns representing different families of repeats. The current version (v1.2) consists of 120686 perfect repeats, 834621 mismatch repeats and 3673 profile repeats from 894890 protein sequences belonging to 141 genomes. The web interface of ProtRepeatsDB provides unique tools that allow the formulation of queries for retrieval and cross-species comparison of repeats, and it is supported by PERL and PHP scripts that run these queries against the database. MKK, SD, GR and DG developed the MySQL database, web interface and related PHP scripts. ProtRepeatsDB will be regularly updated with protein repeat sequences from newly annotated sequences emerging from various genome sequencing projects. ProtRepeatsDB will be developed further to include cross-links with other databases, repeats detected by other repeat-finding algorithms, 3-dimensional structures of repeat proteins, web-based repeat-finding servers, tools for phylogenetic analysis and ortholog-based search for comparative analysis of repeats.
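As a rough illustration of the kind of cross-species and REGEXP queries a relational repeat database supports, here is a small sketch. The actual ProtRepeatsDB schema is not given in the text, so the `repeats` table, its columns and all rows below are hypothetical, and an in-memory SQLite database stands in for MySQL so that the example is self-contained.

```python
import re
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical repeats table."""
    conn = sqlite3.connect(":memory:")
    # SQLite has no built-in REGEXP; "X REGEXP Y" calls regexp(Y, X),
    # so the pattern arrives as the first argument.
    conn.create_function(
        "regexp", 2,
        lambda pattern, value: int(re.search(pattern, value) is not None),
    )
    conn.execute(
        "CREATE TABLE repeats ("
        "organism TEXT, protein_id TEXT, unit TEXT, copies INTEGER)"
    )
    rows = [  # made-up rows for illustration only
        ("Organism A", "protA1", "Q", 12),
        ("Organism A", "protA2", "SP", 4),
        ("Organism B", "protB1", "Q", 7),
        ("Organism B", "protB2", "GA", 5),
    ]
    conn.executemany("INSERT INTO repeats VALUES (?, ?, ?, ?)", rows)
    return conn


def shared_units(conn, org_a, org_b):
    """Repeat units present in both organisms (self-join on unit)."""
    sql = """
        SELECT DISTINCT a.unit
        FROM repeats a JOIN repeats b ON a.unit = b.unit
        WHERE a.organism = ? AND b.organism = ?
        ORDER BY a.unit
    """
    return [row[0] for row in conn.execute(sql, (org_a, org_b))]


def units_matching(conn, pattern, min_copies):
    """Repeats whose unit matches a regular expression and a copy threshold."""
    sql = """
        SELECT organism, protein_id, unit, copies
        FROM repeats
        WHERE unit REGEXP ? AND copies >= ?
        ORDER BY protein_id
    """
    return conn.execute(sql, (pattern, min_copies)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(shared_units(conn, "Organism A", "Organism B"))
    print(units_matching(conn, "^[QN]$", 5))
```

Swapping `JOIN` for a `NOT EXISTS` subquery would give the complementary query, units found in one organism but absent from another, which is the shape of a species-specific marker search.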

25. Mott R
46. Williamson MP