Abstract

Amino acid repeats play an important role in the structure and function of proteins. Analyzing long repeats in protein sequences helps clarify their abundance, structure and function in the protein universe. In the present study, amino acid repeats of length >50 (long repeats) were identified in a non-redundant set of UniProt sequences using the RADAR program. The underlying structures and functions of these long repeats were analyzed using Gene3D for structural domains, Pfam for functional domains, and an enzyme and non-enzyme functional classification for the catalytic and binding functions of the proteins. From a structural perspective, these long repeats occur predominantly in certain architectures such as sandwich, bundle, barrel, and roll and, within these architectures, are abundant in the superfolds. The lengths of the repeats within each fold are not uniform, and the repeats exhibit different structures for different functions. We also observed that long repeats lie in the domain regions of the protein families and are involved in the function of the proteins. After grouping the proteins into enzyme and non-enzyme classes, we observed that long repeats are abundant in specific catalytic and binding functions. Beyond the well-characterized short tandem repeats, this study analyzes the occurrence of long repeats in the protein sequence universe and their structures and functions at the domain level. The results suggest that long repeats may play an important role in the structure and function of protein domains.

Highlights

  • Amino acid repeats are ubiquitous in protein sequences and often correspond to structural and functional units of proteins

  • The present analysis examines the occurrence of long repeats and their structures and functions at the domain level

  • The present large-scale study focuses on long repeats in a non-redundant set drawn from the entire annotated UniProtKB/Swiss-Prot database and reveals that long repeats are found in 23% of the proteins

Introduction

Amino acid repeats are ubiquitous in protein sequences and often correspond to structural and functional units of proteins. The length of these repeats varies considerably, from short homorepeats of a single amino acid (Jorda and Kajava, 2010), oligopeptide repeats of 2–20 residues (Fraser and MacRae, 1973) and solenoid repeats of 20–40 residues to larger repetitions of length >50, called domain repeats (Andrade et al., 2001). These repeats occur as a single pair or as multiple copies, in tandem or non-tandem arrangement, and are useful for structural packing or for one or more interactions with ligands (Katti et al., 2000; Luo and Nijveen, 2014). It has been found that long stretches of perfect repetitions are infrequent in protein
