Abstract

In the accompanying papers we showed that sequence errors in public databases, together with confusion of paralogs and epaktologs (proteins that are related only through the independent acquisition of the same domain types), significantly distort the picture that emerges from comparing the domain architecture (DA) of multidomain metazoan proteins, because they introduce a strong bias in favor of terminal over internal DA change. Whether terminal or internal DA changes occur with greater probability has important implications for the DA evolution of multidomain proteins: gene fusion can add domains only at terminal positions, whereas domain shuffling can insert domains at both internal and terminal positions. Consequently, an overestimate of terminal DA changes may be misinterpreted as evidence for a dominant role of gene fusion in DA evolution. In this manuscript we show that the authors of several recent studies of DA evolution in Metazoa used databases that are significantly contaminated with incomplete, abnormal and mispredicted sequences (e.g., UniProtKB/TrEMBL, EnsEMBL) and/or failed to separate paralogs and epaktologs, which explains why these studies concluded that gene fusion is the major mechanism for gains of new domains in metazoan proteins. In contrast, our studies of high-quality orthologous and paralogous Swiss-Prot sequences confirm that shuffling of mobile domains had a major role in the evolution of multidomain proteins of Metazoa, especially those formed in early vertebrates.
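
To make the terminal versus internal distinction concrete, the following is a minimal sketch, not the procedure used in these studies. It assumes each DA is represented as an ordered list of domain names (N-terminus first) and that the derived DA differs from the ancestral one by a single contiguous gain. The function name `gain_positions` and the domain labels are hypothetical. Because repeated domains can make the position of a gain ambiguous, the sketch returns every position consistent with the two architectures rather than guessing one.

```python
from typing import List, Set


def gain_positions(ancestral: List[str], derived: List[str]) -> Set[str]:
    """Return every way a single contiguous domain gain could explain
    the difference between two domain architectures.

    Both arguments are ordered lists of domain names, N-terminus first.
    The result is a subset of {"N-terminal", "internal", "C-terminal"};
    it is empty when `derived` is not `ancestral` plus one inserted block.
    """
    extra = len(derived) - len(ancestral)
    if not ancestral or extra <= 0:
        return set()

    labels: Set[str] = set()
    # Try every insertion point i: the block derived[i:i + extra] is the gain,
    # and what lies on either side of it must reproduce the ancestral DA.
    for i in range(len(ancestral) + 1):
        left_matches = derived[:i] == ancestral[:i]
        right_matches = derived[i + extra:] == ancestral[i:]
        if left_matches and right_matches:
            if i == 0:
                labels.add("N-terminal")
            elif i == len(ancestral):
                labels.add("C-terminal")
            else:
                labels.add("internal")
    return labels


if __name__ == "__main__":
    # Hypothetical architectures with placeholder domain names.
    print(gain_positions(["A", "C"], ["A", "C", "D"]))   # gain after the last domain
    print(gain_positions(["A", "C"], ["A", "B", "C"]))   # gain between two domains
    print(gain_positions(["A", "B"], ["A", "A", "B"]))   # ambiguous: repeated domain
```

In this representation a gene fusion can only ever produce the two terminal labels, whereas domain shuffling can produce all three. A truncated or mispredicted sequence that has lost one end looks like a terminal difference when compared with its intact homolog, which is the kind of bias described above.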

Highlights

  • We have demonstrated that contamination of protein families with epaktologs increases the apparent rate of domain architecture (DA) change and introduces a strong bias in DA differences, inasmuch as it increases the proportion of terminal relative to internal DA differences

  • In view of our observation that sequence errors and confusion of epaktologs with other types of homologs significantly distort the evolutionary history of the DA of multidomain proteins, it is important to re-examine the conclusions of earlier studies that neglected the influence of these errors

  • There is a general consensus that the rate of formation of new DAs is significantly higher in Metazoa than in prokaryotes or other eukaryotes [25], so it is even more surprising that this increase in the rate of DA evolution is not reflected in a shift in favor of internal DA changes

Introduction

These authors analyzed the whole Swiss-Prot/TrEMBL set of proteins and concluded that DA changes occur most frequently at termini, which in turn led them to conclude that “these results have further supported the emerging view that, by and large, the modular evolution of proteins is dominated by two major types of events: fusion, on the one hand, and deletion and fission on the other”. Buljan and Bateman [27] studied domain architecture evolution in animal gene families represented in the UniProt (Swiss-Prot plus TrEMBL) database and concluded that gain and loss of domains are preferred at protein termini.
