Abstract

Comparative genomics has proven a fruitful approach to acquire many functional and evolutionary insights into core cellular processes. Here it is argued that in order to perform accurate and interesting comparative genomics, one first and foremost has to be able to recognize, postulate, and revise different evolutionary scenarios. After all, these studies lack a simple protocol, due to different proteins having different evolutionary dynamics and demanding different approaches. The authors here discuss this challenge from a practical (what are the observations?) and conceptual (how do these indicate a specific evolutionary scenario?) viewpoint, with the aim to guide investigators who want to analyze the evolution of their protein(s) of interest. By sharing how the authors draft, test, and update such a scenario and how it directs their investigations, the authors hope to illuminate how to execute molecular evolution studies and how to interpret them. Also see the video abstract here https://youtu.be/VCt3l2pbdbQ.

Highlights

  • Comparative genomics has proven a fruitful approach to acquire many functional and evolutionary insights into core cellular processes

  • We argue that the most crucial yet overlooked skill is to be able to recognize, postulate, and revise different evolutionary scenarios, a process that is typically difficult to automate

  • We assume we search for orthologs in other eukaryotes and aim to infer the evolutionary events that happened since the last eukaryotic common ancestor (LECA)
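To make the last highlight concrete, the following is a minimal sketch of how ortholog presence and absence across a species tree can be turned into inferred loss events. It is not the authors' protocol. It assumes, for illustration only, that the protein was present in LECA and was never regained after a loss (a Dollo-style simplification), and that failure to detect an ortholog means true absence. The tree, the species names `A` to `F`, and the function names are hypothetical.

```python
def leaves(tree):
    """Return the set of species under a node; a leaf is a plain string."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def infer_losses(tree, present):
    """List the maximal clades without an ortholog; each is read as one loss.

    Assumption: the gene was present at the root (LECA) and never regained.
    """
    clade = leaves(tree)
    if not clade & present:
        return [sorted(clade)]   # the whole clade lacks the ortholog
    if isinstance(tree, str):
        return []                # this species has the ortholog
    losses = []
    for child in tree:
        losses.extend(infer_losses(child, present))
    return losses


# Toy rooted species tree as nested tuples (hypothetical topology).
tree = ((("A", "B"), "C"), ("D", ("E", "F")))
# Species in which an ortholog was detected (hypothetical observations).
present = {"A", "C", "D"}

losses = infer_losses(tree, present)
print(losses)  # [['B'], ['E', 'F']]
```

Under these assumptions the sketch reports two losses: one in species `B` and one in the ancestor of `E` and `F`. In practice an apparent absence may instead reflect an undetected, highly diverged ortholog or an incomplete genome, which is one reason such a scenario has to stay open to revision.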

Introduction

Comparative genomics has proven a fruitful approach to acquire many functional and evolutionary insights into core cellular processes. In collaboration with molecular biologists, we studied diverse eukaryotic proteins, ranging from chromatin remodelers and bZIP transcription factors to flagellum and kinetochore components.[2,33,34,35] In our experience, non‐bioinformaticians face various challenges regarding the use of tools, knowledge of genome evolution and of the species tree, and, importantly, how complicated evolutionary histories are reflected in a myriad of computational tools and sequence databases. Such challenges may cause flawed evolutionary inferences, which in turn may cause incorrect or incomplete function prediction.[30] We here detail principles that explain “best practices” from the field, while drawing attention to what we consider neglected. We argue that the most crucial yet overlooked skill is to be able to recognize, postulate, and revise different evolutionary scenarios, a process that is typically difficult to automate.

Inferring the Evolution of a Protein Means to Search for Its Orthologs
Collecting Observations to Generate an Initial Scenario
Making and Using Gene Trees under Different Evolutionary Scenarios
Challenging the Evolutionary Scenario
Multi‐Domain Proteins May Require Multiple Independent Analyses
When and How to Consider Gray Zone Hits
Compositionally Biased Proteins May Evolve Convergently
How Gene Trees May Be Incorrect
Mixed Scenarios Face Multiple Challenges
When Did My Protein Originate?
Conclusions
Conflict of Interest