Abstract

For the past 20 years, the recombination detection program (RDP) project has focused on the development of a fast, flexible, and easy-to-use Windows-based recombination analysis tool. Whereas previous versions of this tool relied on considerable user-mediated verification of detected recombination events, the latest iteration, RDP5, is automated enough that it can be integrated within analysis pipelines and run without any user input. The main innovation enabling this degree of automation is the implementation of statistical tests that identify recombination signals which could be attributable to evolutionary processes other than recombination. The additional analysis time required for these tests has been offset by algorithmic improvements throughout the program such that, relative to RDP4, RDP5 still runs up to five times faster and can analyze alignments containing twice as many sequences (up to 5000) that are five times longer (up to 50 million sites). For users wanting to remove signals of recombination from their datasets before using them for downstream phylogenetics-based molecular evolution analyses, RDP5 can disassemble detected recombinant sequences into their constituent parts and output a variety of recombination-free datasets in an array of alignment formats. For users who are interested in exploring the recombination history of their datasets, all the manual verification, data management, and data visualization components of RDP5 have been extensively updated to minimize the time users need to verify and refine the program’s interpretation of each recombination event that it detects.

Introduction

Recombination and genome component reassortment are processes that strongly impact the evolution of many virus species. Successive versions of RDP have applied an expanding array of methods for detecting recombination events, demarcating recombination breakpoints, and identifying recombinant sequences, all applied in unison, to yield detailed descriptions of how recombination may have impacted the evolution of any given set of aligned nucleotide sequences (Martin et al. 2015). The accuracy of these descriptions frequently depended on the amount of effort users were willing to put into exploring the many plausible ways in which the detected patterns of recombination may have arisen. A guiding principle during the development of RDP5, the latest version of the RDP series, has been to minimize the amount of time that users need to invest in detecting and removing signals of recombination from nucleotide sequence datasets.
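To make the idea of removing recombination signals concrete, the following Python sketch shows one simple way a recombinant sequence could be disassembled into its constituent parts once an event has been detected. It is an illustration only and is not RDP5's implementation: the event format (sequence name, start, end), the function name `disassemble`, and the `_event` naming of the split-off fragments are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical event record (an assumption for this sketch, not an RDP5 format):
# (name of the recombinant sequence, region start, region end),
# using 0-based alignment columns with an exclusive end.
Event = Tuple[str, int, int]


def disassemble(alignment: Dict[str, str],
                events: List[Event],
                gap: str = "-") -> Dict[str, str]:
    """Split each recombinant sequence into two gap-padded parts.

    The original entry keeps everything outside the recombinant region,
    and a new entry holds only the recombinant region. Both stay the
    same length as the alignment, so columns remain comparable.
    """
    result = dict(alignment)
    for idx, (name, start, end) in enumerate(events, start=1):
        if name not in result:
            raise KeyError(f"sequence {name!r} is not in the alignment")
        seq = result[name]
        # Clamp the region to the sequence bounds.
        start, end = max(0, start), min(len(seq), end)
        if start >= end:
            continue  # empty region: nothing to split off
        # Part 1: the sequence with the recombinant region masked by gaps.
        result[name] = seq[:start] + gap * (end - start) + seq[end:]
        # Part 2: only the recombinant region, gap-padded on both sides.
        result[f"{name}_event{idx}"] = (
            gap * start + seq[start:end] + gap * (len(seq) - end)
        )
    return result


if __name__ == "__main__":
    toy_alignment = {"A": "ACGTACGT", "B": "ACGTTCGT"}
    toy_events = [("B", 2, 5)]  # columns 2..4 of B treated as recombinant
    for seq_name, seq in disassemble(toy_alignment, toy_events).items():
        print(f">{seq_name}\n{seq}")
```

Padding both parts with gap characters, rather than trimming them, keeps every sequence aligned to the same columns, so the resulting dataset can be passed to downstream phylogenetic tools without realignment.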

Generation of recombination-free datasets
Query vs reference scans for recombination
Automated sequence annotation
Detection of potential false-positive recombination signals
RDP5CL: a command-line version of RDP5
Improved computational performance
Operational limits