Abstract
An extant genome can be the descendant of an ancient polyploid genome. The genome aliquoting problem is to reconstruct the latter from the former such that the rearrangement distance (i.e., the number of genome rearrangements necessary to transform the former into the latter) is minimal. Although several heuristic algorithms have been published, we sought an improved algorithm for the problem with respect to the double cut and join (DCJ) distance. The new algorithm makes use of partial and contracted partial graphs, and locally minimizes the distance. Our tests on simulated data indicate that it reliably recovers the gene order of the ancestral polyploid genome even when the ancestor is ancient. We also compared our method with an earlier method on simulated data sets and found that our algorithm has higher accuracy. Vertebrates are known to have undergone two rounds of whole-genome duplication (2R-WGD) during their early evolution. We used the new algorithm to calculate the DCJ distance between three modern vertebrate genomes and their 2R-WGD ancestor and found that the rearrangement rate might have slowed down significantly since the 2R-WGD. The software AliquotG implementing the algorithm is available as an open-source package from our website (http://mosas.sysu.edu.cn/genome/download_softwares.php).
Highlights
Whole genome sequencing projects permit easy and accurate detection of genome rearrangement events by direct comparison of two genome sequences
In 1992, Sankoff proposed the use of the edit distance, defined as the minimum number of rearrangement events necessary to transform one genome into another
Double cut and join (DCJ) differs from other edit distances in that it includes chromosomal fusion, fission, inversion, translocation and block interchange within a single model and allows simpler algorithms for calculation
Summary
Whole genome sequencing projects permit easy and accurate detection of genome rearrangement events by direct comparison of two genome sequences. To measure these events, Sankoff proposed the edit distance in 1992, defined as the minimum number of rearrangement events necessary to transform one genome into another. Pevzner et al. introduced the reversal distance and developed a breakpoint graph-based, linear-time exact algorithm for its computation [2,3,4,5]. DCJ differs from other edit distances in that it includes chromosomal fusion, fission, inversion, translocation and block interchange within a single model and allows simpler algorithms for calculation.
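To illustrate why the DCJ model allows simple distance computation, the following is a minimal sketch, not the AliquotG implementation. It assumes the restricted case of two genomes that consist only of circular chromosomes and contain the same genes exactly once each. In that case the DCJ distance is N - C, where N is the number of genes and C is the number of cycles in the graph formed by overlaying the adjacencies of both genomes. The names `adjacency_map` and `dcj_distance_circular`, and the signed-integer encoding of genes, are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Extremity = Tuple[int, str]   # (gene id, "t" for tail or "h" for head)
Genome = List[List[int]]      # circular chromosomes as lists of signed genes


def adjacency_map(genome: Genome) -> Dict[Extremity, Extremity]:
    """Map every gene extremity to the extremity adjacent to it."""
    partner: Dict[Extremity, Extremity] = {}
    for chrom in genome:
        n = len(chrom)
        for i in range(n):
            a, b = chrom[i], chrom[(i + 1) % n]   # wrap around: circular
            # A gene read forwards (+g) runs tail -> head, reversed (-g) head -> tail.
            right_of_a = (abs(a), "h" if a > 0 else "t")
            left_of_b = (abs(b), "t" if b > 0 else "h")
            partner[right_of_a] = left_of_b
            partner[left_of_b] = right_of_a
    return partner


def dcj_distance_circular(genome_a: Genome, genome_b: Genome) -> int:
    """DCJ distance N - C for circular genomes on the same single-copy genes."""
    pa, pb = adjacency_map(genome_a), adjacency_map(genome_b)
    if set(pa) != set(pb):
        raise ValueError("genomes must contain the same genes")
    n_genes = len(pa) // 2            # two extremities per gene
    seen = set()
    cycles = 0
    for start in pa:
        if start in seen:
            continue
        cycles += 1
        x = start
        while True:                   # alternate A-edges and B-edges
            seen.add(x)
            y = pa[x]
            seen.add(y)
            x = pb[y]
            if x == start:
                break
    return n_genes - cycles


if __name__ == "__main__":
    # A single inversion of gene 2.
    print(dcj_distance_circular([[1, 2, 3]], [[1, -2, 3]]))
    # A fission of one circular chromosome into two.
    print(dcj_distance_circular([[1, 2, 3, 4]], [[1, 2], [3, 4]]))
```

Both example calls involve different event types (an inversion and a fission) yet are scored by the same cycle count, which is the sense in which DCJ covers several rearrangement operations within one model. Linear chromosomes and duplicated genes, as in the aliquoting problem itself, need additional handling that this sketch omits.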