Abstract

Phylogenetics is one of the main research disciplines in evolutionary biology. It relies on comparative data, mainly DNA sequences or raw sequencing reads. Alignment-based and alignment-free comparison are the two main approaches to computing sequence similarity, and both are used to estimate the genetic relatedness of different species. Alignment-based methods are relatively complex and become computationally challenging as genome size grows, for example with mammalian datasets and complex metagenomic communities. Moreover, they show poor accuracy in certain genetic comparisons because of misalignments and algorithmic tolerances. Alignment-free methods address most of these challenges and perform much better in genetic distance computation. In this paper, we propose a novel alignment-free, pairwise distance calculation method based on k-mers. The method converts long DNA sequences into simplified k-mer forest structures, which makes comparison more convenient. Further, we use a specialized tree pruning approach that reduces tree comparison time considerably compared to other alignment-free methods.
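
To make the idea of a k-mer based, alignment-free pairwise distance concrete, the following is a minimal sketch in Python. It is not the method proposed in the paper: the abstract does not specify the forest layout, the pruning rule, or the distance formula, so everything here is an assumption made for illustration. The sketch stores k-mers in nested-dictionary prefix trees (one tree per leading base), skips any branch that is missing from either sequence during comparison, and uses the Jaccard distance as a stand-in measure. The function names `kmers`, `build_forest`, `shared_leaves`, and `kmer_distance` are hypothetical.

```python
def kmers(seq, k):
    """Return the set of length-k substrings of seq that contain only A, C, G, T."""
    seq = seq.upper()
    valid = set("ACGT")
    return {
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    }


def build_forest(kmer_set):
    """Store k-mers as nested dicts: one prefix tree per leading base (assumed layout)."""
    forest = {}
    for kmer in kmer_set:
        node = forest
        for base in kmer:
            node = node.setdefault(base, {})
    return forest


def shared_leaves(a, b):
    """Count k-mers present in both trees.

    Only branches that exist in both trees are followed, so a subtree
    missing from either side is skipped without being visited.
    """
    if not a and not b:
        # Both paths ended at the same depth: one shared k-mer.
        return 1
    return sum(shared_leaves(a[base], b[base]) for base in a.keys() & b.keys())


def kmer_distance(seq_a, seq_b, k):
    """Jaccard distance between the k-mer sets of two sequences (assumed measure)."""
    set_a, set_b = kmers(seq_a, k), kmers(seq_b, k)
    if not set_a and not set_b:
        return 0.0
    if not set_a or not set_b:
        return 1.0
    shared = shared_leaves(build_forest(set_a), build_forest(set_b))
    total = len(set_a) + len(set_b) - shared
    return 1.0 - shared / total


if __name__ == "__main__":
    print(kmer_distance("ACGTACGT", "ACGTACGA", 3))
```

Identical sequences give a distance of 0.0 and sequences with no k-mer in common give 1.0. The choice of k trades sensitivity against specificity, and a suitable value depends on genome size and divergence.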
