Abstract

In computational biology, comparing genome sequences across species is an essential task with broad applications in research. Because of their lengthy processing time and the large number of data sets involved, alignment-based approaches are often unsuitable and ineffective. Alignment-free techniques have therefore gained popularity for clustering sequences and inferring evolutionary relationships among species. In this paper, a Positional difference and Frequency (PdF) vector descriptor based on a complete bipartite graph is introduced. The two parameters, positional difference and frequency, are applied to the di-nucleotide representation of a genome sequence to create a 16-dimensional vector descriptor. A distance matrix is then calculated from these descriptors and used to construct phylogenetic trees for different data sets of mammals and viruses. The resulting trees are compared with those produced by earlier methods, viz. the FFP method, the ClustalW method, the MEV method, the PCNV method and the FIS method. In most instances, the proposed method produces more precise results than the preceding techniques, and it shows potential for genome sequence comparison on data sets of both equal and unequal sequence length.

Communicated by Ramaswamy H. Sarma
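
The abstract does not give the exact PdF formula, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes that the 16 dimensions correspond to the 16 di-nucleotides over {A, C, G, T} (the edges of a complete bipartite graph between first and second nucleotide), that "frequency" is the relative count of each di-nucleotide, that "positional difference" is the mean gap between its successive occurrences, and that the two are combined by a length-normalised product. The names `pdf_like_descriptor` and `distance_matrix`, and the choice of Euclidean distance, are hypothetical.

```python
from itertools import product
import math

# The 16 di-nucleotides: AA, AC, AG, AT, CA, ..., TT.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def pdf_like_descriptor(seq):
    """Return a 16-dimensional vector for one sequence.

    ASSUMPTION: each component is frequency * (mean positional gap / length);
    the paper's actual PdF definition may differ.
    """
    seq = seq.upper()
    positions = {d: [] for d in DINUCLEOTIDES}
    # Record the start index of every overlapping di-nucleotide.
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in positions:  # skips pairs containing N or other symbols
            positions[pair].append(i)

    total = max(len(seq) - 1, 1)  # number of di-nucleotide windows
    vector = []
    for d in DINUCLEOTIDES:
        pos = positions[d]
        freq = len(pos) / total
        gaps = [b - a for a, b in zip(pos, pos[1:])]
        mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
        # Dividing by `total` keeps sequences of unequal length comparable.
        vector.append(freq * mean_gap / total)
    return vector


def distance_matrix(seqs):
    """Pairwise Euclidean distances between descriptors (assumed metric)."""
    vecs = [pdf_like_descriptor(s) for s in seqs]
    return [[math.dist(u, v) for v in vecs] for u in vecs]
```

A matrix produced this way could then be passed to a standard tree-building routine such as UPGMA or neighbour joining; the abstract does not state which one the authors used.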

