Abstract

Modern sequencing techniques have provided a wealth of data on DNA sequences, making the analysis and comparison of sequences an important but difficult task. In this paper, by regarding each dinucleotide as a 2-combination of the multiset { ∞ · A , ∞ · G , ∞ · C , ∞ · T }, a novel 3-D graphical representation of a DNA sequence is proposed, and its projections onto the (x,y), (y,z) and (x,z) planes are also discussed. In addition, based on the idea of a “piecewise function”, a cell-based descriptor vector is constructed to numerically characterize the DNA sequence. The utility of our approach is illustrated by phylogenetic analysis on four datasets.
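To make the multiset idea concrete, the following minimal sketch enumerates the 2-combinations of { ∞ · A , ∞ · G , ∞ · C , ∞ · T } and maps each dinucleotide of a sequence to one of them. The function names, the use of overlapping dinucleotides, and the canonical ordering A, G, C, T are assumptions made for illustration; the abstract does not specify how the paper orders or scans dinucleotides, and the 3-D coordinates themselves are not reproduced here.

```python
from itertools import combinations_with_replacement

# Base order copied from the multiset as written in the abstract (an assumption
# about canonical ordering, used only to name each unordered pair consistently).
BASES = "AGCT"


def two_combinations(bases=BASES):
    """Return all unordered pairs, repetition allowed, drawn from `bases`."""
    return ["".join(pair) for pair in combinations_with_replacement(bases, 2)]


def dinucleotide_classes(sequence, bases=BASES):
    """Map each overlapping dinucleotide to its unordered pair (e.g. GA -> AG).

    Scanning overlapping dinucleotides is an illustrative choice, not a
    detail confirmed by the source.
    """
    rank = {base: index for index, base in enumerate(bases)}
    classes = []
    for first, second in zip(sequence, sequence[1:]):
        pair = sorted((first, second), key=rank.__getitem__)
        classes.append("".join(pair))
    return classes


if __name__ == "__main__":
    print(two_combinations())
    print(dinucleotide_classes("GATC"))
```

Four bases yield ten such 2-combinations, so under this reading the sixteen ordered dinucleotides collapse into ten unordered classes.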

Highlights

  • The rapid development of DNA sequencing techniques has resulted in explosive growth in the number of DNA primary sequences, and the analysis and comparison of biological sequences has become a topic of considerable interest in Computational Biology and Bioinformatics

  • In this paper, based on all of the 2-combinations of the multiset { ∞ · A , ∞ · G , ∞ · C , ∞ · T }, we propose a novel graphical representation of DNA sequences

  • Based on the idea of a “piecewise function”, we describe a particular scheme that transforms the graphical representation of DNA into a cell-based descriptor vector

Summary

Introduction

The rapid development of DNA sequencing techniques has resulted in explosive growth in the number of DNA primary sequences, and the analysis and comparison of biological sequences has become a topic of considerable interest in Computational Biology and Bioinformatics. The traditional approach to similarity analysis of DNA sequences is based on multiple sequence alignment, which uses dynamic programming techniques to identify the globally optimal alignment. Many alignment-free approaches to sequence comparison have also been proposed. The basic idea behind most alignment-free methods is to characterize DNA by mathematical models derived from the sequence, rather than by comparing the sequences themselves directly. By assigning the four directions defined by the positive and negative x and y coordinate axes to the four nucleic acid bases, Gates [3], Nandy [4,5], and Leong and Morgenthaler [6] introduced three different 2-D graphical representations.
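The axis-based 2-D representations mentioned above can be sketched as a walk in the plane, where each base moves the current point one unit along its assigned direction. The mapping `STEPS` and the function `walk_2d` below are hypothetical and chosen only for illustration; the three cited representations differ in which base is assigned to which axis direction, and none of their specific assignments is claimed here.

```python
# Hypothetical base-to-direction assignment (illustrative only; not taken
# from Gates, Nandy, or Leong and Morgenthaler).
STEPS = {
    "A": (-1, 0),  # negative x
    "G": (1, 0),   # positive x
    "C": (0, 1),   # positive y
    "T": (0, -1),  # negative y
}


def walk_2d(sequence, steps=STEPS):
    """Return the list of (x, y) points visited, starting at the origin."""
    x, y = 0, 0
    points = [(x, y)]
    for base in sequence.upper():
        dx, dy = steps[base]  # raises KeyError for symbols outside A, G, C, T
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in walk_2d("GGCAT"):
        print(point)
```

With any such assignment, opposite steps cancel, so a walk can retrace its own path and distinct sequences can produce overlapping curves.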

