Abstract

The DHPC (DNA Hilbert–Peano curve) is a new tool for visualizing large-scale genome sequences by mapping sequences into a two-dimensional square. It utilizes the space-filling function of Hilbert–Peano mapping. By applying a Gauss smoothing technique and a user-defined color function, a large-scale genome sequence can be mapped into a two-dimensional color image. In the calculated DHPCs, many genome characteristics are revealed. In this article we introduce the method and show how DHPCs may be used to identify regions of different base composition. The power of the method is demonstrated by presenting multiple examples such as repeating sequences, degree of base bias, regions of homogeneity and their boundaries, and mark of annotated segments. We also present several genome curves generated by DHPC to demonstrate how DHPC can be used to find previously unidentified sequence features in these genomes.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call