Abstract

Sequence alignment is a central research direction in bioinformatics and underpins much of modern biological research. Frequency chaos game representation (FCGR) maps a genome sequence to an image and stores rich genetic information in the resulting FCGR image. For each FCGR image, we construct a perceptual image hashing (PIH) matrix after rescaling the image with bicubic interpolation. We then compute the difference between the perceptual hash matrices of each pair of images and use it as the clustering distance between the two corresponding gene sequences. In this paper, we aligned and analyzed several typical genome sequence datasets, including mammalian mitochondrial genes, human immunodeficiency virus 1 (HIV-1), and hepatitis E virus (HEV), and built their evolutionary trees. Experimental results showed that our method combining FCGR with PIH (FCGR-PIH) achieves classification accuracy similar to that of the classical Clustal W sequence alignment method. Furthermore, 25 complete mitochondrial DNA sequences of cichlid fishes and 27 Escherichia coli/Shigella full genome sequences from the AFproject test platform were used for benchmarking. The resulting performance rankings demonstrate the effectiveness of the FCGR-PIH algorithm and its potential for large-scale genome sequence analysis.
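
The pipeline summarized above (sequence to FCGR image, image to hash matrix, hash difference to distance) can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions: the corner layout in `CORNER`, the k-mer length `k=4`, the 8x8 hash size, block averaging in place of the bicubic interpolation named in the abstract, a mean-threshold hash, and Hamming distance as the "difference" between hash matrices. All function names are hypothetical.

```python
# Assumed corner layout for the chaos game; the paper's layout may differ.
CORNER = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def fcgr(seq, k=4):
    """Count k-mers into a 2^k x 2^k grid (frequency chaos game representation)."""
    size = 1 << k
    grid = [[0] * size for _ in range(size)]
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in CORNER for base in kmer):
            continue  # skip k-mers with ambiguous bases such as N
        row = col = 0
        # The last base of the k-mer selects the coarsest quadrant,
        # so it is consumed first and becomes the most significant bit.
        for base in reversed(kmer):
            x, y = CORNER[base]
            row = (row << 1) | y
            col = (col << 1) | x
        grid[row][col] += 1
    return grid


def shrink(grid, out=8):
    """Downsample by block averaging (a simple stand-in for bicubic zooming)."""
    n = len(grid)
    if n < out or n % out:
        raise ValueError("grid size must be a multiple of the hash size")
    block = n // out
    return [
        [
            sum(grid[r * block + i][c * block + j]
                for i in range(block) for j in range(block)) / block ** 2
            for c in range(out)
        ]
        for r in range(out)
    ]


def hash_matrix(grid, out=8):
    """Binary hash matrix: 1 where the shrunken cell exceeds the mean, else 0."""
    small = shrink(grid, out)
    mean = sum(map(sum, small)) / (out * out)
    return [[1 if value > mean else 0 for value in row] for row in small]


def hash_distance(h1, h2):
    """Hamming distance: number of positions where two hash matrices differ."""
    return sum(a != b for r1, r2 in zip(h1, h2) for a, b in zip(r1, r2))
```

Computing `hash_distance` for every pair of sequences yields a distance matrix that could then be passed to a standard tree-building method; which clustering method the paper uses is not stated in the abstract.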

