Abstract

In this paper, we present an intelligent system for comparative searching of genomic sequences that departs from traditional alignment of nucleotide or residue sequences. Instead, we use the composition vector method, which exploits pattern structure in sequences, together with indexing techniques, to build a genomic database of prokaryotic organisms and their phylogenetic relationships. For the structural analysis of prokaryotic patterns, we use the composition vector to express various fuzzy sequence pattern queries on genomic data that would be difficult to represent in traditional database technology. B.L. Hao and his group have used the composition vector method to construct a phylogenetic tree of prokaryotes and thereby study their evolutionary history. The method counts the frequency of nucleotide strings of a fixed length K in the collection of gene sequences of each species, and so transforms variable-length sequences into a fixed-length vector. In addition to elaborating on the composition vector method, we describe the sequence pattern queries and the implementation with its rationale, and we close with a discussion intended to prompt further ideas for extending this work.
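
The counting step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `composition_vector`, the four-letter alphabet `ACGT`, and the normalization to relative frequencies are all assumptions made for illustration. It performs plain frequency counting only and omits any further normalization or background correction the full method may apply.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # assumed nucleotide alphabet (illustrative)


def composition_vector(sequences, k):
    """Map a collection of sequences to a vector of 4**k relative K-string frequencies.

    Hypothetical helper: windows containing symbols outside ALPHABET are skipped.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of length k over the sequence.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if all(c in ALPHABET for c in kmer):
                counts[kmer] += 1

    # Fixed ordering of all possible K-strings, so every species
    # yields a vector of the same length regardless of sequence length.
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


if __name__ == "__main__":
    short = composition_vector(["ACGT"], 2)
    longer = composition_vector(["ACGTACGTAA", "GGCATT"], 2)
    print(len(short), len(longer))  # both vectors have the same length
```

Because the vector length depends only on K and the alphabet size, gene collections of very different sizes become directly comparable, which is what makes the representation convenient for indexing and for approximate pattern queries.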
