Abstract

In this paper, we present an intelligent system for comparative searching of genomic sequences that departs from traditional sequence alignment of nucleic residues. Instead, we use the composition vector method, which exploits pattern structure in sequences, together with indexing techniques to build a genomic database of prokaryotic organisms and their phylogenetic relationships. For the structural analysis of prokaryotic patterns, we use the composition vector to express fuzzy sequence pattern queries on genomic data that would be difficult to represent in traditional database technology. B.L. Hao and his group have used the composition vector method to construct a phylogenetic tree of prokaryotes and thereby study their evolutionary history. The method counts the frequency of each nucleotide string of a fixed length K in the collection of gene sequences of each species, and so transforms variable-length sequences into a fixed-length vector. In addition to elaborating on the composition vector method, we describe the sequence pattern queries and the implementation with its rationale, and close with a discussion intended to prompt further work in this direction.
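To make the counting step concrete, the following is a minimal sketch of how a fixed-length vector could be derived from variable-length sequences. It is an illustration under stated assumptions, not the authors' implementation: the function name `composition_vector`, the four-letter DNA alphabet, the skipping of windows containing ambiguous symbols, and the use of raw relative frequencies are all choices made here for the example. Any normalization or background correction applied in the actual method is not shown.

```python
from collections import Counter
from itertools import product

# Assumption: plain DNA alphabet; the paper's exact alphabet is not specified here.
ALPHABET = "ACGT"


def composition_vector(sequences, k):
    """Return relative frequencies of all length-k strings over ALPHABET.

    `sequences` is an iterable of strings (e.g. the gene sequences of one
    species). The result always has len(ALPHABET) ** k entries, ordered
    lexicographically, regardless of how long the input sequences are.
    """
    counts = Counter()
    total = 0
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of width k over the sequence.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows with symbols outside the alphabet (e.g. 'N').
            if all(ch in ALPHABET for ch in kmer):
                counts[kmer] += 1
                total += 1

    keys = ["".join(p) for p in product(ALPHABET, repeat=k)]
    if total == 0:
        return [0.0] * len(keys)
    return [counts[key] / total for key in keys]


# Two inputs of different lengths map to vectors of the same length (4 ** 2 = 16).
short_vec = composition_vector(["ACGT"], 2)
long_vec = composition_vector(["ACGTACGTTTGA", "GGCATN"], 2)
```

Because both vectors have the same dimension, they can be compared with an ordinary vector distance or similarity measure, which is what makes alignment-free comparison across species possible.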
