Abstract

Given a set of nucleotide sequences, we consider the problem of identifying conserved substrings that occur in homologous genes across a large number of sequences. The problem is solved by identifying certain nodes in a suffix tree that contains all substrings occurring in the given nucleotide sequences. Because the targeted data set is large, our approach employs a truncated version of suffix trees. Two methods for this task are introduced: (1) the annotation-guided marker detection method uses gene annotations, which might contain a moderate number of errors; (2) the probability-based marker detection method determines sequences that appear significantly more often than expected. The approach is successfully applied to the mitochondrial nucleotide sequences and the corresponding annotations available in RefSeq for 2989 metazoan species. We demonstrate that the approach finds appropriate substrings.
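The idea of a depth-limited index over all substrings can be illustrated with a minimal sketch. It is not the method of the paper: it assumes a plain, uncompressed trie in place of a truncated suffix tree, and it selects substrings by a simple support threshold rather than by annotations or a significance test. The names `Node`, `build_truncated_trie`, `conserved_substrings`, `max_depth`, `min_len`, and `min_seqs` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, List, Set


class Node:
    """One node of a depth-limited generalized suffix trie (illustrative only)."""

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        # Indices of the input sequences that contain the substring
        # spelled by the path from the root to this node.
        self.seq_ids: Set[int] = set()


def build_truncated_trie(seqs: List[str], max_depth: int) -> Node:
    """Insert every suffix of every sequence, cut off after max_depth characters."""
    root = Node()
    for sid, seq in enumerate(seqs):
        for start in range(len(seq)):
            node = root
            for ch in seq[start:start + max_depth]:
                node = node.children.setdefault(ch, Node())
                node.seq_ids.add(sid)
    return root


def conserved_substrings(root: Node, min_len: int, min_seqs: int) -> Dict[str, int]:
    """Return substrings of length >= min_len found in at least min_seqs sequences."""
    found: Dict[str, int] = {}
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        for ch, child in node.children.items():
            # Extending a substring can never increase the number of
            # sequences containing it, so low-support branches are pruned.
            if len(child.seq_ids) < min_seqs:
                continue
            word = prefix + ch
            if len(word) >= min_len:
                found[word] = len(child.seq_ids)
            stack.append((child, word))
    return found


if __name__ == "__main__":
    # Toy input; real mitogenomes are about 16.5 k nucleotides long.
    toy = ["ACGTACGT", "TTACGTAA", "GGACGTCC", "ACCCCCCC"]
    trie = build_truncated_trie(toy, max_depth=4)
    print(conserved_substrings(trie, min_len=4, min_seqs=3))  # {'ACGT': 3}
```

Bounding the depth keeps memory proportional to the total input length times `max_depth`, which is the motivation for truncation on large data sets. A production implementation would more likely use a compressed suffix tree with edge labels stored as index ranges.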

Highlights

  • Mitochondria are organelles that fulfill vital functions in eukaryotic cells

  • The two marker detection methods introduced here use the generalized suffix tree data structure described in Subsection 2.1

  • The annotation-guided and probability-based marker detection methods were run on the 2989 mitochondrial genome sequences contained in RefSeq and on their reverse complements

Introduction

Mitochondria are organelles that fulfill vital functions in eukaryotic cells. They produce adenosine triphosphate, which is an important carrier of chemical energy. Mitochondria are thought to have their evolutionary origin in α-proteobacteria that were integrated into cells by endosymbiosis. Mitochondria inherited a genome (mitogenome) from their bacterial ancestors, which has been reduced dramatically in most lineages (see [1] for an overview). The typical metazoan mitogenome is circular and usually comprises approximately 16.5 k nucleotides [2]. Its gene content is nearly perfectly preserved and consists of 37 genes including
