Abstract

A tandem repeat is two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats occur frequently in the human genome. They have been shown to cause human disease, may play a variety of regulatory and evolutionary roles, and are important laboratory tools. Extensive knowledge about pattern sizes, copy number, mutational history, etc. for tandem repeats has been limited because of the difficulty of detecting them in genomic sequence data. In this paper, we present a new algorithm for finding tandem repeats in DNA sequences without the need to specify either the pattern or pattern size. The algorithm is based on the detection of k-tuple matches. It uses a probabilistic model of tandem repeats and a collection of statistical criteria based on that model. We demonstrate the algorithm’s speed and its ability to detect tandem repeats that have undergone extensive mutational change by analyzing four sequences in the 200 kb to 700 kb range.
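
To illustrate what detecting k-tuple matches can mean in practice, the following is a minimal sketch, not the algorithm presented in the paper. It assumes a simple scheme in which each k-tuple is compared with its most recent earlier occurrence and the distances between such matches are tallied; a distance that accumulates many matches hints at a candidate pattern size. The function name `ktuple_match_distances`, the choice `k=3`, and the toy sequence are hypothetical, and the probabilistic model and statistical criteria mentioned above are not reproduced here.

```python
from collections import Counter


def ktuple_match_distances(seq, k=3):
    """Tally distances between consecutive occurrences of identical k-tuples.

    Illustrative assumption only: a real detector would apply statistical
    criteria to these matches rather than read off the raw counts.
    """
    last_seen = {}          # k-tuple -> index of its most recent occurrence
    distances = Counter()   # match distance -> number of k-tuple matches
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            distances[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    return distances


# Toy input: six exact copies of a 5-nucleotide pattern.
seq = "ACGTT" * 6
hist = ktuple_match_distances(seq, k=3)
candidate_period, support = hist.most_common(1)[0]
print(candidate_period, support)
```

On this exact repeat every match falls at distance 5, the pattern size. With approximate copies, substitutions remove some matches and indels shift the match distance, which is why raw counts alone are not a sufficient criterion.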
