Abstract

BackgroundIt is increasingly evident that there are multiple and overlapping patterns within the genome, and that these patterns contain different types of information - regarding both genome function and genome history. In order to discover additional genomic patterns which may have biological significance, novel strategies are required. To partially address this need, we introduce a new data visualization tool entitled Skittle.ResultsThis program first creates a 2-dimensional nucleotide display by assigning four colors to the four nucleotides, and then text-wraps to a user adjustable width. This nucleotide display is accompanied by a "repeat map" which comprehensively displays all local repeating units, based upon analysis of all possible local alignments. Skittle includes a smooth-zooming interface which allows the user to analyze genomic patterns at any scale.Skittle is especially useful in identifying and analyzing tandem repeats, including repeats not normally detectable by other methods. However, Skittle is also more generally useful for analysis of any genomic data, allowing users to correlate published annotations and observable visual patterns, and allowing for sequence and construct quality control.ConclusionsPreliminary observations using Skittle reveal intriguing genomic patterns not otherwise obvious, including structured variations inside tandem repeats. The striking visual patterns revealed by Skittle appear to be useful for hypothesis development, and have already led the authors to theorize that imperfect tandem repeats could act as information carriers, and may form tertiary structures within the interphase nucleus.

Highlights

  • It is increasingly evident that there are multiple and overlapping patterns within the genome, and that these patterns contain different types of information - regarding both genome function and genome history

  • The striking visual patterns revealed by Skittle appear to be useful for hypothesis development, and have already led the authors to theorize that imperfect tandem repeats could act as information carriers, and may form tertiary structures within the interphase nucleus

  • 1) Detecting Patterns of Nucleotide Composition Using the Nucleotide Display, it is immediately obvious that most genomes are made up of regions strongly dominated by specific nucleotides, and this is seen at all scales

Read more

Summary

Introduction

It is increasingly evident that there are multiple and overlapping patterns within the genome, and that these patterns contain different types of information - regarding both genome function and genome history. In order to discover additional genomic patterns which may have biological significance, novel strategies are required To partially address this need, we introduce a new data visualization tool entitled Skittle. Recent discoveries are changing our appreciation of the complexity of genomic information This includes strong conservation in non-coding regions, large scale transcription from areas previously considered to be non-functional DNA, bidirectional promoters, and alternative splicing patterns [1]. SpectroFish [7] is in some ways similar to Skittle, but focuses on Fourier analysis While it is adept at identifying open reading frames, it lacks a direct visualization of the raw data and users miss other interesting features (e.g., nucleotide bias). The view was fixed at a width of 3500 nucleotides This meant that, while they properly visualized the changing nucleotide bias, the only repeats that were recognizable were factors of 3500. Skittle expands upon this functionality by implementing a variable width so that all tandem repeats can be visualized

Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call