In the post-genomic era, the volume of public sequence databases is increasing exponentially, and visualisation-centric techniques have become increasingly important in biological sequence analysis and annotation. In this paper, we present a methodology called dynamic visual data mining (DVDM), which combines biological object modelling, interactive display, and data analysis tools in a single integrated platform. We implemented DVDM in SeqVISTA, an object-oriented software tool developed with Java Development Kit v1.4. To illustrate the application of SeqVISTA, we show the following examples: regular expression pattern matching; comparative analysis of alternative exon splicing patterns; Fourier analysis; and exon prediction (MZEF and GENSCAN). Overall, we argue that DVDM is an important technique for helping biologists uncover the information hidden in large genomic and proteomic databases, and that SeqVISTA provides a versatile tool that integrates multiple computational algorithms to meet biologists' data mining needs.
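As a rough illustration of the first example listed above, regular expression pattern matching on a sequence can be sketched with the standard `java.util.regex` package. This is a minimal sketch under stated assumptions and not SeqVISTA's actual code: the class name `MotifScan`, the method `findMotif`, the sample sequence, and the TATA-like pattern are all hypothetical, and the code uses modern Java syntax rather than JDK 1.4.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of SeqVISTA: scans a sequence for a motif
// written as a regular expression.
public class MotifScan {

    // Returns the 0-based start position of every match, including
    // overlapping ones, ignoring letter case.
    public static List<Integer> findMotif(String sequence, String regex) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(sequence);
        List<Integer> starts = new ArrayList<>();
        int from = 0;
        // Restart the search one position after each hit so that
        // overlapping occurrences are reported as well.
        while (from <= sequence.length() && matcher.find(from)) {
            starts.add(matcher.start());
            from = matcher.start() + 1;
        }
        return starts;
    }

    public static void main(String[] args) {
        // Made-up sequence and an assumed TATA-box-like pattern.
        String dna = "GGTATAAAGGCTATATAAGC";
        String tataLike = "TATA[AT]A[AT]";
        System.out.println(findMotif(dna, tataLike)); // prints [11]
    }
}
```

Returning start positions rather than matched substrings keeps the result easy to map onto a graphical sequence display, which is the kind of use the abstract describes; a visual tool would typically highlight each reported position on the sequence view.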