Abstract

The TATA box has been used successfully to identify a transcription start site (TSS) and thereby a promoter. Unfortunately, many substrings fit the profile of a TATA box without being one; all such candidate substrings are called putative TATA boxes. We apply linear and nonlinear classifiers to discriminate true TATA boxes from the other putative TATA boxes and compare their performance. We also investigate how the length of the pair of sequences flanking a putative TATA box affects prediction accuracy. The techniques presented in this paper are general enough to be applicable to other domains or to other genomes.
