Abstract

Although one would normally expect a regulatory element to perform best when it fully matches its consensus sequence, this is generally far from the case. Usually, almost none of the actual sites fit the consensus exactly, and some of those that do fit perform poorly. The main reason lies in the very nature of the sequences and the messages (codes) they contain. Normally, any stretch of sequence that harbors a regulatory site carries not only this regulatory message but also several other messages of various types. These messages overlap with the regulatory element in such a way that the letter (base) appearing at any given sequence position simultaneously belongs to one or more additional codes. Apart from numerous individual codes (sequence patterns) specific to a given species or gene, there are many different general (universal) sequence codes, all interacting with one another. These include the classical triplet code, the DNA shape code, the chromatin code, the gene splicing code, the modulation code and many more, including those that have not yet been discovered. Examples of overlapping codes and their interactions are discussed, as well as the role of code degeneracy and of sequence complexity as a function of code density.
