Abstract

In multicellular organisms, development from zygote to adult and adaptation to different environmental stresses occur as cells acquire specialized roles by synthesizing the proteins necessary for each task. In eukaryotes, the most commonly used mechanism for maintaining the cellular protein environment is transcriptional regulation of gene expression, through recruitment of the required transcription factors to promoter regions. Owing to the importance of transcriptional regulation, one of the main goals of the post-genomic era is to predict the regulation of gene expression from the presence of transcription factor (TF) binding sites in promoter regions. Genome-wide knowledge of TF binding sites would be useful for building models of the transcriptional regulatory networks that result in cell-specific differentiation. In eukaryotic genomes, only a fraction (< 5%) of the total genome codes for functional proteins or RNA, while the remaining DNA consists of non-coding regulatory sequences, other regions, and sequences whose functions are still unknown. Since the discovery of trans-acting factors in gene regulation by Jacob and Monod in the lac operon of E. coli, scientists have been interested in finding new transcription factors and their specific recognition and binding sequences. In DNase footprinting (or DNase protection assay), regions bound by a transcription factor are protected from DNase digestion, creating a footprint in a sequencing gel. This methodology has resulted in the identification of hundreds of regulatory sequences. However, a limitation of this methodology is that it requires the TF and the promoter sequence (100-300 bp) in purified form. Our knowledge of transcription factors is limited, and recognition and binding sites are scattered over the complete genome. Therefore, in spite of its high degree of accuracy in identifying TF binding sites, this methodology is not suitable for genome-wide or cross-genome scanning.
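As a rough illustration of what sequence-based scanning for TF binding sites can look like, the Python sketch below searches a promoter sequence for matches to an IUPAC consensus motif on both strands. This is a minimal sketch under stated assumptions, not the method of the work summarized here: the function names, the example promoter, and the choice of a TATA-box-like consensus (`TATAWAW`) are all hypothetical and chosen only for illustration.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(promoter, consensus):
    """Find consensus matches on both strands of a promoter (hypothetical helper).

    Returns a sorted list of (start, strand, matched_sequence) tuples, where
    start is the 0-based position of the leftmost base on the forward strand.
    """
    promoter = promoter.upper()
    motif = "".join(IUPAC[base] for base in consensus.upper())
    # A lookahead lets overlapping matches be reported as separate hits.
    pattern = re.compile("(?=(" + motif + "))")
    n, k = len(promoter), len(consensus)

    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(promoter)]
    # Scan the reverse complement and map coordinates back to the forward strand.
    for m in pattern.finditer(reverse_complement(promoter)):
        hits.append((n - m.start() - k, "-", m.group(1)))
    return sorted(hits)


# Made-up promoter fragment, used only to exercise the function.
example_promoter = "GGCCTATAAAAGGCGC"
example_hits = find_sites(example_promoter, "TATAWAW")
```

For the made-up fragment above, `example_hits` holds a single forward-strand match, `(4, "+", "TATAAAA")`. A plain consensus match like this produces many false positives on real genomes, which is one reason additional evidence such as evolutionary conservation is brought in.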
Detection of TF binding sites through phylogenetic footprinting is gradually becoming popular. It is based on the fact that random mutations are not easily accepted in functional sequences, while they continuously tinker with non-functional sequences. Many comparative genomics studies have revealed that, during the course of evolution, regulatory elements remain conserved while other non-coding DNA sequences keep mutating. With an ever-increasing number of complete genome sequences from multiple organisms, and with mRNA profiling through microarray and deep sequencing technologies, a wealth of gene expression data is being generated. These data can be used for identification of regulatory
