Abstract

The transcription machinery of archaea can be roughly described as a simplified version of the eukaryotic machinery. The basal transcription factors bind to the TATA box, found around 28 nucleotides upstream of the transcription start site; however, some transcription units lack a clear TATA box and still show TBP/TFB binding. This apparent absence of conserved sequences could be a consequence of sequence divergence associated with the upstream region, operon, and gene organization. Furthermore, earlier studies have found that structural analysis yields more information than simple sequence inspection. In this work, we evaluated 3630 archaeal promoter sequences from three organisms, Haloferax volcanii, Thermococcus kodakarensis, and Sulfolobus solfataricus, and coded them into DNA duplex stability, enthalpy, curvature, and bendability parameters. We also split our dataset into conserved-TATA and degenerated-TATA promoters to identify differences between these two classes. The structural analysis reveals variations in archaeal promoter architecture; that is, a distinctive signal is observed at the TFB, TBP, and TFE binding sites regardless of whether the promoter is TATA-conserved or TATA-degenerated. In addition, the promoter-finding method was validated with upstream regions of 13 other archaea, suggesting that there might be promoter sequences among them. Therefore, we propose a novel method for locating promoters within archaeal genomes based on DNA energetic/structural features.

Highlights

  • Archaea represent the third domain of life (Woese, 1987) and include an essential and vast variety of organisms with a large diversity of habitats and lifestyles

  • The initiation process begins with the binding of a TATA-binding protein (TBP) and a transcription factor B (TFB) to a specific DNA segment, defined as a promoter, allowing the recruitment of the RNA polymerase (RNAP) enzyme

  • Three main conserved DNA elements devoted to the transcription process have been identified as common to all archaeal groups: (i) an initiator element (INR) around the transcription start site (TSS); (ii) the TATA box element, centered around −26/−27 relative to the TSS; and (iii) an element upstream of the TATA box comprising two adenines at −34 and −33, which is designated the “transcription factor B recognition element” (BRE)

Summary

| INTRODUCTION

Archaea represent the third domain of life (Woese, 1987) and include an essential and vast variety of organisms with a large diversity of habitats and lifestyles. These features are biologically relevant for characterizing promoter regions since they convert DNA information into numeric attributes (Benham, 1996). These four parameters have been used previously and capture specific signals that are not evident at the sequence level (Bansal et al., 2014; de Avila e Silva et al., 2011; Kanhere & Bansal, 2005; SantaLucia & Hicks, 2004; Yella & Bansal, 2017; Yella et al., 2018). A key motif for this research is expected at −27/−28, so the search was directed to this specific region to capture the TATA boxes. The following parameters were used in MEME for the organisms H. volcanii and T. kodakarensis: (i) a sequence length of 100 nucleotides, covering the −80 to +20 region, where the core promoter is located (Haberle & Stark, 2018; Kadonaga, 2012); (ii) a 0-order background model generated from. The null hypothesis of normality was rejected, so the dataset was found not to be normally distributed.
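
To make the idea of coding a sequence into numeric attributes concrete, the following Python sketch extracts the −80 to +20 window around a TSS and converts it into a sliding-window duplex stability profile. It is an illustration under stated assumptions, not the authors' pipeline: the function names `core_window` and `stability_profile`, the 15-nt window size, and the per-step averaging are hypothetical choices, and the free-energy table uses the unified nearest-neighbor values as commonly tabulated (for example in SantaLucia & Hicks, 2004), which should be checked against the original tables before any real use.

```python
from typing import List

# Nearest-neighbor free energies (kcal/mol, 37 °C) per dinucleotide step.
# ASSUMPTION: unified values as commonly tabulated; verify before real use.
# Each step and its reverse complement share one value (e.g. CA and TG).
NN_DG = {
    "AA": -1.00, "TT": -1.00,
    "AT": -0.88,
    "TA": -0.58,
    "CA": -1.45, "TG": -1.45,
    "GT": -1.44, "AC": -1.44,
    "CT": -1.28, "AG": -1.28,
    "GA": -1.30, "TC": -1.30,
    "CG": -2.17,
    "GC": -2.24,
    "GG": -1.84, "CC": -1.84,
}


def core_window(seq: str, tss_index: int, upstream: int = 80, downstream: int = 20) -> str:
    """Return the -80..+20 region around the TSS (100 nt by default).

    tss_index is the 0-based position of the +1 nucleotide. There is no
    position 0 in TSS numbering, so the window holds 80 upstream
    nucleotides (-80..-1) followed by 20 starting at the TSS (+1..+20).
    """
    start, end = tss_index - upstream, tss_index + downstream
    if start < 0 or end > len(seq):
        raise ValueError("sequence too short for the requested window")
    return seq[start:end]


def stability_profile(seq: str, window: int = 15) -> List[float]:
    """Mean free energy per dinucleotide step in a sliding window.

    A window of `window` nucleotides contains `window - 1` steps.
    Raises KeyError for symbols other than A, C, G, T (e.g. N).
    """
    seq = seq.upper()
    steps = [NN_DG[seq[i:i + 2]] for i in range(len(seq) - 1)]
    n_steps = window - 1
    return [
        sum(steps[i:i + n_steps]) / n_steps
        for i in range(len(steps) - n_steps + 1)
    ]
```

With these defaults a 100-nt window yields 86 profile values. AT-rich stretches such as a TATA box give less negative values (lower duplex stability) than GC-rich stretches, which is the kind of signal a structural analysis can pick up even when the sequence motif itself is degenerate.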

| RESULTS
| DISCUSSION
Findings
| CONCLUSIONS