The internal functional organization of cis-regulatory modules (CRMs) lies at the heart of both our understanding of the mode and tempo of gene regulatory evolution and practical efforts to decipher and annotate genomic sequences. In an open-ended search for loose clusters of known mesodermal enhancer motifs in the Ciona intestinalis genome, I discovered a class of highly organized CRMs in otherwise unrelated genes expressed early in development. Each such CRM is composed of distinct motifs located at specific positions along approximately 160 bp of DNA sequence, and is able to drive expression in distinct mesodermal compartments descended from the B4.1 blastomere. The majority of the loci bearing these B4.1-specific modules are the snail, paraxis, and tbx6 orthologous loci of this invertebrate chordate system, which encode important early mesodermal transcription factors. These unrelated genes encode members of the C2H2 zinc-finger, bHLH, and T-box transcription factor families, and likely serve as a chordate-specific trans-code for paraxial mesoderm. One other similarly organized enhancer was discovered in the TNC3 muscle structural locus. These results suggest that the organization of binding sites over the length of the enhancer sequence is a critical aspect of gene regulatory biology. Establishing the extent to which this is a general principle will facilitate our ability to identify, decipher, and categorize the regulatory functions contained in whole genome assemblies.