Abstract

The domain of transcription regulation has been notoriously difficult to annotate in the Gene Ontology, partly because of the intricacies of gene regulation, which involve molecular interactions with DNA as well as among protein complexes. The molecular function ‘transcription coregulator activity’ is part of the biological process ‘regulation of transcription, DNA-templated’, which occurs in the cellular component ‘chromatin’. It can mechanistically link the regulatory DNA target sites of sequence-specific DNA-binding transcription factors (dbTFs) to coactivator and corepressor target sites through the molecular function ‘cis-regulatory region sequence-specific DNA binding’. Many questions arise about transcription coregulators (coTFs). Here, we asked how many unannotated, putative coregulators can be identified in protein complexes. To answer this, we mined the CORUM and hu.MAP protein complex databases using known and strongly presumed human transcription coregulators as baits. In addition, we trawled the BioGRID and IntAct molecular interaction databases for interactors of the 1457 known human dbTFs annotated by the GREEKC and GO consortia. This yielded 1093 putative transcription factor coregulator complex subunits, of which 954 interact directly with a dbTF. This substantially expands the set of coTFs that could be annotated to ‘transcription coregulator activity’ and sets the stage for renewed annotation and wet-lab research efforts. To this end, we devised a prioritisation score based on existing GO annotations of already curated transcription coregulators as well as on interactome representation. Since all the proteins that we mined are parts of protein complexes, we propose to concomitantly engage in annotation of the putative transcription coregulator-containing complexes in the Complex Portal database.
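The mining step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the data layout and all identifiers (`COTF_A`, `PROT_X`, `DBTF_1`, ...) are hypothetical placeholders. It assumes complexes are given as sets of subunit names and interactions as binary pairs, and it does not model the scoring or blacklist steps.

```python
from collections import defaultdict


def mine_complexes(complexes, baits):
    """Map each non-bait subunit (prey) to the baits it shares a complex with."""
    prey_to_baits = defaultdict(set)
    for members in complexes.values():
        present_baits = members & baits
        if not present_baits:
            continue  # complex contains no known coregulator, skip it
        for prey in members - baits:
            prey_to_baits[prey].update(present_baits)
    return dict(prey_to_baits)


def direct_dbtf_interactors(candidates, interactions, dbtfs):
    """Return the candidates that have a binary interaction with any dbTF."""
    hits = set()
    for a, b in interactions:
        if a in dbtfs and b in candidates:
            hits.add(b)
        if b in dbtfs and a in candidates:
            hits.add(a)
    return hits


# Toy data with placeholder names (assumptions, not real database records).
complexes = {
    "complex_1": {"COTF_A", "PROT_X", "PROT_Y"},
    "complex_2": {"PROT_Z", "PROT_W"},
    "complex_3": {"COTF_B", "PROT_X"},
}
baits = {"COTF_A", "COTF_B"}
interactions = [("DBTF_1", "PROT_X"), ("PROT_Y", "PROT_Z")]
dbtfs = {"DBTF_1"}

putative = mine_complexes(complexes, baits)
direct = direct_dbtf_interactors(set(putative), interactions, dbtfs)
```

In this toy example, `PROT_X` and `PROT_Y` are candidate subunits because they share a complex with a bait, and only `PROT_X` also interacts directly with a dbTF.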

Highlights

  • In its simplest form, eukaryotic transcription initiation only requires the general transcription machinery

  • The accessory links we find between coTFs and DNA-binding transcription factors (dbTFs), which may punctually recruit, activate or repress coTF activity at target genes, form a basis to link genomic transcription regulatory chromosomal DNA sequences to the transcription regulatory proteins that act on them, and enable mechanistic modelling of the signal transduction pathways that lead to the process of transcriptional gene regulation

  • When we check, for every unique ‘bait protein – prey protein’ combination on the final list, in which sources it was found, the combinations found in the Complex Portal largely overlap with those found in hu.MAP and CORUM (Fig. 4B)


Summary

Introduction

In its simplest form, eukaryotic transcription initiation only requires the general transcription machinery. This machinery consists of RNA polymerase II and the general transcription initiation factors (GTFs) TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH [1,2]. These bind promoter DNA sequences and promoter DNA-bound proteins and recruit the RNA polymerase to form the transcription preinitiation complex (PIC) [3].

Our results should empower wet-lab approaches as well as curation of currently known protein complexes that harbour coTFs. The accessory links we find between coTFs and dbTFs, which may punctually recruit, activate or repress coTF activity at target genes, form a basis to link genomic transcription regulatory chromosomal DNA sequences to the transcription regulatory proteins that act on them, and enable mechanistic modelling of the signal transduction pathways that lead to the process of transcriptional gene regulation. We propose to concomitantly engage in annotation of the putative transcription coregulator-containing complexes in the Complex Portal [28], since it already hosts many canonical transcription regulatory protein complexes and their annotations.

Code and data source versions
Sequential expansion of three bait lists
Mining the data sources
GO score
Mining score
Prioritisation score
Blacklist filter
A cybernetic screen for putative coTF complex subunits
Inclusion of putative coTFs by mining the BioGRID and IntAct databases
List of putative coTFs for targeted annotation
Devising prioritisation scoring systems
Findings
Discussion
