Abstract

Mammals are composed of hundreds of different cell types with specialized functions. Each of these cellular phenotypes is controlled by different combinations of transcription factors. Using a human non-islet-cell insulinoma cell line (TC-YIK), which expresses insulin and the majority of known pancreatic beta cell specific genes, as an example, we describe a general approach to identify key cell-type-specific transcription factors (TFs) and their direct and indirect targets. By ranking all human TFs by their level of enriched expression in TC-YIK relative to a broad collection of samples (FANTOM5), we confirmed known key regulators of pancreatic function and development. Systematic siRNA-mediated perturbation of these TFs followed by qRT-PCR revealed their interconnections, with NEUROD1 at the top of the regulatory hierarchy and its depletion drastically reducing insulin levels. For 15 of the TF knock-downs (KD), we then used Cap Analysis of Gene Expression (CAGE) to identify thousands of their targets genome-wide (KD-CAGE). The data confirm NEUROD1 as a key positive regulator in the transcriptional regulatory network (TRN), and ISL1 and PROX1 as antagonists. As a complementary approach, we used ChIP-seq on four of these factors to identify NEUROD1, LMX1A, PAX6, and RFX6 binding sites in the human genome. Examining the overlap between genes perturbed in the KD-CAGE experiments and genes with a ChIP-seq peak within 50 kb of their promoter, we identified direct transcriptional targets of these TFs. Integration of KD-CAGE and ChIP-seq data shows that both NEUROD1 and LMX1A work as the main transcriptional activators. In the core TRN (i.e., TF-TF only), NEUROD1 directly transcriptionally activates the pancreatic TFs HSF4, INSM1, MLXIPL, MYT1, NKX6-3, ONECUT2, PAX4, PROX1, RFX6, ST18, DACH1, and SHOX2, while LMX1A directly transcriptionally activates DACH1, SHOX2, PAX6, and PDX1.
Analysis of these complementary datasets suggests the need for caution in interpreting ChIP-seq datasets. (1) A large fraction of binding sites are at distal enhancer sites and cannot be directly associated with their targets without chromatin conformation data. (2) Many peaks may be non-functional: even when there is a peak at a promoter, the expression of the gene may not be affected in the matching perturbation experiment.
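
The integration step described above (genes perturbed in a knock-down that also have a ChIP-seq peak within 50 kb of their promoter) can be illustrated with a minimal Python sketch. The data layout, function names, and toy coordinates are assumptions made for illustration only; this is not the authors' pipeline, and it ignores details such as peak widths and alternative promoters.

```python
from bisect import bisect_left

WINDOW = 50_000  # 50 kb, the distance stated in the abstract


def genes_near_peaks(promoters, peaks, window=WINDOW):
    """Return genes whose promoter lies within `window` bp of any peak.

    promoters: dict mapping gene -> (chrom, position)   (hypothetical layout)
    peaks:     list of (chrom, position) peak summits   (hypothetical layout)
    """
    # Group peak positions by chromosome and sort them for binary search.
    by_chrom = {}
    for chrom, pos in peaks:
        by_chrom.setdefault(chrom, []).append(pos)
    for positions in by_chrom.values():
        positions.sort()

    near = set()
    for gene, (chrom, pos) in promoters.items():
        positions = by_chrom.get(chrom, [])
        # First peak at or after the left edge of the window.
        i = bisect_left(positions, pos - window)
        if i < len(positions) and positions[i] <= pos + window:
            near.add(gene)
    return near


def direct_targets(perturbed_genes, promoters, peaks, window=WINDOW):
    """Candidate direct targets: perturbed in the KD AND near a peak."""
    return set(perturbed_genes) & genes_near_peaks(promoters, peaks, window)


# Toy data with placeholder gene names and coordinates.
promoters = {
    "GENE_A": ("chr1", 100_000),
    "GENE_B": ("chr1", 500_000),
    "GENE_C": ("chr2", 100_000),
}
peaks = [("chr1", 120_000), ("chr2", 400_000)]
perturbed = {"GENE_A", "GENE_B"}

targets = direct_targets(perturbed, promoters, peaks)
```

In this toy case only GENE_A qualifies: GENE_B is perturbed but has no nearby peak, which is how an indirect target would appear under this definition.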

Highlights

  • Regulation of gene expression by combinations of transcription factors (TFs) is a fundamental process that determines cellular identity and functions

  • A systematic review of endocrine hormones and peptides detected in TC-YIK confirmed chromogranin A (CHGA) and GAST were expressed at high levels and revealed expression of insulin (INS), ghrelin (GHRL), and transthyretin (TTR; Table 1)

  • To understand which TFs are responsible for maintaining the TC-YIK cell state, we identified a set of 4639 promoters with enriched expression (>3-fold) in TC-YIK compared to median expression in FANTOM5
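
The enrichment criterion in the last highlight (expression more than 3-fold above the median across a reference collection) can be sketched as follows. The input layout, the function names, and the pseudocount used to avoid division by zero are assumptions for illustration; the source does not state how zero medians were handled.

```python
from statistics import median


def enrichment(target_expr, background_exprs, pseudocount=1.0):
    """Fold enrichment of one sample over the median of a reference set.

    The pseudocount is an assumption, added so a zero median does not
    cause a division by zero.
    """
    return (target_expr + pseudocount) / (median(background_exprs) + pseudocount)


def enriched_promoters(target, background, min_fold=3.0):
    """Rank promoters by enrichment and keep those above `min_fold`.

    target:     dict promoter -> expression in the sample of interest
    background: dict promoter -> list of expression values across the collection
    """
    scores = {p: enrichment(target[p], background[p]) for p in target}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(p, s) for p, s in ranked if s > min_fold]


# Toy values with placeholder promoter names.
target = {"p1": 99.0, "p2": 10.0, "p3": 7.0}
background = {
    "p1": [4.0, 9.0, 14.0],
    "p2": [9.0, 10.0, 11.0],
    "p3": [0.0, 1.0, 3.0],
}

hits = enriched_promoters(target, background)
```

Restricting the ranked list to promoters of annotated TFs would then give a candidate list of cell-type-enriched regulators.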

Introduction

Regulation of gene expression by combinations of transcription factors (TFs) is a fundamental process that determines cellular identity and functions. Computational approaches that predict TF targets based upon their co-expression with a given TF and/or the presence of a transcription factor binding site (TFBS) motif in their promoter regions have helped to identify direct targets (Wasserman and Sandelin, 2004; Tompa et al, 2005; Valouev et al, 2008; FANTOM Consortium et al, 2009); however, these are purely predictive methods, and the validation rate, when experimental validations are carried out, is low. Motif prediction methods are limited because the vast majority of TFs have no well-defined TFBS, and TFs from the same family bind very similar motifs. Even for those cases where a motif is known, the information content is so low that the majority of binding site predictions will likely be false positives (Wasserman and Sandelin, 2004). Unless the expression levels of the TFs themselves are taken into consideration, inaccurate predictions can be made, where a binding event may be predicted as important despite the fact that the corresponding TF is not even present in the cell.
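
The point about low information content can be made concrete with a rough back-of-the-envelope calculation. Under the simplifying assumption of a uniform, independent base composition, a motif with I bits of information matches a random position with probability of about 2^-I. The 12-bit motif and the 3 Gb genome length below are illustrative values, not figures from the source.

```python
def expected_random_matches(information_bits, genome_length, both_strands=True):
    """Approximate number of chance motif matches in a random genome.

    Assumes uniform, independent bases, so the per-position match
    probability is roughly 2 ** -information_bits.
    """
    p_match = 2.0 ** (-information_bits)
    strands = 2 if both_strands else 1
    return genome_length * strands * p_match


GENOME_LENGTH = 3_000_000_000  # approximate human genome size (assumption)
MOTIF_BITS = 12                # illustrative motif information content

n_chance = expected_random_matches(MOTIF_BITS, GENOME_LENGTH)
print(f"Expected chance matches: {n_chance:,.0f}")
```

With these assumed values the estimate is on the order of a million chance matches, far more than the number of sites a single TF would plausibly regulate, which is why sequence alone predicts many false positives.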
