The growing knowledge of chromatin structure in various cell types raises the challenge of deciphering the contribution of epigenetic modifications to the regulation of nuclear functions in mammals. In a recent study, we analysed the genome-wide distributions of thirteen epigenetic marks in the human cell line K562 at the 100 kb resolution of Mean Replication Timing (MRT) data. Using classical clustering techniques, we showed that the combinatorial complexity of these epigenetic data can be reduced to four predominant chromatin states that replicate at different periods of the S-phase. C1 is an early-replicating, transcriptionally active euchromatin state; C2 is a mid-S repressive type of chromatin associated with Polycomb complexes; C3 is a silent chromatin state lacking chromatin marks that replicates later than C2 but before C4; and C4 is an HP1-associated heterochromatin state that replicates at the end of S-phase. These four chromatin states display remarkable similarities with those recently reported in fly, worm and plants at the higher ∼1 kb resolution of gene expression data. Here, we extend our integrative analysis of epigenetic data in the K562 human cell line to this smaller scale by focusing on gene promoters (±3 kb around transcription start sites). We show that these promoters can similarly be classified into four main chromatin states: P1 groups together all the marks of transcriptionally active chromatin and corresponds to CpG-rich promoters of highly expressed genes; P2 is notably associated with the histone modification H3K27me3, the mark of a Polycomb-repressed chromatin state; P3 corresponds to promoters that are not enriched for any available mark, the signature of a ‘null’ or ‘black’ silent heterochromatin state; and P4 characterizes the few gene promoters that contain only the constitutive heterochromatin histone modification H3K9me3.
When investigating the coherence between promoter activity (P1, P2, P3 or P4) and the large-scale chromatin environment (C1, C2, C3 or C4), we find that the higher the gene density in a given 100 kb window, the higher (resp. the lower) the probability that an active P1 promoter (resp. a silent P2, P3 or P4 promoter) is surrounded by an open euchromatin C1 (resp. facultative C2, black C3 or HP1-associated C4 heterochromatin) environment. From large to small scales, it is mainly the C4 and, to a lesser extent, the C3 heterochromatin environments, both corresponding to gene-poor regions, that strongly condition promoters to belong to the inactive P3 and P4 classes. Although the C1 (resp. C2) environment surrounds a majority of the corresponding active P1 (resp. P2) promoters, it also contains a non-negligible proportion of inactive P2 and P3 (resp. active P1 and inactive P3) promoters. When further investigating the large-scale organization of human genes with respect to the ‘master’ replication origins that were shown to border megabase-sized U-shaped MRT domains, we reveal a significant enrichment of highly expressed P1 genes in the close neighbourhood of these early initiation zones, consistent with the gradient of chromatin states observed from C1 at U-domain borders, followed by C2, C3 and C4 at U-domain centers. In contrast to P2 promoters, which are mainly found in the C2 environment at a finite distance (∼200–300 kb) from U-domain borders, the inactive P3 and P4 promoters are distributed rather homogeneously inside U-domains. The generalization of our study to different cell types, including ES, somatic and cancer cells, is likely to provide new insight into the global reorganization of replication domains during differentiation (or disease) in relation to coordinated changes in chromatin environment and gene expression.