The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription-factor association, chromatin structure and histone modification. In this overview, the Consortium guides readers through the project itself, the data and their integrated analyses. Eighty per cent of the human genome now has at least one biochemical function assigned to it. In addition to expanding our understanding of how gene expression is regulated on a genome-wide scale, the newly identified functional elements should help researchers to interpret the results of genome-wide association studies, because many correspond to sites associated with human disease.

This paper describes the first extensive map of human DNaseI hypersensitive sites (markers of regulatory DNA) in 125 diverse cell and tissue types. Integration of this information with other data sets generated by ENCODE identified new relationships between chromatin accessibility, transcription, DNA methylation and regulatory-factor occupancy patterns. Evolutionary-conservation analysis revealed signatures of recent functional constraint within DNaseI hypersensitive sites.

DNaseI footprinting detects DNA sequences that are protected from cleavage by DNaseI because they are bound by regulatory factors. Studying these footprints in 41 diverse cell and tissue types, the authors describe millions of short sequence elements that are conserved recognition sequences for DNA-binding proteins. The effort nearly doubles the size of the human cis-regulatory lexicon and provides insight into chromatin states and levels of evolutionary conservation. A large collection of novel regulatory-factor recognition motifs that closely parallel major regulators of development, differentiation and pluripotency is also described.
This manuscript describes the effort of the ENCODE Consortium to examine the principles of human transcriptional regulatory networks, using a subset of 119 transcription factors. The results are integrated with other genomic information to form a multi-level meta-network in which different levels have distinct properties. The findings will aid future interpretations of human genomics and help us to understand the basic principles of human biology and disease.

These authors describe the ENCODE effort to provide a complete catalogue of primary and processed RNAs found either in specific sub-cellular compartments or throughout the cell. They show that three-quarters of the human genome can be transcribed, and provide a wealth of information about the range and levels of expression, localization, processing fates and modifications of both known and previously unannotated RNAs. Collectively, these observations suggest that the current concept of a gene should be revisited.

In this ENCODE manuscript, the authors use chromosome conformation capture carbon copy (5C) to examine, in three dimensions, the relationships between functional elements and distal target genes in 1% of the human genome. They describe numerous long-range interactions between promoters and distal sites, including elements resembling enhancers, promoters and CTCF-bound sites, together with their genomic distribution and complex interactions. Because only ∼7% of looping interactions are with the nearest gene, genomic proximity is not a simple predictor for long-range interactions.