Abstract
Genome organization is critical for setting up the spatial environment of gene transcription, and substantial progress has been made towards its high-resolution characterization. The underlying molecular mechanism for its establishment is much less understood. We applied a deep-learning approach, the variational autoencoder (VAE), to analyze the fluctuation and heterogeneity of chromatin structures revealed by single-cell imaging and to identify a reaction coordinate for chromatin folding. This coordinate connects the seemingly random structures observed in individual cohesin-depleted cells as intermediate states along a folding pathway that leads to the formation of topologically associating domains (TADs). We showed that folding into wild-type-like structures remains energetically favorable in cohesin-depleted cells, potentially as a result of the phase separation between the two chromatin segments with active and repressive histone marks. The energetic stabilization, however, is not strong enough to overcome the entropic penalty, leading to the formation of only partially folded structures and the disappearance of TADs from contact maps upon averaging. Our study suggests that machine learning techniques, when combined with rigorous statistical mechanical analysis, are powerful tools for analyzing structural ensembles of chromatin.
Highlights
Three-dimensional genome organization is expected to play a crucial role in transcription, DNA replication, and repair [1,2,3,4,5]
The dynamical process by which chromatin establishes its three-dimensional organization for proper function is of critical importance
Using a combination of deep learning and statistical mechanical theory, we demonstrate that great insight can be gained into the folding process by analyzing snapshots of chromatin structures taken across a population of cells
Summary
Three-dimensional genome organization is expected to play a crucial role in transcription, DNA replication, and repair [1,2,3,4,5]. Significant progress has been made towards its high-resolution characterization as a result of advances in chromosome-conformation-capture-based methods such as Hi-C [6, 7]. These methods approximate the 3D distance between pairs of genomic loci using contact frequencies measured via proximity ligation and have revealed many conserved features of genome packaging [8,9,10,11,12]. The extrusion model was proposed to explain numerous features of chromatin loops and TADs observed in Hi-C contact maps [16, 17]. It provides a detailed hypothesis on the folding process driven by CCCTC-binding factor (CTCF) and cohesin molecules [18,19,20]. Because Hi-C unavoidably averages over the ensemble, it cannot capture the heterogeneity within a cell population, and the average picture it presents may be insufficient to uncover the full complexity of genome folding [28, 29].
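The effect of ensemble averaging can be illustrated with a toy calculation. The sketch below is not the authors' analysis: the two four-locus structures, the distance cutoff, and the function names are all assumptions chosen for illustration. It builds a binary contact map for each structure and averages them, which is roughly what a population-level contact frequency reports.

```python
import math

def contact_map(coords, cutoff):
    """Binary contact matrix for one structure: 1 if two distinct loci lie within cutoff."""
    n = len(coords)
    return [[1 if i != j and math.dist(coords[i], coords[j]) < cutoff else 0
             for j in range(n)] for i in range(n)]

def ensemble_contact_frequency(structures, cutoff):
    """Average the single-structure contact maps over all structures."""
    n = len(structures[0])
    freq = [[0.0] * n for _ in range(n)]
    for coords in structures:
        cmap = contact_map(coords, cutoff)
        for i in range(n):
            for j in range(n):
                freq[i][j] += cmap[i][j] / len(structures)
    return freq

# Hypothetical structures in arbitrary length units (assumed, not from data):
# a compact square in which the two end loci touch, and a fully extended chain.
folded = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
extended = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]

# Assumed contact cutoff, slightly larger than the bond length of 1.
freq = ensemble_contact_frequency([folded, extended], cutoff=1.2)

# Contact frequency between the two end loci after averaging.
print(freq[0][3])
```

In this toy population the end-to-end contact is present in one structure and absent in the other, yet the averaged map reports a single intermediate frequency. The averaged value alone cannot distinguish this mixed population from one in which every cell forms the contact half of the time, which is the kind of heterogeneity that single-cell measurements are needed to resolve.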