Abstract
Reconstructing three-dimensional (3D) chromosomal structures from single-cell Hi-C data is challenging because the data are extremely sparse. In this study, we used the Lennard-Jones potential to reconstruct both 500 kb and high-resolution 50 kb chromosomal structures from single-cell Hi-C data. A chromosome was represented as a string of 500 kb or 50 kb DNA beads and placed in a 3D cubic lattice for simulation. A 2D Gaussian function was used to impute the sparse single-cell Hi-C contact matrices. We designed a novel loss function based on the Lennard-Jones potential, in which the ε value, i.e., the well depth, indicates how stable the binding of each pair of beads is. The loss function assigns stronger binding stability to bead pairs that have single-cell Hi-C contacts and to their neighboring bead pairs. The Metropolis–Hastings algorithm was used to propose new locations for the DNA beads, and simulated annealing was used to optimize the loss function. We assessed the correctness and validity of the reconstructed 3D structures by evaluating the models against multiple criteria and comparing them with 3D-FISH data.
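The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`lj_energy`, `impute_contacts`, `total_loss`, `anneal`), the linear mapping from imputed contact weight to well depth (`base_eps`, `contact_eps`), the kernel `width` and `radius`, and the geometric cooling schedule are all assumptions made for illustration. The sketch uses the standard 12-6 Lennard-Jones form and a symmetric single-bead lattice move, so the acceptance rule reduces to the plain Metropolis criterion. Chain connectivity between consecutive beads is not enforced here.

```python
import math
import random


def lj_energy(r, epsilon, sigma=1.0):
    """Standard 12-6 Lennard-Jones energy; its minimum is -epsilon at r = 2**(1/6) * sigma."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def impute_contacts(contacts, n_beads, width=1.0, radius=2):
    """Spread each observed contact (i, j), i < j, onto nearby bead pairs with a 2D Gaussian."""
    weights = {}
    for i, j in contacts:
        for di in range(-radius, radius + 1):
            for dj in range(-radius, radius + 1):
                a, b = i + di, j + dj
                if 0 <= a < b < n_beads:
                    w = math.exp(-(di * di + dj * dj) / (2.0 * width ** 2))
                    weights[(a, b)] = max(weights.get((a, b), 0.0), w)
    return weights


def total_loss(coords, weights, base_eps=0.1, contact_eps=2.0):
    """Sum of pairwise LJ energies; imputed contacts get a deeper well (assumed linear map)."""
    loss = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            eps = base_eps + contact_eps * weights.get((i, j), 0.0)
            loss += lj_energy(math.dist(coords[i], coords[j]), eps)
    return loss


def anneal(coords, weights, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Metropolis moves on a cubic lattice under a geometric cooling schedule (assumed)."""
    rng = random.Random(seed)
    coords = [tuple(c) for c in coords]
    occupied = set(coords)
    current = total_loss(coords, weights)
    best, best_loss = list(coords), current
    for step in range(steps):
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i = rng.randrange(len(coords))
        new = list(coords[i])
        new[rng.randrange(3)] += rng.choice((-1, 1))  # one lattice step along one axis
        new = tuple(new)
        if new in occupied:
            continue  # one bead per lattice site
        old = coords[i]
        coords[i] = new
        proposed = total_loss(coords, weights)
        if proposed <= current or rng.random() < math.exp((current - proposed) / temp):
            occupied.remove(old)
            occupied.add(new)
            current = proposed
            if current < best_loss:
                best, best_loss = list(coords), current
        else:
            coords[i] = old  # reject: restore the previous position
    return best, best_loss


# Toy example: six beads on a straight line with one observed contact between the ends.
start = [(i, 0, 0) for i in range(6)]
weights = impute_contacts([(0, 5)], n_beads=6)
model, loss = anneal(start, weights)
```

Recomputing the full loss after every move keeps the sketch short. For chromosomes with many beads, one would update only the terms that involve the moved bead.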
Highlights
The Lennard-Jones potential can be used to infer 3D chromosomal structures based on single-cell Hi-C data
The well depth of the Lennard-Jones potential can be used to model how stable the binding between a pair of beads is, i.e., how easily their contact can be broken
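For reference, the role of the well depth can be read from the standard 12-6 form of the Lennard-Jones potential. The symbols below (pair distance r, length scale σ, well depth ε) follow the textbook convention and are an assumption here; the exact loss function used in the paper may differ.

```latex
V(r) = 4\varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right],
\qquad
r_{\min} = 2^{1/6}\sigma,
\qquad
V(r_{\min}) = -\varepsilon
```

Because V(r) tends to 0 as r grows, separating a pair that sits at r_min costs an energy of ε. A larger ε therefore describes a contact that is harder to break.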
Summary
Compared with studying the genome from a one-dimensional (1D) or sequence perspective, the study of the spatial folding of DNA in 3D has been increasingly recognized as important [1,2,3] because the spatial locations of genes, enhancers, promoters, and lncRNAs are essential for biological functions such as transcription, replication, DNA repair, and chromosome translocation [4,5]. Based on the spatial folding of the DNA, researchers have defined topologically associating domains (TADs), on average about 1 Mb long, on the genome [6]; these can be the structural and functional units of the genome. Research such as the reconstruction of high-resolution 3D genome structures [7] and the family classification of TADs [8] leads to a deeper understanding of how the genome functions. The lower the length of each bead is, the higher resolution the reconstructed