Abstract

<h3>Abstract</h3>
Recently, a biochemistry experiment was developed to simultaneously capture chromosomal conformations and DNA methylation levels in single cells. With data from this experiment now available, a computational tool that predicts single-cell methylation levels from single-cell Hi-C data has become necessary. We developed scHiMe, which uses a graph transformer algorithm to predict base-pair-specific methylation levels in promoter regions genome-wide from single-cell Hi-C data and DNA nucleotide sequences. Promoter-promoter spatial interaction networks were built from the single-cell Hi-C data, and single-cell DNA methylation levels on 1000 base pairs of each promoter were predicted from the network topology and the DNA sequence. Our evaluation showed high consistency between the predicted and the true methylation values. We used the predicted DNA methylation levels of all promoters to classify cells into cell types; the classification was almost perfect, which indicates that our predictions preserve cell-to-cell variability. We also classified cells using the predicted methylation levels of different subsets of promoters and of different subsets of CpGs in promoters, and we report the promoters and CpGs that were most influential in cell-type clustering. Moreover, we observed slightly better performance for nodes with higher degree values in the promoter-promoter spatial interaction network, but found no similar trend for more significant network influencers. Finally, using the predicted methylation levels of only housekeeping genes led to less accurate cell-type clustering. This is consistent with the biology of housekeeping genes, which usually have constant and similar genetic and epigenetic features across cell types. scHiMe is freely available at http://dna.cs.miami.edu/scHiMe/.

<h3>Author Summary</h3>
DNA methylation is the addition of methyl groups to the DNA molecule without changing the nucleotide sequence, and it can significantly change the activity of the DNA. Although DNA is a long one-dimensional sequence of cytosine (C), guanine (G), adenine (A), and thymine (T), it folds into a three-dimensional (3D) structure in the cell nucleus. Scientists believe that this 3D structure is related to the activity of the DNA. This research uses deep learning to predict the methylation status of each cytosine-guanine dinucleotide (CpG) in the promoter regions of all human genes from the 3D structure of the DNA. This not only adds a useful method to computational biology but also shows that 3D genome structure is related to DNA methylation. Moreover, both the 3D genome structure and the DNA methylation are measured on individual cells rather than on a population of cells, which gives a cell-specific perspective for understanding cell-to-cell variability.
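
To make the idea of a promoter-promoter spatial interaction network concrete, the following is a minimal sketch and not the scHiMe implementation. It assumes a hypothetical input layout (promoters given as a chromosome and a transcription start site, Hi-C contacts given as pairs of genomic coordinates) and an assumed 10 kb bin size. Two promoters are linked when a contact joins the bins that contain them, and the node degree mentioned in the abstract is the number of distinct neighbors. All names, the toy data, and the linking rule are illustrative assumptions.

```python
from collections import defaultdict

# Assumed bin size in base pairs; the source does not state a resolution.
RESOLUTION = 10_000


def bin_of(chrom, pos, resolution=RESOLUTION):
    """Map a genomic coordinate to a (chromosome, bin index) key."""
    return (chrom, pos // resolution)


def build_promoter_network(promoters, contacts, resolution=RESOLUTION):
    """Link promoters whose bins are joined by at least one Hi-C contact.

    promoters: dict of name -> (chrom, tss_position)   (hypothetical layout)
    contacts:  iterable of ((chrom_a, pos_a), (chrom_b, pos_b)) pairs
    Returns a dict of name -> set of neighboring promoter names.
    """
    # Index promoters by the bin that contains their start site.
    by_bin = defaultdict(set)
    for name, (chrom, tss) in promoters.items():
        by_bin[bin_of(chrom, tss, resolution)].add(name)

    adjacency = {name: set() for name in promoters}
    for (chrom_a, pos_a), (chrom_b, pos_b) in contacts:
        left = by_bin.get(bin_of(chrom_a, pos_a, resolution), ())
        right = by_bin.get(bin_of(chrom_b, pos_b, resolution), ())
        for p in left:
            for q in right:
                if p != q:  # ignore self-loops
                    adjacency[p].add(q)
                    adjacency[q].add(p)
    return adjacency


def degrees(adjacency):
    """Node degree: the number of distinct promoters a promoter touches."""
    return {name: len(neighbors) for name, neighbors in adjacency.items()}


# Toy single-cell example (made-up coordinates, for illustration only).
promoters = {
    "geneA": ("chr1", 15_000),
    "geneB": ("chr1", 250_000),
    "geneC": ("chr2", 42_000),
    "geneD": ("chr2", 990_000),
}
contacts = [
    (("chr1", 12_345), ("chr1", 251_000)),   # joins geneA and geneB
    (("chr1", 18_000), ("chr2", 47_500)),    # joins geneA and geneC
    (("chr2", 500_000), ("chr2", 600_000)),  # touches no promoter bin
]

network = build_promoter_network(promoters, contacts)
node_degree = degrees(network)
print(node_degree)
```

In this toy cell, geneA has degree 2 and geneD has degree 0. A graph built this way, together with per-promoter sequence features, is the kind of input a graph-based model would consume; how scHiMe actually defines edges and node features is not specified in the abstract.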

