Abstract

Many DNA methylation (DNAm) data are from tissues composed of various cell types, and hence cell deconvolution methods are needed to infer their cell compositions accurately. However, a bottleneck for DNAm data is the lack of cell-type-specific DNAm references. On the other hand, scRNA-seq data are being accumulated rapidly with various cell-type transcriptomic signatures characterized, and also, many paired bulk RNA-DNAm data are publicly available currently. Hence, we developed the R package scDeconv to use these resources to solve the reference deficiency problem of DNAm data and deconvolve them from scRNA-seq data in a trans-omics manner. It assumes that paired samples have similar cell compositions. So the cell content information deconvolved from the scRNA-seq and paired RNA data can be transferred to the paired DNAm samples. Then an ensemble model is trained to fit these cell contents with DNAm features and adjust the paired RNA deconvolution in a co-training manner. Finally, the model can be used on other bulk DNAm data to predict their relative cell-type abundances. The effectiveness of this method is proved by its accurate deconvolution on the three testing datasets here, and if given an appropriate paired dataset, scDeconv can also deconvolve other omics, such as ATAC-seq data. Furthermore, the package also contains other functions, such as identifying cell-type-specific inter-group differential features from bulk DNAm data. scDeconv is available at: https://github.com/yuabrahamliu/scDeconv.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call