Abstract

Accurate and consistent estimation of cell-to-cell similarity is crucial for clustering single-cell RNA-seq (scRNA-seq) data. However, the high sparsity of scRNA-seq data makes it difficult to extract informative signal and reduces the accuracy of cell type identification. Moreover, using more features (genes) does not necessarily improve clustering accuracy, because many of them carry redundant information. In this paper, we propose scMVFI (single-cell Multi-View Feature Integration), a framework that integrates linear and non-linear features of scRNA-seq data to counter the zero-inflated noise caused by technical factors. Using an autoencoder model for data reconstruction, scMVFI performs multi-view similarity estimation on subsets of features drawn at different sampling rates to identify highly similar cell pairs. We evaluate scMVFI on five real scRNA-seq datasets and three simulated datasets. The results show that scMVFI mitigates the impact of "dropout" events more effectively than other methods. Moreover, the affinity networks constructed from both linear and non-linear perspectives accurately capture relationships between samples, thereby improving cell type identification when combined with existing clustering methods.
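
To make the idea of multi-view similarity estimation concrete, the following is a minimal sketch and not the scMVFI implementation. It assumes a cells-by-genes expression matrix, draws random gene subsets at several sampling rates, computes a Pearson correlation between cells in each view, and averages the views. The function names (`multi_view_similarity`, `top_similar_pairs`), the sampling rates, and the choice of Pearson correlation are illustrative assumptions. The autoencoder reconstruction and the non-linear view described in the abstract are omitted.

```python
import numpy as np


def multi_view_similarity(X, rates=(0.25, 0.5, 0.75), seed=0):
    """Average cell-cell Pearson correlation over random gene subsets.

    X is a (cells x genes) matrix. Each sampling rate defines one "view"
    built from a random subset of genes. Names and defaults here are
    hypothetical, chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    n_cells, n_genes = X.shape
    acc = np.zeros((n_cells, n_cells))
    for rate in rates:
        # Number of genes sampled for this view (at least two).
        k = min(n_genes, max(2, int(round(rate * n_genes))))
        cols = rng.choice(n_genes, size=k, replace=False)
        # Rows are cells, so corrcoef yields a cells x cells matrix.
        with np.errstate(invalid="ignore", divide="ignore"):
            sim = np.corrcoef(X[:, cols])
        # Cells that are constant within a view give NaN; treat as 0.
        acc += np.nan_to_num(sim, nan=0.0)
    return acc / len(rates)


def top_similar_pairs(sim, n_pairs=5):
    """Return the n_pairs most similar (i, j) cell pairs with i < j."""
    rows, cols = np.triu_indices_from(sim, k=1)
    order = np.argsort(sim[rows, cols])[::-1][:n_pairs]
    return [(int(rows[i]), int(cols[i])) for i in order]
```

Averaging across views is only one possible way to combine them. Other choices, such as keeping only pairs that rank highly in every view, would fit the same description.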
