Exploring the Interaction between Local and Global Latent Configurations for Clustering Single-Cell RNA-Seq: A Unified Perspective

Nairouz Mrabah,Mohamed Mahmoud Amar,Mohamed Bouguessa,Abdoulaye Banire Diallo

doi:10.1609/aaai.v37i8.26107

Abstract

The most recent approaches for clustering single-cell RNA-sequencing data rely on deep auto-encoders. However, three major challenges remain unaddressed. First, current models overlook the impact of the cumulative errors induced by the pseudo-supervised embedding clustering task (Feature Randomness). Second, existing methods neglect the effect of the strong competition between embedding clustering and reconstruction (Feature Drift). Third, the previous deep clustering models regularly fail to consider the topological information of the latent data, even though the local and global latent configurations can bring complementary views to the clustering task. To address these challenges, we propose a novel approach that explores the interaction between local and global latent configurations to progressively adjust the reconstruction and embedding clustering tasks. We elaborate a topological and probabilistic filter to mitigate Feature Randomness and a cell-cell graph structure and content correction mechanism to counteract Feature Drift. The Zero-Inflated Negative Binomial model is also integrated to capture the characteristics of gene expression profiles. We conduct detailed experiments on real-world datasets from multiple representative genome sequencing platforms. Our approach outperforms the state-of-the-art clustering methods in various evaluation metrics.

Full Text