Single-cell RNA sequencing (scRNA-seq) provides a powerful tool for dissecting cellular complexity and heterogeneity. However, its full potential to support statistically reliable conclusions is often constrained by the limited number of cells profiled, particularly in studies of rare diseases, specialized tissues, and uncommon cell types. Deep learning-based generative models (GMs) designed to address data scarcity often face similar limitations because they rely on pre-training or fine-tuning, inadvertently perpetuating a cycle of data inadequacy. To overcome this obstacle, we introduce scGFT (single-cell Generative Fourier Transformer), a train-free, cell-centric GM that synthesizes single cells whose gene expression profiles resemble the natural profiles present in authentic datasets. Using both simulated and experimental data, we demonstrate the mathematical rigor of scGFT and validate its ability to synthesize cells that preserve the intrinsic characteristics delineated in scRNA-seq data. Moreover, comparisons of scGFT with leading neural network-based GMs highlight its superior performance, driven by its analytical mechanism. By streamlining single-cell data augmentation, scGFT offers a scalable solution to mitigate data scarcity in cell-targeted research.