Spatial transcriptomics is a rapidly growing field that aims to comprehensively characterize tissue organization and architecture at single-cell or subcellular resolution using spatial information. Such techniques provide a solid foundation for a mechanistic understanding of many biological processes in health and disease that cannot be obtained using traditional technologies. Several methods have been proposed to decipher the spatial context of spots in tissue. However, when integrating spatial information with gene expression profiles, most of these methods consider only the local similarity of spatial information. Because they ignore the global semantic structure, such spatial domain identification methods tend to produce poor or over-smoothed clusters. We developed ConSpaS, a novel node representation learning framework that precisely deciphers spatial domains by integrating local and global similarities using a graph autoencoder (GAE) and contrastive learning (CL). The GAE integrates gene expression profiles with spatial information through local similarity, thereby ensuring that cluster assignments are spatially continuous. To better characterize the global similarity of gene expression data, we adopt CL to incorporate global semantic information. We propose an augmentation-free mechanism to construct global positive samples and use a semi-easy sampling strategy to define negative samples. We validated ConSpaS on multiple tissue types and technology platforms by comparing it with representative existing methods. The experimental results confirmed that ConSpaS improved the accuracy of spatial domain identification, yielding biologically meaningful spatial patterns, and denoised gene expression data while preserving spatial expression patterns. Furthermore, by integrating local and global similarities, ConSpaS better depicted the spatial trajectory.
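The distinction between local and global similarity can be illustrated with a minimal sketch. This is not the ConSpaS implementation: the k-nearest-neighbor rule, the Euclidean distance, the value k = 6, the random toy data, and the helper name `knn_indices` are all assumptions made for illustration. Local neighbors are taken from spot coordinates, as a spatial graph for a GAE might be built, whereas candidate global positives are taken from expression similarity alone, without any data augmentation.

```python
import numpy as np

def knn_indices(features, k):
    """Return, for each row, the indices of its k nearest rows (Euclidean), excluding itself."""
    sq = np.sum(features ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * features @ features.T  # squared distances
    np.fill_diagonal(dist, np.inf)  # a spot is never its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

# Toy data (assumption): 50 spots with 2-D coordinates and 20 expression features.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(50, 2))
expr = rng.normal(size=(50, 20))

# Local similarity: neighbors in physical space.
local_nbrs = knn_indices(coords, k=6)

# Global similarity: neighbors in expression space, regardless of location.
global_pos = knn_indices(expr, k=6)
```

In this sketch a spot's global positives may lie far away in the tissue, which is the kind of information a purely spatial neighborhood cannot supply.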