VARGG: a deep learning framework advancing precise spatial domain identification and cellular heterogeneity analysis in spatial transcriptomics

Abstract

Spatial transcriptomics has revolutionized our ability to measure gene expression while preserving spatial information, enabling detailed analysis of tissue structure and function. Accurately identifying spatial domains is key to understanding tissue microenvironments and biological progression. To overcome the challenge of integrating gene expression data with spatial information, we introduce the VARGG deep learning framework. VARGG combines a pretrained Vision Transformer (ViT) with a graph neural network autoencoder, using ViT’s self-attention mechanism to capture global contextual information and improve the modeling of spatial relationships. The framework is further strengthened by multi-layer gated residual graph neural networks and Gaussian noise, which improve feature representation and model generalizability across different data sources. The robustness and scalability of VARGG have been verified on different platforms (10x Visium, Slide-seqV2, Stereo-seq, and MERFISH) and on datasets of different sizes (human glioblastoma, mouse embryo, breast cancer). Our results demonstrate that VARGG’s accurate delineation of spatial domains can provide a deeper understanding of tissue structure and help identify key molecular markers and potential therapeutic targets, thereby improving our understanding of disease mechanisms and informing the development of personalized treatment strategies.
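
The abstract does not give the layer equations, so the following is only a minimal NumPy sketch of what one gated residual graph layer with Gaussian input noise could look like. The function names, the tanh message transform, the sigmoid gate, and the symmetric adjacency normalization are all assumptions made for illustration, not VARGG's published implementation.

```python
import numpy as np

def normalize_adjacency(adj):
    """Add self-loops and apply symmetric degree normalization (an assumed choice)."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gated_residual_layer(x, adj_norm, w_msg, w_gate, noise_std=0.0, rng=None):
    """One hypothetical gated residual graph layer.

    x        : (n_spots, n_features) node features
    adj_norm : (n_spots, n_spots) normalized adjacency
    w_msg    : (n_features, n_features) message weights
    w_gate   : (n_features, n_features) gate weights
    """
    if noise_std > 0.0:
        # Gaussian noise on the inputs, used here as a simple regularizer.
        rng = np.random.default_rng() if rng is None else rng
        x = x + rng.normal(0.0, noise_std, size=x.shape)
    # Aggregate neighbor features, then transform them.
    msg = np.tanh(adj_norm @ x @ w_msg)
    # Sigmoid gate decides, per feature, how much of the message to accept.
    gate = 1.0 / (1.0 + np.exp(-(x @ w_gate)))
    # Residual mix: gate * new message + (1 - gate) * original features.
    return gate * msg + (1.0 - gate) * x
```

With `w_gate` set to zeros the gate is 0.5 everywhere, so the output is the plain average of the aggregated message and the input features, which makes the residual path easy to check by hand.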

Similar Papers
  • Research Article
  • Cite Count: 31
  • 10.1016/j.compbiomed.2023.107440
STGNNks: Identifying cell types in spatial transcriptomics data based on graph neural network, denoising auto-encoder, and [formula omitted]-sums clustering
  • Sep 9, 2023
  • Computers in Biology and Medicine
  • Lihong Peng + 4 more

  • Research Article
  • Cite Count: 6
  • 10.1038/s42003-024-07037-0
Graph attention automatic encoder based on contrastive learning for domain recognition of spatial transcriptomics
  • Oct 18, 2024
  • Communications Biology
  • Tianqi Wang + 6 more

Spatial transcriptomics is an emerging technology that enables the profiling of gene expression in tissues while preserving spatial location information. This innovative approach is anticipated to provide a comprehensive understanding of the spatial distribution of different cells within tissues and facilitate in-depth analysis of tissue structure. To accurately recognize spatial domains from spatial transcriptomics, we have introduced a generalized deep learning method called GAAEST (Graph Attention-based Autoencoder for Spatial Transcriptomics). Our proposed approach effectively integrates both spatial location information and gene expression data from spatial transcriptomics. Specifically, it leverages spatial location details to construct a neighborhood graph and employs a graph attention network-based encoder to embed gene expression information into a spatially informed space. At the same time, to further optimize the learned latent embedding, self-supervised contrastive learning is introduced to capture spatial information at three levels: the local, global, and contextual features of spots. Finally, the decoder reconstructs gene expression, which is then clustered to identify spatial domains with similar expression patterns and spatial proximity. Based on our experiments conducted on multiple datasets, GAAEST consistently outperforms existing state-of-the-art methods. The proposed GAAEST demonstrates excellent capabilities in spatial domain recognition, positioning it as an ideal tool for advancing spatial transcriptomics research.
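
The neighborhood-graph step mentioned above can be illustrated with a k-nearest-neighbor graph over spot coordinates. This is a generic sketch, not GAAEST's code: the function name `knn_spatial_graph`, the Euclidean metric, and the choice to symmetrize the result are assumptions, and the abstract does not say which graph construction the authors use.

```python
import numpy as np

def knn_spatial_graph(coords, k):
    """Build a symmetric 0/1 adjacency matrix linking each spot to its k nearest spots."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    # Pairwise Euclidean distances between all spots.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Exclude self-matches so a spot is never its own neighbor.
    np.fill_diagonal(dist, np.inf)
    nearest = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n), dtype=int)
    adj[np.repeat(np.arange(n), k), nearest.ravel()] = 1
    # Keep an edge if either endpoint selected the other.
    return np.maximum(adj, adj.T)
```

For large slides a dense distance matrix becomes expensive, and a spatial index (for example a k-d tree) would be the usual replacement; the dense version is kept here only because it is short and easy to verify.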

  • Research Article
  • Cite Count: 5
  • 10.1093/gigascience/giae003
Deciphering spatial domains from spatially resolved transcriptomics with Siamese graph autoencoder.
  • Jan 2, 2024
  • GigaScience
  • Lei Cao + 12 more

Cell clustering is a pivotal aspect of spatial transcriptomics (ST) data analysis as it forms the foundation for subsequent data mining. Recent advances in spatial domain identification have leveraged graph neural network (GNN) approaches in conjunction with spatial transcriptomics data. However, such GNN-based methods suffer from representation collapse, wherein all spatial spots are projected onto a singular representation. Consequently, the discriminative capability of individual representation features is limited, leading to suboptimal clustering performance. To address this issue, we proposed SGAE, a novel framework for spatial domain identification, incorporating the power of the Siamese graph autoencoder. SGAE mitigates the information correlation at both sample and feature levels, thus improving the representation discrimination. We adapted this framework to ST analysis by constructing a graph based on both gene expression and spatial information. SGAE outperformed alternative methods in capturing spatial patterns and generating high-quality clusters, as evaluated by the Adjusted Rand Index, Normalized Mutual Information, and Fowlkes-Mallows Index. Moreover, the clustering results derived from SGAE can be further utilized in the identification of 3-dimensional (3D) Drosophila embryonic structure with enhanced accuracy. Benchmarking results from various ST datasets generated by diverse platforms provide compelling evidence for the effectiveness of SGAE against other ST clustering methods. Specifically, SGAE exhibits potential for extension and application on multislice 3D reconstruction and tissue structure investigation. The source code and a collection of spatial clustering results can be accessed at https://github.com/STOmics/SGAE/.
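
Two of the clustering metrics named above, the Adjusted Rand Index and the Fowlkes-Mallows Index, can both be computed from counts of spot pairs that share a label. The sketch below uses only the Python standard library and follows the standard pair-counting definitions; the function names are illustrative and it is not taken from the SGAE repository. It assumes at least two spots.

```python
from collections import Counter
from math import comb, sqrt

def pair_counts(labels_true, labels_pred):
    """Count spot pairs grouped together in both, in truth only, and in prediction only."""
    joint = Counter(zip(labels_true, labels_pred))
    same_both = sum(comb(c, 2) for c in joint.values())
    same_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    same_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    return same_both, same_true, same_pred, comb(len(labels_true), 2)

def adjusted_rand_index(labels_true, labels_pred):
    """Rand index corrected for chance agreement; 1.0 means identical partitions."""
    same_both, same_true, same_pred, total = pair_counts(labels_true, labels_pred)
    expected = same_true * same_pred / total
    max_index = (same_true + same_pred) / 2
    if max_index == expected:
        # Degenerate case, e.g. both partitions put every spot in one cluster.
        return 1.0
    return (same_both - expected) / (max_index - expected)

def fowlkes_mallows_index(labels_true, labels_pred):
    """Geometric mean of pairwise precision and recall."""
    same_both, same_true, same_pred, _ = pair_counts(labels_true, labels_pred)
    if same_true == 0 or same_pred == 0:
        return 0.0
    return same_both / sqrt(same_true * same_pred)
```

Both metrics depend only on which spots are grouped together, so renaming cluster labels does not change the score.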

  • Research Article
  • Cite Count: 2
  • 10.1101/2023.12.30.573739
SORBET: Automated cell-neighborhood analysis of spatial transcriptomics or proteomics for interpretable sample classification via GNN.
  • Apr 21, 2025
  • bioRxiv : the preprint server for biology
  • Shay Shimonov + 8 more

Spatial cellular profiling technologies have revolutionized our understanding of complex biological processes, from development and disease progression to immunity and aging. Despite their promise, integrating spatial information with multiplexed molecular data to accurately predict phenotypes poses significant challenges, especially in clinical settings. Here, we present SORBET, a geometric deep learning framework that directly analyzes complete spatial profiling data, eliminating the need to compress complete cell profiles into a limited set of annotations, such as cell types. SORBET models tissues as graphs of adjacent cells and applies graph convolutional networks to infer emergent phenotypes, such as responses to immunotherapy. The model leverages a novel data augmentation technique to ensure robust predictions, complemented by tailored interpretability analyses to identify the molecular and spatial patterns underlying the model's phenotype inferences. We apply our method to a CosMx spatial transcriptomics dataset of pre-treatment metastatic melanoma samples annotated with response to immunotherapy; we show that spatial information significantly improves clinical endpoint, or phenotype, prediction and identifies important biological patterns. To our knowledge, SORBET is the first example of phenotype prediction on spatial transcriptomics data. We further validated our method using two spatial proteomics datasets, Imaging Mass Cytometry (IMC) and Co-detection by indexing (CODEX), obtained from Non-Small Cell Lung Cancer and Colorectal Cancer samples, respectively. SORBET demonstrates superior accuracy in phenotype prediction over leading spatial and non-spatial methods across various datasets of different observed phenotypes and technologies. SORBET sets a new benchmark for predictive analysis in spatial omics, promising to advance personalized medicine through refined patient treatment stratification, grounded in molecular and spatial tissue profiling.

  • Research Article
  • Cite Count: 18
  • 10.1016/j.csbj.2023.11.055
A comprehensive overview of graph neural network-based approaches to clustering for spatial transcriptomics
  • Nov 30, 2023
  • Computational and Structural Biotechnology Journal
  • Teng Liu + 5 more

  • Research Article
  • 10.1093/bioinformatics/btae023
Assembling spatial clustering framework for heterogeneous spatial transcriptomics data with GRAPHDeep.
  • Jan 2, 2024
  • Bioinformatics
  • Teng Liu + 6 more

Spatial clustering is essential and challenging for spatial transcriptomics data analysis to unravel tissue microenvironment and biological function. Graph neural networks are a promising way to combine gene expression profiles and spatial location information in spatial transcriptomics to generate latent representations. However, choosing an appropriate graph deep learning module and graph neural network necessitates further exploration and investigation. In this article, we present GRAPHDeep to assemble a spatial clustering framework for heterogeneous spatial transcriptomics data. Through integrating 2 graph deep learning modules and 20 graph neural networks, the most appropriate combination is decided for each dataset. The constructed spatial clustering method is compared with state-of-the-art algorithms to demonstrate its effectiveness and superiority. The significant new findings include: (i) the number of genes or proteins of spatial omics data is quite crucial in spatial clustering algorithms; (ii) the variational graph autoencoder is more suitable for spatial clustering tasks than the deep graph infomax module; (iii) UniMP, SAGE, SuperGAT, GATv2, GCN, and TAG are the recommended graph neural networks for spatial clustering tasks; and (iv) the graph neural networks used in existing spatial clustering frameworks are not the best candidates. This study could be regarded as desirable guidance for choosing an appropriate graph neural network for spatial clustering. The source code of GRAPHDeep is available at https://github.com/narutoten520/GRAPHDeep. The studied spatial omics data are available at https://zenodo.org/record/8141084.

  • Research Article
  • Cite Count: 4
  • 10.1093/bib/bbae578
SpaGIC: graph-informed clustering in spatial transcriptomics via self-supervised contrastive learning.
  • Sep 23, 2024
  • Briefings in bioinformatics
  • Wei Liu + 5 more

Spatial transcriptomics technologies enable the generation of gene expression profiles while preserving spatial context, providing the potential for in-depth understanding of spatial-specific tissue heterogeneity. Leveraging gene and spatial data effectively is fundamental to accurately identifying spatial domains in spatial transcriptomics analysis. However, many existing methods have not yet fully exploited the local neighborhood details within spatial information. To address this issue, we introduce SpaGIC, a novel graph-based deep learning framework integrating graph convolutional networks and self-supervised contrastive learning techniques. SpaGIC learns meaningful latent embeddings of spots by maximizing both edge-wise and local neighborhood-wise mutual information of graph structures, as well as minimizing the embedding distance between spatially adjacent spots. We evaluated SpaGIC on seven spatial transcriptomics datasets across various technology platforms. The experimental results demonstrated that SpaGIC consistently outperformed existing state-of-the-art methods in several tasks, such as spatial domain identification, data denoising, visualization, and trajectory inference. Additionally, SpaGIC is capable of performing joint analyses of multiple slices, further underscoring its versatility and effectiveness in spatial transcriptomics research.

  • Research Article
  • Cite Count: 42
  • 10.1093/bib/bbad048
Identifying spatial domain by adapting transcriptomics with histology through contrastive learning.
  • Feb 13, 2023
  • Briefings in Bioinformatics
  • Yuansong Zeng + 7 more

Recent advances in spatial transcriptomics have enabled measurements of gene expression at cell/spot resolution while retaining both the spatial information and the histology images of the tissues. Accurately identifying the spatial domains of spots is a vital step for various downstream tasks in spatial transcriptomics analysis. To remove noise in gene expression, several methods have been developed that incorporate histopathological images into the analysis of spatial transcriptomics data. However, these methods either use the image only for the spatial relations between spots, or individually learn the embeddings of the gene expression and image without fully coupling the information. Here, we propose a novel method, ConGI, to accurately identify spatial domains by adapting gene expression with histopathological images through contrastive learning. Specifically, we designed three contrastive loss functions within and between two modalities (the gene expression and image data) to learn the common representations. The learned representations are then used to cluster the spatial domains on both tumor and normal spatial transcriptomics datasets. ConGI was shown to outperform existing methods for spatial domain identification. In addition, the learned representations have also proven powerful for various downstream tasks, including trajectory inference, clustering, and visualization.

  • Research Article
  • 10.1109/tcbb.2024.3469164
Enhancing Spatial Domain Identification in Spatially Resolved Transcriptomics Using Graph Convolutional Networks With Adaptively Feature-Spatial Balance and Contrastive Learning.
  • Nov 1, 2024
  • IEEE/ACM transactions on computational biology and bioinformatics
  • Xuena Liang + 4 more

Recent advancements in spatial transcriptomics (ST) technologies have enabled the comprehensive measurement of gene expression profiles while preserving the spatial information of cells. Combining gene expression profiles and spatial information has been the most commonly used method to identify spatial functional domains and genes. However, most existing spatial domain deciphering methods focus on spatially neighboring structures and fail to balance the self-characteristics and the spatial structure dependency of spots. Therefore, we propose a novel model called SpaGCAC, which recognizes spatial domains with the help of an adaptive feature-spatial balanced graph convolutional network named AFSBGCN. The AFSBGCN can dynamically learn the relationship between spatial local topology structures and the self-characteristics of spots by adaptively increasing or decreasing the weight on the self-characteristics during message aggregation. Moreover, to better capture the local structures of spots, SpaGCAC exploits a local topology structure contrastive learning strategy. Meanwhile, SpaGCAC utilizes a probability distribution contrastive learning strategy to increase the similarity of probability distributions for points belonging to the same category. We validate the performance of SpaGCAC for spatial domain identification on four spatial transcriptomic datasets. In comparison with seven spatial domain recognition methods, SpaGCAC achieved the highest NMI median of 0.683 and the second highest ARI median of 0.559 on the multi-slice DLPFC dataset. SpaGCAC achieved the best results on all three other single-slice datasets. The above-mentioned results show that SpaGCAC outperforms most existing methods, providing enhanced insights into tissue heterogeneity.

  • Research Article
  • 10.1016/j.ymeth.2024.11.006
SpaInGNN: Enhanced clustering and integration of spatial transcriptomics based on refined graph neural networks
  • Nov 13, 2024
  • Methods
  • Fangqin Zhang + 4 more

  • Research Article
  • Cite Count: 11
  • 10.1093/bib/bbad500
StAA: adversarial graph autoencoder for spatial clustering task of spatially resolved transcriptomics.
  • Nov 22, 2023
  • Briefings in bioinformatics
  • Zhaoyu Fang + 5 more

With the development of spatially resolved transcriptomics technologies, it is now possible to explore the gene expression profiles of single cells while preserving their spatial context. Spatial clustering plays a key role in spatial transcriptome data analysis. In the past 2 years, several graph neural network-based methods have emerged, which significantly improved the accuracy of spatial clustering. However, accurately identifying the boundaries of spatial domains remains a challenging task. In this article, we propose stAA, an adversarial variational graph autoencoder, to identify spatial domains. stAA generates cell embeddings by leveraging gene expression and spatial information using graph neural networks and constrains the distribution of cell embeddings toward a prior distribution through Wasserstein distance. The adversarial training process helps cell embeddings capture spatial domain information better and makes them more robust. Moreover, stAA incorporates global graph information into cell embeddings using labels generated by pre-clustering. Our experimental results show that stAA outperforms the state-of-the-art methods and achieves better clustering results across different profiling platforms and various resolutions. We also conducted numerous biological analyses and found that stAA can identify fine-grained structures in tissues, recognize different functional subtypes within tumors and accurately identify developmental trajectories.

  • Research Article
  • 10.1126/sciadv.adt7450
SOAR elucidates biological insights and empowers drug discovery through spatial transcriptomics.
  • Jun 13, 2025
  • Science advances
  • Yiming Li + 15 more

Spatial transcriptomics enables multiplex profiling of cellular gene expression and location within the tissue context. Although large volumes of spatial transcriptomics data have been generated, the lack of systematic curation and analysis limits biological discovery. We present Spatial transcriptOmics Analysis Resource (SOAR), a comprehensive spatial transcriptomics platform with 3461 uniformly processed samples across 13 species, 42 tissue types, and 19 different spatial transcriptomics technologies. Using SOAR, we found that CXCL16/SPP1 macrophage polarity characterizes the coordination of immune cell polarity in the tumor microenvironment. SOAR's integrative approach toward drug discovery revealed sirolimus and trichostatin A as potential anticancer agents targeting the phosphatidylinositol 3-kinase/Akt/mammalian target of rapamycin growth and proliferation pathway and identified Janus kinase/signal transducers and activators of transcription inhibitors for ulcerative colitis treatment. SOAR's results demonstrate its broad application to data generated from diverse spatial technologies and pathological conditions. SOAR will support future benchmarking studies and method development, facilitating discoveries in molecular functions, disease mechanisms, and potential therapeutic targets.

  • Research Article
  • Cite Count: 1
  • 10.1093/bib/bbae669
GAADE: identification spatially variable genes based on adaptive graph attention network.
  • Nov 22, 2024
  • Briefings in bioinformatics
  • Tianjiao Zhang + 7 more

The rapid advancement of spatial transcriptomics (ST) sequencing technology has made it possible to capture gene expression with spatial coordinate information at the cellular level. Although many methods in ST data analysis can detect spatially variable genes (SVGs), these methods often fail to identify genes with explicit spatial expression patterns due to the lack of consideration for spatial domains. Considering spatial domains is crucial for identifying SVGs as it focuses the analysis of gene expression changes on biologically relevant regions, aiding in the more accurate identification of SVGs associated with specific cell types. Existing methods for identifying SVGs based on spatial domains predefine spot similarity before training, which prevents adaptive learning and limits generalizability across different tissues or samples. This limitation may also lead to inaccurate identification of specific genes at boundary regions. To address these issues, we present GAADE, an unsupervised neural network architecture based on graph-structured data representation learning. GAADE stacks encoder/decoder layers and integrates a self-attention mechanism to reconstruct node attributes and graph structure, effectively capturing spatial domain structures of different sections. Consequently, we confine the identification of SVGs within spatial domains. By performing differential expression analysis on spots within the target spatial domain and their multi-order neighbors, GAADE detects genes with enriched expression patterns within defined domains. Comparative evaluations with five other popular methods on ST datasets across four different species, regions and tissues demonstrate that GAADE exhibits superior performance in detecting SVGs and capturing the extent of spatial gene expression variation.

  • Research Article
  • Cite Count: 4
  • 10.1093/bioinformatics/btae451
Unraveling Spatial Domain Characterization in Spatially Resolved Transcriptomics with Robust Graph Contrastive Clustering.
  • Jul 1, 2024
  • Bioinformatics (Oxford, England)
  • Yingxi Zhang + 3 more

Spatial transcriptomics can quantify gene expression and its spatial distribution in tissues, thus revealing molecular mechanisms of cellular interactions underlying tissue heterogeneity, tissue regeneration, and spatially localized disease mechanisms. However, existing spatial clustering methods often fail to exploit the full potential of spatial information, resulting in inaccurate identification of spatial domains. In this paper, we develop a deep graph contrastive clustering framework, stDGCC, that accurately uncovers underlying spatial domains via explicitly modeling spatial information and gene expression profiles from spatial transcriptomics data. The stDGCC framework proposes a spatially informed graph node embedding model to preserve the topological information of spots and to learn the informative and discriminative characterization of spatial transcriptomics data through self-supervised contrastive learning. By simultaneously optimizing the contrastive learning loss, reconstruction loss, and Kullback-Leibler (KL) divergence loss, stDGCC achieves joint optimization of feature learning and topology structure preservation in an end-to-end manner. We validate the effectiveness of stDGCC on various spatial transcriptomics datasets acquired from different platforms, each with varying spatial resolutions. Our extensive experiments demonstrate the superiority of stDGCC over various state-of-the-art clustering methods in accurately identifying cellular-level biological structures. Code and data are available from https://github.com/TimE9527/stDGCC and https://figshare.com/projects/stDGCC/186525. Supplementary data are available at Bioinformatics online.

  • Abstract
  • 10.1182/blood-2024-205266
Mapping the Human Bone Marrow in Myeloproliferative Neoplasia Using Spatial Transcriptomics
  • Nov 5, 2024
  • Blood
  • Rosalin Cooper + 15 more
