Abstract
Background: The rise of spatial transcriptomics technologies is leading to new insights about how gene regulation happens in a spatial context. Determining which genes are expressed in similar spatial patterns can reveal gene regulatory relationships across cell types in a tissue. However, many current analysis methods do not take full advantage of the spatial organization of the data, instead treating pixels as independent features. Here, we present CoSTA: a novel approach to learn spatial similarities between gene expression matrices via convolutional neural network (ConvNet) clustering.

Results: By analyzing simulated and previously published spatial transcriptomics data, we demonstrate that CoSTA learns spatial relationships between genes in a way that emphasizes broader spatial patterns rather than pixel-level correlation. CoSTA provides a quantitative measure of expression pattern similarity between each pair of genes rather than only classifying genes into categories. We find that CoSTA identifies narrower, but biologically relevant, sets of significantly related genes as compared to other approaches.

Conclusions: The deep learning CoSTA approach provides a different angle to spatial transcriptomics analysis by focusing on the shape of expression patterns, using more information about the positions of neighboring pixels than would an overlap or pixel correlation approach. CoSTA can be applied to any spatial transcriptomics data represented in matrix form and may have future applications to datasets such as histology in which images of different genes are from similar but not identical biological sections.
Highlights
The rise of spatial transcriptomics technologies is leading to new insights about how gene regulation happens in a spatial context.
ConvNet learning strategy for Spatial Transcriptomics Analysis (CoSTA) architecture: training a convolutional neural network (ConvNet) with pseudo-labels generated by Gaussian mixture model (GMM) clustering. Though there are many unsupervised learning strategies, we chose to apply the workflow of DeepCluster because it is straightforward and easy to implement [6].
CoSTA identifies the spatial expression patterns of these genes and reveals, through quantitative similarity, that they are more distantly related to cell type expression patterns than other genes are. We examined the significantly similar groups determined by CoSTA and used the spatial representation learned by CoSTA to measure Euclidean distances of these genes to each other and to cell type expression patterns (Fig. 2D, E and Additional file 14: Table S2).
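The distance comparison mentioned in the last highlight can be illustrated with a minimal sketch. It assumes each gene has already been mapped to a fixed-length feature vector by some trained network; the gene names, the vectors, and the helper functions `euclidean` and `pairwise_distances` are hypothetical and are not taken from the paper or its code.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_distances(features):
    """Map each unordered pair of names to the distance between their vectors."""
    return {
        (g1, g2): euclidean(features[g1], features[g2])
        for g1, g2 in combinations(sorted(features), 2)
    }


# Hypothetical learned representations (made-up numbers for illustration only).
features = {
    "gene_A": [0.9, 0.1, 0.0],
    "gene_B": [0.8, 0.2, 0.1],
    "gene_C": [0.0, 0.1, 1.0],
}

distances = pairwise_distances(features)

# List pairs from most similar (smallest distance) to least similar.
for pair, d in sorted(distances.items(), key=lambda kv: kv[1]):
    print(pair, round(d, 3))
```

In this toy setting, a smaller distance means the two representations are closer, which gives a graded similarity for every pair rather than a yes-or-no cluster membership.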
Summary
The rise of spatial transcriptomics technologies is leading to new insights about how gene regulation happens in a spatial context. Different technologies have enabled high-resolution measurements of how gene regulation is spatially organized across a tissue or thousands of single cells [1]. Analyses of these data have the potential to reveal spatial regulatory relationships between genes. Previous analyses of Slide-seq data first identified spatially non-random gene expression, but looked for genes expressed in similar patterns using pixel-level overlap analysis rather than according to spatial features [3]. Existing algorithms for analysis of spatial transcriptomics are based on statistical modeling and primarily propose to distinguish spatially expressing or variable (SE or SV) genes from random spatial expression noise. Both SpatialDE and SPARK analysis approaches estimate how significant the spatial pattern of a gene is [4, 5]. The similarity of expression pattern between two genes is either binary (whether or not the genes cluster together) or quantified based on pixel-level correlation.
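To make the limitation of pixel-level correlation concrete, the following toy sketch compares two synthetic expression matrices that have the same shape (a vertical stripe) but are offset by one pixel. The matrices and the helpers `pearson`, `flatten`, and `vertical_stripe` are illustrative assumptions and are not part of CoSTA, SpatialDE, or SPARK.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def flatten(matrix):
    """Treat every pixel as an independent feature by flattening row by row."""
    return [value for row in matrix for value in row]


def vertical_stripe(size, col):
    """Synthetic size x size matrix with expression only in one column."""
    return [[1.0 if c == col else 0.0 for c in range(size)] for _ in range(size)]


gene_a = vertical_stripe(6, 2)
gene_b = vertical_stripe(6, 3)  # same stripe shape, shifted by one pixel
gene_c = vertical_stripe(6, 2)  # identical to gene_a

r_shifted = pearson(flatten(gene_a), flatten(gene_b))
r_same = pearson(flatten(gene_a), flatten(gene_c))

print("shifted stripe:", round(r_shifted, 3))
print("identical stripe:", round(r_same, 3))
```

Because the two stripes never overlap, the pixel-level correlation for the shifted pair comes out negative even though the patterns have the same shape. This is the kind of case where a method that uses the positions of neighboring pixels could be expected to behave differently.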