Recent progress in multiplexed tissue imaging is advancing the study of tumor microenvironments and improving our understanding of treatment response and disease progression. Cellular neighborhood analysis is a popular computational approach for these complex image data. Despite its popularity, it faces significant challenges, including high computational demands that limit its feasibility for large-scale applications and the lack of a principled strategy for integrative analysis across images. These limitations hamper the precise and consistent identification of spatial features and the tracking of their dynamics over disease progression. To overcome these challenges, we introduce SpatialTopic, a spatial topic model designed to decode high-level spatial architecture across multiplexed tissue images. SpatialTopic integrates both cell type and spatial information within a topic modelling framework, which was originally developed for natural language processing and later adapted for computer vision. Spatial information is incorporated through the flexible design of documents, which represent densely overlapping regions in images. We employ an efficient collapsed Gibbs sampling algorithm for model inference. We benchmarked SpatialTopic against five state-of-the-art algorithms in case studies spanning different single-cell spatial transcriptomic and proteomic imaging platforms and different tissue types. We show that SpatialTopic scales to large image datasets with millions of cells while maintaining high precision and interpretability. Our findings demonstrate that SpatialTopic consistently identifies biologically and clinically significant spatial "topics", such as tertiary lymphoid structures (TLSs), and tracks dynamic changes in spatial features over disease progression. Its computational efficiency and broad applicability across molecular imaging platforms will enhance the analysis of large-scale tissue imaging datasets.
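
To make the modelling idea concrete, the following is a minimal sketch of collapsed Gibbs sampling for a standard topic model in which cell types play the role of words and spatial regions play the role of documents. It is not the SpatialTopic implementation: the function names `assign_regions` and `collapsed_gibbs_lda`, the hyperparameters `alpha` and `beta`, and the rule that fixes each cell to the region of its nearest randomly chosen anchor cell are all assumptions made for illustration. In particular, the fixed, non-overlapping regions used here are a simplification of the densely overlapping regions described above.

```python
import numpy as np


def assign_regions(coords, n_regions, rng):
    """Assign each cell to the nearest of n_regions randomly chosen anchor cells.

    Illustrative simplification: regions are fixed and do not overlap.
    """
    anchors = coords[rng.choice(len(coords), size=n_regions, replace=False)]
    # Squared Euclidean distance from every cell to every anchor.
    d2 = ((coords[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def collapsed_gibbs_lda(docs, words, n_topics, n_words,
                        n_iter=50, alpha=0.5, beta=0.1, seed=0):
    """Collapsed Gibbs sampler for a plain LDA-style topic model.

    docs[i]  : region (document) index of cell i
    words[i] : cell-type (word) index of cell i
    Returns the sampled topic of each cell and the topic-by-cell-type matrix.
    """
    rng = np.random.default_rng(seed)
    n_docs = int(docs.max()) + 1
    z = rng.integers(n_topics, size=len(words))  # random initial topics

    # Count tables: region-topic, topic-cell-type, and topic totals.
    ndk = np.zeros((n_docs, n_topics))
    nkw = np.zeros((n_topics, n_words))
    nk = np.zeros(n_topics)
    np.add.at(ndk, (docs, z), 1)
    np.add.at(nkw, (z, words), 1)
    np.add.at(nk, z, 1)

    for _ in range(n_iter):
        for i in range(len(words)):
            d, w, k = docs[i], words[i], z[i]
            # Remove cell i from the counts.
            ndk[d, k] -= 1
            nkw[k, w] -= 1
            nk[k] -= 1
            # Conditional distribution over topics given all other cells.
            p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + n_words * beta)
            k = rng.choice(n_topics, p=p / p.sum())
            # Add cell i back under its newly sampled topic.
            z[i] = k
            ndk[d, k] += 1
            nkw[k, w] += 1
            nk[k] += 1

    # Smoothed estimate of each topic's cell-type composition.
    topic_word = (nkw + beta) / (nk[:, None] + n_words * beta)
    return z, topic_word
```

Given cell coordinates `coords` (an array of shape `(n_cells, 2)`) and integer cell-type labels `words`, one would call `docs = assign_regions(coords, n_regions, rng)` and then `collapsed_gibbs_lda(docs, words, n_topics, n_words)`. Each row of the returned matrix is a distribution over cell types, which is what makes a topic interpretable as a recurring cellular composition.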