Abstract

High-throughput, spatially resolved gene expression techniques are poised to be transformative across biology by overcoming a central limitation in single-cell biology: the lack of information on relationships that organize the cells into the functional groupings characteristic of tissues in complex multicellular organisms. Spatial expression is particularly interesting in the mammalian brain, which has a highly defined structure, strong spatial constraint in its organization, and detailed multimodal phenotypes for cells and ensembles of cells that can be linked to mesoscale properties such as projection patterns, and from there, to circuits generating behavior. However, as with any type of expression data, cross-dataset benchmarking of spatial data is a crucial first step. Here, we assess the replicability, with reference to canonical brain subdivisions, between the Allen Institute's in situ hybridization data from the adult mouse brain (Allen Brain Atlas (ABA)) and a similar dataset collected using spatial transcriptomics (ST). With the advent of tractable spatial techniques, for the first time, we are able to benchmark the Allen Institute's whole-brain, whole-transcriptome spatial expression dataset with a second independent dataset that similarly spans the whole brain and transcriptome. We use regularized linear regression (LASSO), linear regression, and correlation-based feature selection in a supervised learning framework to classify expression samples relative to their assayed location. We show that Allen Reference Atlas labels are classifiable using transcription in both datasets, but that performance is higher in the ABA than in ST. Furthermore, models trained on one dataset and tested on the other do not reproduce classification performance in either direction. While an identifying expression profile can be found for a given brain area, it does not generalize to the opposite dataset.
In general, we found that canonical brain area labels are classifiable in gene expression space within each dataset and that the observed performance does not merely reflect physical distance in the brain. However, we also show that cross-platform classification is not robust. Emerging spatial datasets from the mouse brain will allow further characterization of cross-dataset replicability, ultimately providing a valuable reference set for understanding the cell biology of the brain.
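
The within-dataset versus cross-dataset comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it reduces correlation-based feature selection to picking a single marker gene, it assumes AUROC as the performance score (the abstract does not name the metric), and the two small expression matrices are invented toy values standing in for ABA-like and ST-like samples. All function and variable names are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def select_marker(X, y):
    """Index of the gene (column) most positively correlated with the label."""
    n_genes = len(X[0])
    corrs = [pearson([row[g] for row in X], y) for g in range(n_genes)]
    return max(range(n_genes), key=lambda g: corrs[g])

def auroc(scores, labels):
    """Probability that a positive sample outscores a negative one (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def evaluate(gene, X, y, idx):
    """AUROC of one gene's expression as the classifier score on samples idx."""
    return auroc([X[i][gene] for i in idx], [y[i] for i in idx])

# Toy data (invented): rows are samples, columns are 3 genes.
# Label 1 = sample lies inside the brain area of interest, 0 = outside.
y = [1, 1, 1, 1, 0, 0, 0, 0]
X_aba = [[5.0, 1.0, 2.0], [6.0, 2.0, 1.0], [5.5, 1.5, 2.5], [4.8, 1.1, 1.9],
         [1.0, 1.2, 2.2], [0.5, 1.8, 1.4], [1.5, 1.1, 2.0], [0.9, 1.6, 2.1]]
X_st = [[2.0, 4.0, 1.0], [1.0, 5.0, 2.0], [3.0, 4.5, 1.5], [1.2, 4.2, 1.3],
        [2.5, 0.5, 1.2], [1.5, 1.0, 1.8], [2.2, 0.8, 1.1], [1.8, 0.6, 1.6]]

TRAIN, TEST = [0, 1, 4, 5], [2, 3, 6, 7]  # fixed split; held-out samples for scoring
y_train = [y[i] for i in TRAIN]

# Fit (select a marker) on the training half of each dataset.
marker_aba = select_marker([X_aba[i] for i in TRAIN], y_train)
marker_st = select_marker([X_st[i] for i in TRAIN], y_train)

# Score each marker on held-out samples of its own and of the other dataset.
within_aba = evaluate(marker_aba, X_aba, y, TEST)
aba_to_st = evaluate(marker_aba, X_st, y, TEST)
within_st = evaluate(marker_st, X_st, y, TEST)
st_to_aba = evaluate(marker_st, X_aba, y, TEST)

print("ABA -> ABA:", within_aba, "| ABA -> ST:", aba_to_st)
print("ST  -> ST :", within_st, "| ST  -> ABA:", st_to_aba)
```

The toy values are constructed so that each dataset has a different informative gene, which makes the within-dataset scores high and the cross-dataset scores close to chance. That mirrors the qualitative pattern reported in the abstract but says nothing about the real data; a closer analogue of the study would replace `select_marker` with a LASSO or linear regression fit over all genes.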

Highlights

  • In the last 5 years, there has been an explosion of spatially resolved transcriptomics techniques that have made it possible to sequence whole transcriptomes while retaining fine-scale spatial information [1,2,3,4,5]

  • We show that Allen Reference Atlas labels are classifiable using transcription in both datasets, but that performance is higher in the Allen Brain Atlas (ABA) than in spatial transcriptomics (ST)

  • Allen Reference Atlas brain areas are classifiable using gene expression alone

  • With the advent of new high-throughput capture technologies for ST, we present, as is necessary for all new biological assays, a cross-technology assessment of generalizability in a well-characterized model system: the adult mouse brain

Introduction

In the last 5 years, there has been an explosion of spatially resolved transcriptomics techniques that have made it possible to sequence whole transcriptomes while retaining fine-scale spatial information [1,2,3,4,5]. These new technologies are poised to be transformative across biology [6]. Given the potential of spatial transcriptomics (ST) approaches in neu[…]

[…] had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript
