Single-cell RNA-sequencing (scRNA-seq) enables gene expression profiling at single-cell resolution, but the spatial information of cells in solid tissues is lost during the tissue dissociation step that precedes sequencing. In contrast, bulk spatial transcriptomics (ST) methods measure the expression of spatially organized spots in solid tissues, but because each spot comprises dozens of cells, ST expression levels are averaged signals that lack cellular resolution. Joint analysis of these two complementary data types offers an opportunity to recover the spatial patterns of cell types and to obtain the cellular enrichment of spots. However, unified statistical methods for this purpose are lacking. This study develops a Bayesian statistical method, named BEATS, that jointly models scRNA-seq data and bulk ST data from a common sample in the presence of cellular and spatial heterogeneity. BEATS can simultaneously (a) discover cell types, where cells of the same type share a mean expression profile; (b) identify spot regions, where a region is a set of spots with the same cellular composition; and (c) estimate the cell-type proportions of each spot region. Bayesian posterior inference is performed through a hybrid Markov chain Monte Carlo sampling algorithm. Extensive simulation studies and an application to datasets from pancreatic ductal adenocarcinoma tissues demonstrate the practical utility of BEATS.
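
To make the notion of an averaged spot-level signal concrete, the following minimal sketch computes the expected expression of each spot as a proportion-weighted mixture of cell-type mean profiles, with all spots in a region sharing one set of proportions. It is a simplified illustration under assumed notation, not the BEATS model or its inference algorithm: the function name, the argument names, and the toy numbers are all hypothetical.

```python
import numpy as np

def expected_spot_expression(type_means, region_props, spot_region):
    """Expected expression per spot as a proportion-weighted mix of cell-type means.

    type_means:   (K, G) mean expression of G genes in each of K cell types
    region_props: (R, K) cell-type proportions per spot region, rows sum to 1
    spot_region:  (S,)   region index of each spot

    All names here are hypothetical and chosen for illustration only.
    """
    type_means = np.asarray(type_means, dtype=float)
    region_props = np.asarray(region_props, dtype=float)
    spot_region = np.asarray(spot_region, dtype=int)
    if not np.allclose(region_props.sum(axis=1), 1.0):
        raise ValueError("each region's proportions must sum to 1")
    # Spots in the same region share one row of proportions, so they
    # also share the same expected (averaged) expression profile.
    return region_props[spot_region] @ type_means

# Toy example; every number below is made up for illustration.
type_means = np.array([[10.0, 0.0, 2.0],
                       [0.0, 8.0, 2.0]])   # 2 cell types, 3 genes
region_props = np.array([[0.75, 0.25],
                         [0.25, 0.75]])    # 2 spot regions
spot_region = np.array([0, 0, 1])          # 3 spots

spot_means = expected_spot_expression(type_means, region_props, spot_region)
print(spot_means)
```

In this toy setting the first two spots belong to the same region and therefore have identical expected profiles, which is the sense in which a spot region is defined by a common cellular composition.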