Spatial data are ubiquitous, massively collected, and widely used to support critical decision-making in many societal domains, including public health (e.g., COVID-19 pandemic control), agricultural crop monitoring, and transportation. While recent advances in machine learning and deep learning offer promising new ways to mine such rich datasets (e.g., satellite imagery, COVID-19 statistics), spatial heterogeneity, an intrinsic characteristic of spatial data, poses a major challenge: data distributions or generative processes often vary across space at different scales, and their spatial extents are unknown. Recent studies targeting this difficult problem (e.g., SVANN, spatial ensemble) either require a known space partitioning as input or can only support a very limited number of partitions or classes (e.g., two) due to the reduction in training data size and the complexity of the analysis. To address these limitations, we propose a model-agnostic framework that automatically transforms a deep learning model into a spatial-heterogeneity-aware architecture, in which the learning of arbitrary space partitionings is guided by a learning-engaged generalization of the multivariate scan statistic and parameters are shared according to spatial relationships. Moreover, we propose a spatial moderator to generalize learned space partitionings to new test regions. Finally, we extend the framework by integrating meta-learning-based training strategies into both spatial transformation and moderation to enhance knowledge sharing and adaptation among different processes. Experimental results on real-world datasets show that the framework can effectively capture flexibly shaped heterogeneous footprints and substantially improve prediction performance.
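To make the high-level idea of a spatial-heterogeneity-aware architecture concrete, the following is a minimal PyTorch sketch, not the proposed framework: it assumes a fixed number of partitions and replaces the scan-statistic-guided partition learning with a simple coordinate-based soft gate, while illustrating the shared-backbone, partition-specific-head structure through which parameters can be shared across space. All class, parameter, and dimension names are illustrative assumptions.

```python
# Illustrative sketch only: a generic heterogeneity-aware network with a shared
# backbone, partition-specific prediction heads, and a soft partition assignment
# predicted from spatial coordinates. This is NOT the paper's implementation.
import torch
import torch.nn as nn

class SpatialHeterogeneityAwareNet(nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim, num_partitions):
        super().__init__()
        # Shared backbone: parameters reused by all spatial partitions.
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # One prediction head per (assumed) spatial partition.
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_dim, out_dim) for _ in range(num_partitions)]
        )
        # Soft partition assignment from 2-D spatial coordinates (e.g., lon, lat).
        self.gate = nn.Sequential(
            nn.Linear(2, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_partitions),
        )

    def forward(self, x, coords):
        h = self.backbone(x)                                           # (B, hidden_dim)
        preds = torch.stack([head(h) for head in self.heads], dim=1)   # (B, P, out_dim)
        weights = torch.softmax(self.gate(coords), dim=-1)             # (B, P)
        # Mix partition-specific predictions by the soft spatial assignment.
        return (weights.unsqueeze(-1) * preds).sum(dim=1)              # (B, out_dim)

# Example usage with random data.
model = SpatialHeterogeneityAwareNet(in_dim=16, hidden_dim=32, out_dim=1, num_partitions=4)
x = torch.randn(8, 16)       # per-location features
coords = torch.rand(8, 2)    # normalized (lon, lat) coordinates
y_hat = model(x, coords)     # (8, 1)
```

In this sketch, the backbone realizes parameter sharing across partitions, while the gate stands in for a learned space partitioning; in the actual framework the partitioning is learned under the guidance of the generalized multivariate scan statistic rather than a free-form gating network.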