Spatiotemporal fusion (STF) is crucial for reconciling the trade-off between the temporal and spatial resolutions of remote sensing observations. However, fusing images in heterogeneous areas remains challenging when observations are continuously missing. Moreover, most current STF methods consider only temporal errors and disregard the spatial scale error that arises in temporal variation. We therefore propose a Quick Spatiotemporal Fusion with Coarse- and Fine-Resolution Scale Transformation Errors and Pixel-Based Synthesis Base Image Pair (STEPSBI). First, an optimal pixel-based image synthesis strategy was designed using all available fine- and coarse-resolution images. Then, the scale transformation error (STE) incurred when downscaling coarse-resolution temporal variation to fine resolution was quantified, and a residual term was introduced to reduce the prediction error from the temporal variation. Finally, the spatial scale and temporal errors were corrected using the results of super-pixel segmentation as spatial weights. The model has two strengths: (1) pixel-based image synthesis alleviates the absence of base images under continuous missing values; and (2) STE correction restores the spatial details of heterogeneous areas undergoing rapid land cover change. In scenarios of continuous missing images, abrupt land cover changes, and high-resolution heterogeneity, we evaluated STEPSBI alongside the Gap Filling and Savitzky–Golay filtering method (GF-SG), the Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM), the Flexible Spatiotemporal DAta Fusion (FSDAF), the cross-attention-based adaptive weighting fusion network (CAFE), and the Multi-scene Spatiotemporal Fusion Network (MUSTFN). The results indicate that STEPSBI yields better overall performance than the other models in cropland, woodland, grassland, and other land cover types. Furthermore, ablation experiments demonstrated that each component improved the model’s performance.
In addition, STEPSBI had higher fusion efficiency because it was developed on the Google Earth Engine cloud computing platform. STEPSBI is therefore a feasible approach for fine-scale monitoring through spatiotemporal image fusion under continuous missing values and over heterogeneous surfaces. The STEPSBI program is freely available at: https://code.earthengine.google.com/684844aa42f64fa6b4eebe3bc0dd6483.
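
To make the idea of pixel-based image synthesis concrete, the following is a minimal NumPy sketch and not the STEPSBI implementation. It assumes one plausible selection rule, namely that each pixel of the base image takes the valid observation closest in time to the prediction date; the actual optimality criterion used by STEPSBI is not stated here. The function name `nearest_valid_composite` and its parameters are hypothetical.

```python
import numpy as np

def nearest_valid_composite(stack, dates, target_date):
    """Build a base image pixel by pixel from a time series with gaps.

    Assumed rule (illustrative only): for every pixel, keep the valid
    observation whose acquisition date is closest to `target_date`.

    stack       : array of shape (T, H, W), NaN marks missing pixels
    dates       : array of shape (T,), acquisition day numbers
    target_date : day number of the image to be predicted
    """
    stack = np.asarray(stack, dtype=float)
    dates = np.asarray(dates, dtype=float)

    # Temporal distance of each acquisition, broadcast over the image grid.
    gap = np.abs(dates - target_date)[:, None, None]
    # Missing observations get an infinite distance so they are never chosen
    # when a valid observation exists for that pixel.
    gap = np.where(np.isnan(stack), np.inf, gap)

    # Index of the closest valid acquisition for every pixel.
    idx = np.argmin(gap, axis=0)
    # Gather the selected observation per pixel; a pixel with no valid
    # observation at any date stays NaN.
    return np.take_along_axis(stack, idx[None, ...], axis=0)[0]
```

With three acquisitions on days 0, 10, and 20 and a target day of 18, a pixel that is cloud-free on days 0 and 20 takes its value from day 20, whereas a pixel observed only on day 10 falls back to that date. This is how a per-pixel composite can supply a base image even when no single acquisition is gap-free.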