Many algorithms have been proposed for spatiotemporal image fusion on simulated data, yet only a few deal with spectral changes in real satellite images. An innovative spatiotemporal sparse representation (STSR) image fusion approach is introduced in this study to generate global dense high spatial and temporal resolution images from real satellite images. It aimed to minimize the data gap, especially when fine spatial resolution images are unavailable for a specific period. The proposed approach uses a set of real coarse- and fine-spatial resolution satellite images acquired simultaneously and another coarse image acquired at a different time to predict the corresponding unknown fine image. During the fusion process, pixels located between object classes with different spectral responses are more vulnerable to spectral distortion. Therefore, firstly, a rule-based fuzzy classification algorithm is used in STSR to classify input data and extract accurate edge candidates. Then, an object-based estimation of physical constraints and brightness shift between input data is utilized to construct the proposed sparse representation (SR) model that can deal with real input satellite images. Initial rules to adjust spatial covariance and equalize spectral response of object classes between input images are introduced as prior information to the model, followed by an optimization step to improve the STSR approach. The proposed method is applied to real fine Sentinel-2 and coarse Landsat-8 satellite data. The results showed that introducing objects in the fusion process improved spatial detail, especially over the edge candidates, and eliminated spectral distortion by preserving the spectral continuity of extracted objects. Experiments revealed the promising performance of the proposed object-based STSR image fusion approach based on its quantitative results, where it preserved almost 96.9% and 93.8% of the spectral detail over the smooth and urban areas, respectively.