Although data such as HJ-1 CCD images have advantageous spatial characteristics for describing crop properties, their temporal resolution is rather low, and cloud contamination can easily reduce it further. In contrast, although the Moderate Resolution Imaging Spectroradiometer (MODIS) achieves a spatial resolution of only 250 m in its normalised difference vegetation index (NDVI) product, it has a high temporal resolution, covering the Earth up to multiple times per day. To combine the high spatial resolution of one data source with the high temporal resolution of the other, a new method, the Spatial and Temporal Adaptive Vegetation index Fusion Model (STAVFM), was developed based on the Spatial and Temporal Adaptive Reflectance Fusion Model (STARFM). It blends NDVI of different spatial and temporal resolutions to produce NDVI datasets with high spatial–temporal resolution. STAVFM defines a time window according to the temporal variation of crops, takes crop phenophase into consideration and improves the temporal weighting algorithm. The results showed that the new method can combine the temporal information of MODIS NDVI with the spatial difference information of HJ-1 CCD NDVI to generate an NDVI dataset with both high spatial and high temporal resolution. The generated NDVI dataset was then applied to crop biomass estimation, where an average absolute error of 17.2% was achieved and the estimated winter wheat biomass correlated well with observed biomass (R² of 0.876). We conclude that the new dataset will improve crop biomass estimation by describing crop biomass accumulation in detail. The approach could potentially be applied in many other studies, including crop production estimation, crop growth monitoring and agricultural ecosystem carbon cycle research, and thereby contribute to the implementation of Digital Earth by describing land surface processes in detail.
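
The blending idea can be illustrated with a minimal sketch. This is not the STAVFM algorithm itself, whose time window, phenophase handling and weighting scheme are not specified here. It is a simplified STARFM-style assumption: each base date contributes its fine-resolution NDVI plus the coarse-resolution NDVI change up to the prediction date, and the contributions are averaged with inverse temporal distance weights. The function name `blend_ndvi`, its parameters and the weighting formula are hypothetical, and the coarse NDVI is assumed to be already resampled to the fine grid.

```python
import numpy as np


def blend_ndvi(fine_base, coarse_base, coarse_pred, base_days, pred_day):
    """Predict fine-resolution NDVI on pred_day (illustrative sketch only).

    fine_base   : (k, H, W) fine NDVI (e.g. HJ-1 CCD) on k base dates
    coarse_base : (k, H, W) coarse NDVI (e.g. MODIS) on the same dates,
                  assumed already resampled to the fine grid
    coarse_pred : (H, W) coarse NDVI on the prediction date
    base_days   : length-k day numbers of the base dates
    pred_day    : day number of the prediction date
    """
    fine_base = np.asarray(fine_base, dtype=float)
    coarse_base = np.asarray(coarse_base, dtype=float)
    coarse_pred = np.asarray(coarse_pred, dtype=float)
    base_days = np.asarray(base_days, dtype=float)

    # Assumed temporal weighting: inverse distance in days (+1 avoids
    # division by zero), normalised so the weights sum to one.
    weights = 1.0 / (np.abs(base_days - pred_day) + 1.0)
    weights /= weights.sum()

    # Each base date yields one candidate: fine NDVI on that date plus
    # the coarse NDVI change between that date and the prediction date.
    candidates = fine_base + (coarse_pred[None, :, :] - coarse_base)

    # Weighted average over the k base dates, kept in the valid NDVI range.
    prediction = np.tensordot(weights, candidates, axes=1)
    return np.clip(prediction, -1.0, 1.0)
```

With two base dates equally distant from the prediction date, the two candidates are simply averaged. Moving the prediction date toward one base date shifts the result toward that date's candidate, which is the role a temporal weight plays in this kind of fusion.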