This study proposes a novel approach to producing a more granular poverty map, covering grids with a spatial resolution of 1.5 km, at lower cost and with faster updates, to support better poverty monitoring. Two poverty estimation model development scenarios using machine learning and deep learning were evaluated. In the first scenario, which is the approach proposed in this study, the model was built from multisource satellite imagery and geospatial point-of-interest (POI) locations of economic infrastructure using zonal statistics feature extraction. The satellite-derived poverty indicators are night-time light intensity (NTL) as a proxy for economic activity; the normalized difference vegetation index (NDVI) to detect rural areas based on vegetation; the built-up index (BUI) to detect urban areas based on building distribution; the normalized difference water index (NDWI) to detect land cover; land surface temperature (LST) to detect urban areas based on surface temperature; carbon monoxide (CO), nitrogen dioxide (NO2), and sulfur dioxide (SO2) to detect economic activity based on pollution; and POI density and POI distance, which indicate the accessibility of an area. In the second scenario, the model was built from daytime multiband and night-time light intensity satellite imagery using transfer-learning feature extraction with the ResNet-34 deep learning architecture. In each scenario, we compared the performance of machine learning algorithms (support vector regression/SVR, decision tree regression/DTR, and random forest regression/RFR) and deep learning algorithms (multilayer perceptron/MLP and one-dimensional convolutional neural network/CNN-1D). Based on the model evaluation results, the CNN-1D model was selected as the best model in the first scenario and the ResNet-34 + MLP model as the best model in the second scenario. These two best models were then used to predict poverty at the grid level with a spatial resolution of 1.5 km. Compared against official poverty data, the poverty map built with the CNN-1D model in the first scenario was selected as the best map, with a root mean squared error (RMSE) of 1.95 and an adjusted R2 of 0.84 at the district level. Visual inspection revealed that high poverty estimates are typically found in sparsely inhabited areas surrounded by unoccupied land, usually agricultural land, whereas low poverty estimates are more likely to be found in densely populated regions with convenient access. These findings align with the official report of Statistics Indonesia, which states that poverty in rural regions is higher than in urban areas.
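To make the first-scenario feature extraction concrete, the following is a minimal Python sketch of computing zonal statistics of the satellite-derived indicators and the two POI features over the 1.5 km grid cells, using geopandas and rasterstats. All file names, the grid_id column, the mean as the zonal statistic, and the nodata value are illustrative assumptions, not the exact pipeline used in the study.

```python
# Minimal sketch: zonal-statistics feature extraction over a 1.5 km grid.
# Assumes a polygon grid and economic-infrastructure POIs stored in local
# GeoPackage files (file names and the grid_id column are hypothetical).
import geopandas as gpd
from rasterstats import zonal_stats

grid = gpd.read_file("grid_1_5km.gpkg")   # 1.5 km grid cells, metric CRS assumed
rasters = {                               # one raster per satellite-derived indicator
    "ntl": "viirs_ntl.tif", "ndvi": "ndvi.tif", "bui": "bui.tif",
    "ndwi": "ndwi.tif", "lst": "lst.tif",
    "co": "co.tif", "no2": "no2.tif", "so2": "so2.tif",
}

features = grid[["grid_id"]].copy()
for name, path in rasters.items():
    stats = zonal_stats(grid, path, stats=["mean"], nodata=-9999)
    features[f"{name}_mean"] = [s["mean"] for s in stats]

# POI features: density = number of POIs falling inside each cell,
# distance = distance from the cell centroid to the nearest POI.
poi = gpd.read_file("poi_economic.gpkg").to_crs(grid.crs)
joined = gpd.sjoin(poi, grid, predicate="within")
features["poi_density"] = (
    joined.groupby("grid_id").size()
    .reindex(features["grid_id"], fill_value=0).values
)
features["poi_distance"] = grid.geometry.centroid.apply(
    lambda c: poi.distance(c).min()
)

features.to_csv("grid_features.csv", index=False)
```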
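The CNN-1D model selected as the best model in the first scenario can be sketched as a small one-dimensional convolutional network that reads each grid cell's zonal-statistic feature vector as a one-channel signal and outputs a poverty estimate. The layer sizes and the use of PyTorch below are illustrative assumptions; only the general CNN-1D-for-regression idea comes from the study.

```python
# Minimal sketch of a CNN-1D regressor over the per-cell feature vector
# (first scenario); layer sizes are illustrative, not the paper's exact setup.
import torch
import torch.nn as nn

class Cnn1dRegressor(nn.Module):
    def __init__(self, n_features: int):
        super().__init__()
        # Treat the feature vector as a 1-channel 1-D signal of length n_features.
        self.conv = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(64, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features) -> add a channel dimension for Conv1d
        z = self.conv(x.unsqueeze(1)).squeeze(-1)   # (batch, 64)
        return self.head(z)                          # (batch, 1) poverty estimate

# 8 image-derived indicators + POI density + POI distance = 10 features.
model = Cnn1dRegressor(n_features=10)
pred = model(torch.rand(4, 10))                      # (4, 1)
```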
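For the second scenario, the sketch below illustrates transfer-learning feature extraction with a pretrained ResNet-34 whose final classification layer is replaced, so that each image tile is mapped to a 512-dimensional embedding, followed by a small MLP regression head. Tile size, the handling of the night-time light band, and the MLP layer sizes are assumptions made for illustration.

```python
# Minimal sketch: ResNet-34 transfer-learning feature extraction + MLP head
# (second scenario). Preprocessing and hyperparameters are illustrative only.
import torch
import torch.nn as nn
from torchvision import models

# Pretrained ResNet-34 with the final fully connected layer replaced by an
# identity, so the backbone outputs a 512-dimensional embedding per tile.
backbone = models.resnet34(weights=models.ResNet34_Weights.DEFAULT)
backbone.fc = nn.Identity()
backbone.eval()

# Small MLP head mapping the 512-d embedding to a poverty estimate.
mlp_head = nn.Sequential(
    nn.Linear(512, 128),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(128, 1),
)

def extract_features(tiles: torch.Tensor) -> torch.Tensor:
    """tiles: (N, 3, 224, 224) daytime tiles; the NTL band could be stacked as
    an extra channel by adapting the first conv layer. Returns (N, 512)."""
    with torch.no_grad():
        return backbone(tiles)

# Usage with random tensors standing in for real imagery tiles.
dummy_tiles = torch.rand(8, 3, 224, 224)
embeddings = extract_features(dummy_tiles)   # (8, 512)
poverty_pred = mlp_head(embeddings)          # (8, 1)
```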
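Finally, the reported comparison against official data uses RMSE and adjusted R2 at the district level. The short sketch below shows one standard way these two metrics can be computed with scikit-learn and NumPy; n_features (the number of predictors used in the adjusted-R2 correction) is an input the caller must supply.

```python
# Minimal sketch of the district-level evaluation metrics: RMSE and adjusted R^2.
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

def rmse_and_adjusted_r2(y_true, y_pred, n_features):
    """Return (RMSE, adjusted R^2); n_features is the number of predictors."""
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    r2 = r2_score(y_true, y_pred)
    n = len(y_true)
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    return rmse, adj_r2
```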