Summary
Estimating residual oil saturation (Sor) after waterflooding is critical for selecting enhanced oil recovery strategies, planning further field development, and predicting production. We established a data-driven workflow for evaluating Sor in carbonate samples using microcomputed tomography (μ-CT) images. Waterflooding was simulated on 7,192 μ-CT samples with the two-phase lattice Boltzmann method (LBM). Petrophysical parameters (features) obtained from pore network modeling (PNM) and from feature extraction on the μ-CT images were used to develop tree-based regression models for predicting Sor. The features include porosity, absolute permeability, initial water saturation (Swi), pore size distribution (PSD), throat size distribution (TSD), and surface roughness (Ra) distribution. Our method excludes vugs and macro/nanoporosity, which complicate multiscale simulations, a recognized challenge in modeling carbonate rocks. When the image is subdivided into numerous subvolumes, certain subvolumes may contain vugs that exceed the dimensions of the subvolume itself. These vugs were omitted because the entire subvolume image then constitutes a vug. Conversely, vugs smaller than the subvolume were not excluded. Despite these scale limitations, our subsampling, supported by the substantial data volume, ensures that our microscale porosity predictions are statistically reliable and sets a foundation for future studies on the impact of vugs and nanoporosity on simulations. The results show that features obtained from dry-sample images can be used for data-driven Sor prediction. We tested three regression models: gradient boosting (GB), random forest (RF), and extreme gradient boosting (XGBoost). Among these, the optimized GB-based model showed the highest predictive capacity for Sor [R² = 0.87, mean absolute error (MAE) = 1.87%, mean squared error (MSE) = 0.12%].
Increasing the data set size is anticipated to enhance the models’ ability to capture a broader spectrum of rock properties, thereby improving their prediction accuracy. The proposed predictive modeling framework for estimating Sor in heterogeneous carbonate formations aims to supplement conventional coreflooding tests or serve as a tool for rapid Sor evaluation of the reservoir.
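For readers unfamiliar with the reported scores, the following is a minimal sketch of how R², MAE, and MSE are conventionally computed from paired simulated and predicted Sor values. It is not the authors' implementation: the function name `regression_metrics` and the four Sor values are hypothetical, and Sor is assumed here to be expressed as a fraction (the unit convention behind the percentages reported above is not stated in this summary).

```python
from statistics import mean


def regression_metrics(measured, predicted):
    """Return (R^2, MAE, MSE) for paired measured and predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")

    # Signed prediction error for each sample.
    errors = [p - m for m, p in zip(measured, predicted)]

    mae = mean(abs(e) for e in errors)   # mean absolute error
    mse = mean(e * e for e in errors)    # mean squared error

    # R^2 = 1 - (residual sum of squares / total sum of squares).
    avg = mean(measured)
    ss_tot = sum((m - avg) ** 2 for m in measured)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are equal")
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot

    return r2, mae, mse


# Hypothetical Sor values (fractions), for illustration only.
sor_simulated = [0.20, 0.30, 0.40, 0.50]   # e.g., from a flooding simulation
sor_predicted = [0.22, 0.28, 0.41, 0.49]   # e.g., from a regression model

r2, mae, mse = regression_metrics(sor_simulated, sor_predicted)
print(f"R2 = {r2:.3f}, MAE = {mae:.4f}, MSE = {mse:.6f}")
```

Because MSE is in squared units of the target while MAE is in the target's own units, the two values are not directly comparable in magnitude.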