Summary

Diagenetic effects in carbonate rocks can enhance or occlude depositional pore space. Reliable identification of porosity-enhancing diagenetic features (e.g., vugs and fractures) is essential for petrophysical characterization of reservoir properties (e.g., porosity and permeability), construction of geological and reservoir models, reserve estimation, and production forecasting. Characterizing these diagenetic features from well logs remains challenging because their log signatures are often mixed with the effects of variations in mineral and fluid concentrations. Herein, we explore a data-driven approach, based on a comprehensive well log data set from the Arbuckle Formation in Kansas, to classify vuggy facies in carbonate rocks. The available well log data include conventional logs (gamma ray [GRTC], resistivity [RT], neutron/density porosity [NPHI/RHOB], photoelectric factor [PE], and acoustic slowness) and nuclear magnetic resonance (NMR) transverse relaxation time (T2) logs. We parameterized the measured T2 distribution using a multimodal lognormal Gaussian density function and combined the resulting Gaussian parameters with conventional logs as inputs to three supervised machine learning (ML) algorithms: support vector machine (SVM), random forest (RF), and artificial neural network (ANN). The facies labels used in this study were based on visual examination of vug sizes in core samples and comprise five classes: nonvuggy, pinpoint-size, centimeter-size, fist-size, and super-vuggy. In total, 80% of the data set was used as the training set, and fivefold cross validation was used for hyperparameter tuning. We conducted a detailed comparison of the three ML algorithms using different combinations of features. The highest classification accuracy achieved on the holdout testing set is 84%, obtained using SVM with a combination of conventional logs and selected NMR Gaussian parameters as inputs. In general, inclusion of conventional log data improves the prediction accuracy compared with using NMR data alone. Feature selection improves the performance of SVM and ANN but is not recommended for RF.
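The workflow summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes SciPy and scikit-learn, fits a sum of Gaussians in log10(T2) as one plausible reading of the "multimodal lognormal Gaussian" parameterization, and runs on purely synthetic data. The function names (`multimodal_gaussian`, `fit_t2`, `tune_svm`), the two-mode choice, the hyperparameter grid, and the stratified split are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def multimodal_gaussian(log_t2, *params):
    """Sum of Gaussians in log10(T2); params = (amplitude, mean, std) per mode."""
    total = np.zeros_like(log_t2, dtype=float)
    for amp, mu, sigma in np.reshape(params, (-1, 3)):
        total += amp * np.exp(-0.5 * ((log_t2 - mu) / sigma) ** 2)
    return total


def fit_t2(log_t2, amplitudes, n_modes=2):
    """Fit n_modes Gaussians to one T2 distribution (assumed mode count)."""
    # Spread the initial means evenly over the interior of the log10(T2) axis.
    mus = np.linspace(log_t2.min(), log_t2.max(), n_modes + 2)[1:-1]
    p0 = np.ravel([[amplitudes.max(), mu, 0.3] for mu in mus])
    popt, _ = curve_fit(multimodal_gaussian, log_t2, amplitudes, p0=p0, maxfev=20000)
    return popt  # 3 * n_modes values: amplitude, mean, std for each mode


def tune_svm(features, labels, seed=0):
    """80/20 split, then fivefold cross validation over a small SVM grid."""
    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, train_size=0.8, stratify=labels, random_state=seed
    )
    search = GridSearchCV(
        make_pipeline(StandardScaler(), SVC()),
        param_grid={"svc__C": [1, 10], "svc__gamma": ["scale", 0.1]},  # illustrative grid
        cv=5,
    )
    search.fit(x_train, y_train)
    return search, search.score(x_test, y_test)


# Synthetic, noise-free bimodal T2 distribution on a log10(T2 in ms) axis.
log_t2 = np.linspace(-0.5, 3.5, 64)
true_params = (0.6, 1.0, 0.25, 1.0, 2.5, 0.30)
fitted = fit_t2(log_t2, multimodal_gaussian(log_t2, *true_params))

# Synthetic feature table: 6 Gaussian parameters + 6 conventional logs per depth,
# with 5 facies classes (0 = nonvuggy ... 4 = super-vuggy). Not real well data.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(5), 40)
features = rng.normal(size=(200, 12)) + labels[:, None]
search, holdout_accuracy = tune_svm(features, labels)
print(search.best_params_, holdout_accuracy)
```

In practice, each row of `features` would concatenate the fitted Gaussian parameters for one depth with the conventional log readings at that depth. RF or ANN classifiers could be substituted for `SVC` in the same pipeline to reproduce the kind of comparison described in the summary.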