This study rested on two main assumptions: (i) that spectrally similar soils have similar soil organic carbon (SOC) content; and (ii) that spectral-based clustering might improve the quality of the prediction model. Soil spectral data and SOC content for a total of 939 samples were selected from the regional soil Vis-NIR database of Dalmatia, Croatia. The spectral dataset was partitioned into distinct clusters using unsupervised learning (UL) techniques: Principal Component Analysis (PCA) and the Fuzzy C-means (FCM) method. The soil spectral dataset, ellipsoidal in shape, was best separated into three clusters, with a Silhouette score of 0.57 indicating reasonable clustering quality. The most important wavelengths and wavelength ranges for clustering were 1980–2025, 1900, 735–775, 490–530, 1200, 2100, 2200, 2130, 2400, 485, and 440 nm. The identified clusters had mean SOC contents of 22.25, 13.36, and 17.29 g C kg−1, and these means differed significantly at the 0.05 significance level. FCM clustering improved the precision of the partial least squares regression (PLSR) model, as measured by the Prediction Interval (PI). However, it had little impact on the accuracy of SOC predictions, as assessed by the Residual Prediction Deviation (RPD). Our models are suitable for approximate or preliminary screening of SOC content. Therefore, we recommend using UL techniques and chemometrics combined with Vis-NIR spectroscopy to supplement conventional laboratory analyses of SOC content.
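To make the two central computations concrete, the following is a minimal sketch of fuzzy C-means clustering and of the RPD statistic. It is not the implementation used in the study: the function names `fuzzy_c_means` and `rpd`, the fuzzifier `m = 2`, the Euclidean distance, the random initialisation, and the stopping tolerance are all illustrative assumptions, and RPD is taken here in its commonly used form, the sample standard deviation of the observed values divided by the root mean square error of prediction. In a workflow like the one described, the input points would plausibly be PCA scores of the spectra, which is also an assumption.

```python
import math
import random
import statistics


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy C-means on a list of equal-length numeric tuples.

    Returns (centers, memberships), where memberships[i][k] is the degree
    to which point i belongs to cluster k (each row sums to 1).
    All defaults are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    dim = len(points[0])

    # Start from random memberships, normalised so each row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centre update: mean of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(len(points))]
            wsum = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dim)
            ])

        # Membership update from relative Euclidean distances to the centres.
        shift = 0.0
        for i, p in enumerate(points):
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            new_row = [
                1.0 / sum((dists[k] / dj) ** (2.0 / (m - 1.0)) for dj in dists)
                for k in range(n_clusters)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once no membership changed by more than the tolerance.
        if shift < tol:
            break
    return centers, u


def rpd(observed, predicted):
    """Residual prediction deviation: SD of observed values / RMSE of prediction."""
    rmse = math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )
    return statistics.stdev(observed) / rmse
```

A hard cluster label for each sample can be read off as the index of its largest membership, for example `row.index(max(row))` for each row of the returned membership matrix. A separate PLSR model could then be fitted per cluster and compared with a single global model using `rpd`.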