We investigate the potential of near-infrared (NIR) spectroscopy to predict the content of selected heavy metals (Zn, Cu, Pb, Cr and Ni) in several soil types in the Stara Zagora Region, South Bulgaria, and how the size of the calibration set affects partial least squares (PLS) regression models. A total of 124 soil samples from the 0–20 and 20–40 cm layers were collected from fields with different cropping systems. Total Zn, Cu, Pb, Cr and Ni concentrations were determined by atomic absorption spectrometry. Spectra of air-dried soil samples were obtained using an FT-NIR spectrometer (spectral range 700–2,500 nm). PLS calibration models were developed with full cross-validation using calibration sets of 90 %, 80 %, 70 % and 60 % of the 124 samples. All models were validated against the same prediction set of 12 samples. Validation showed that Cu was the metal best predicted by NIR spectroscopy. Prediction was less accurate for Zn, Pb and Ni, for which model performance was classified as allowing discrimination between high and low concentrations and as approximately quantitative. Cr had the worst model performance in both cross-validation and prediction. The root mean square error of cross-validation (RMSEcv) increased as the number of samples in the calibration set decreased, which was particularly clear for Cu, Pb, Ni and Cr. A similar tendency was observed in prediction, where the root mean square error of prediction (RMSEP) increased as the number of calibration samples decreased, particularly for Pb, Ni and Cr. This tendency was not clear for Zn, and for Cu RMSEP even increased with calibration set size. We conclude that NIR spectroscopy can be used to measure heavy metals in a sample set covering different soil types, provided that a sufficient number of soil samples (depending on variability) is used in the calibration set.
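The calibration-size experiment described above can be illustrated with a minimal sketch in Python with NumPy. It is not the software or the exact procedure used in the study. The following are assumptions made only for illustration: "full cross-validation" is read as leave-one-out, the PLS model is a textbook single-response (PLS1) NIPALS fit, the number of latent variables is fixed at three, the calibration subsets are nested random draws, and the spectra come from a hypothetical `make_synthetic` generator rather than from measured soil data. All function names are likewise hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_comp):
    """Fit a single-response PLS model with NIPALS; returns (x_mean, y_mean, beta)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = Xr.T @ yr
        w = w / np.linalg.norm(w)        # weight vector
        t = Xr @ w                       # scores
        tt = t @ t
        p = Xr.T @ t / tt                # X loadings
        c = yr @ t / tt                  # y loading
        Xr = Xr - np.outer(t, p)         # deflate X
        yr = yr - c * t                  # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)   # regression coefficients
    return x_mean, y_mean, beta

def pls1_predict(model, X):
    x_mean, y_mean, beta = model
    return (X - x_mean) @ beta + y_mean

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def rmse_cv(X, y, n_comp):
    """Leave-one-out cross-validation error (assumed meaning of 'full')."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        model = pls1_fit(X[keep], y[keep], n_comp)
        preds[i] = pls1_predict(model, X[i:i + 1])[0]
    return rmse(y, preds)

def size_study(X, y, n_pred=12, fractions=(0.9, 0.8, 0.7, 0.6), n_comp=3, seed=0):
    """RMSEcv and RMSEP for shrinking calibration sets and one fixed prediction set."""
    order = np.random.default_rng(seed).permutation(len(y))
    pred_idx, pool = order[:n_pred], order[n_pred:]
    results = {}
    for frac in fractions:
        # Calibration size as a fraction of all samples, capped by the pool.
        n_cal = min(len(pool), round(frac * len(y)))
        cal_idx = pool[:n_cal]           # nested subsets (an assumption)
        model = pls1_fit(X[cal_idx], y[cal_idx], n_comp)
        results[frac] = {
            "n_cal": n_cal,
            "rmsecv": rmse_cv(X[cal_idx], y[cal_idx], n_comp),
            "rmsep": rmse(y[pred_idx], pls1_predict(model, X[pred_idx])),
        }
    return results

def make_synthetic(n=124, n_wavelengths=40, seed=1):
    """Hypothetical stand-in for spectra and reference values (not real data)."""
    rng = np.random.default_rng(seed)
    scores = rng.normal(size=(n, 3))
    loadings = rng.normal(size=(3, n_wavelengths))
    X = scores @ loadings + 0.05 * rng.normal(size=(n, n_wavelengths))
    y = scores @ np.array([1.0, -0.5, 0.3]) + 0.1 * rng.normal(size=n)
    return X, y

X_demo, y_demo = make_synthetic()
for frac, row in size_study(X_demo, y_demo).items():
    print(f"{frac:.0%}: n_cal={row['n_cal']}, "
          f"RMSEcv={row['rmsecv']:.3f}, RMSEP={row['rmsep']:.3f}")
```

Holding the prediction set fixed while only the calibration set shrinks means that differences in RMSEP between runs reflect calibration size rather than a change in the samples being predicted. In practice the number of latent variables would normally be chosen per model, for example by minimising RMSEcv, instead of being fixed as it is here.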