Valid simulation models play a critical role in enhancing the efficiency of development processes and minimizing experimental effort. Ensuring accurate predictions through model discrimination, model calibration, and validation (MoCaVal) has therefore become increasingly important and is a necessary step for analyzing the behavior and evaluating the effectiveness of systems during their initial design stages. Furthermore, meeting the climate objectives requires that heating in the building sector be provided predominantly by heat pumps. The effectiveness of a heat pump hinges on the performance of its compressor, so valid compressor models are required. However, there is currently no universal compressor model that applies to all refrigerants and designs; every change of refrigerant or design therefore requires the development or adaptation of models. Since creating new models is a time-consuming procedure, automated techniques are advantageous.

This paper compares two automated model development methods, Optimal Experimental Design (OED) and Machine Learning (ML), for creating valid simulation models within the MoCaVal framework. OED is used to calibrate existing simulation models, while ML is employed to develop new models. The comparison is based on a fixed-speed scroll compressor of 4 kW nominal power, using R410A, with 51 measurement points. Using the Full Factorial Plan (FFP) and the D-Optimal Experimental Design (D-OED), predefined models for mass flow rate, electrical power, and isentropic and volumetric efficiency were calibrated. In addition, ML models were trained on the FFP using support vector regression. The OED reduces experimental effort by 75–90 % compared to the FFP while only slightly increasing the average uncertainty by 0.30–0.79 %. For the FFP, MoCaVal achieved uncertainties of 0.52–3.9 % with the calibrated models and 0.34–0.56 % with the ML models.
Within the design space, the ML models therefore outperform the calibrated models based on the FFP and the OED in terms of uncertainty. However, since the ML models were trained on the FFP, additional research is needed to demonstrate their potential for reducing test time and for extrapolating results.
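To illustrate how a D-optimal design can cut experimental effort relative to a full factorial plan, the sketch below greedily selects a small subset of a candidate grid so as to maximize the information determinant det(XᵀX) of the design matrix. The bilinear regressor basis, the operating envelope, the run count, and the ridge-regularized greedy criterion are illustrative assumptions for this sketch, not the compressor models or the selection algorithm used in the paper.

```python
from itertools import product

def regressors(te, tc):
    # Bilinear basis over evaporating/condensing temperature; an
    # illustrative stand-in for the paper's predefined compressor models.
    return [1.0, te, tc, te * tc]

def det(m):
    # Determinant via Gaussian elimination with partial pivoting.
    n = len(m)
    a = [row[:] for row in m]
    d = 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(a[r][i]))
        if a[p][i] == 0.0:
            return 0.0
        if p != i:
            a[i], a[p] = a[p], a[i]
            d = -d
        d *= a[i][i]
        for r in range(i + 1, n):
            f = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= f * a[i][c]
    return d

def info_det(points, ridge=0.0):
    # det(X^T X + ridge*I) for the design matrix X built from points.
    k = 4
    xtx = [[ridge if i == j else 0.0 for j in range(k)] for i in range(k)]
    for te, tc in points:
        x = regressors(te, tc)
        for i in range(k):
            for j in range(k):
                xtx[i][j] += x[i] * x[j]
    return det(xtx)

def d_optimal(candidates, n_runs, ridge=1e-3):
    # Greedy forward selection: repeatedly add the candidate that most
    # increases the regularized information determinant. The ridge term
    # keeps the criterion well defined while fewer than four points
    # (the number of model parameters) have been chosen.
    chosen = []
    for _ in range(n_runs):
        best = max(candidates, key=lambda p: info_det(chosen + [p], ridge))
        chosen.append(best)
    return chosen

# Hypothetical full factorial candidate grid over the operating envelope:
# evaporating temperature -10..15 degC, condensing temperature 30..60 degC.
grid = [(te, tc) for te, tc in product(range(-10, 16, 5), range(30, 61, 10))]
design = d_optimal(grid, n_runs=8)  # 8 test runs instead of all 24
```

In practice, dedicated OED software typically uses exchange algorithms (e.g., Fedorov exchange) rather than pure forward selection, but the D-optimality criterion being maximized is the same.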