Wind tunnel testing of small-scale models with aerodynamic balances is a widely used and valuable methodology, particularly in key sectors such as wind energy and the aerospace industry. Previous works introduced a novel concept for a scalable, external, three-component strain gauge balance, which was subsequently validated and showed remarkable agreement with existing experimental data. However, potential applications involving complex airfoil flow environments or the examination of passive flow control devices are highly demanding in terms of precision, accuracy, and uncertainty. This research presents a complete dataset from extensive measurements and explores different calibration methods to address these challenges. Special effort was placed on describing the methodology in detail to ensure reproducibility, and the distinct requirements of each calibration method were considered in a comprehensive comparison. The results indicate that exact-solution linear calibration methods are adequate for some applications when suitable calibration loads are used. Nevertheless, for more advanced applications, third-order least-squares models offer the most accurate results and are thus recommended. Finally, this work identifies potential areas for further research, such as exploring double-sided calibration models and assessing the influence of model mounting on the balance behavior.