Machine learning potentials (MLPs) are widely used as an efficient way to represent potential energy surfaces (PESs) in chemical simulations. MLPs are usually evaluated by their root-mean-square errors on a test set drawn from the same distribution as the training data. Here, we systematically investigate the relationship between such test errors and the accuracy of simulations with MLPs, using a full-dimensional, global PES of the amino acid glycine as an example. Our results show that test-set errors do not unambiguously reflect MLP performance in different simulation tasks, such as computing relative conformer energies, barriers, vibrational levels, and zero-point vibrational energies. We also offer an easily accessible solution for improving MLP quality in a simulation-oriented manner, which yields the most precise relative conformer energies and barriers. This solution also passes a stringent test by diffusion Monte Carlo simulations.
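
The gap between a test-set error and a simulation-level quantity can be illustrated with a toy calculation. The sketch below is not taken from the glycine study: the energies are synthetic numbers in arbitrary units, the two "models" are hypothetical, and the helper names `rmse` and `relative_energy_error` are introduced here only for illustration. It shows one simple way a model with the lower root-mean-square error can still give the worse relative conformer energy, because errors of opposite sign at two minima add up in the energy difference while a uniform shift cancels.

```python
import math


def rmse(pred, ref):
    """Root-mean-square error between predicted and reference energies."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


def relative_energy_error(pred, ref, i, j):
    """Error in the energy difference E[j] - E[i] (e.g. between two conformers)."""
    return (pred[j] - pred[i]) - (ref[j] - ref[i])


# Synthetic reference energies (arbitrary units, assumed for illustration).
# Indices 0 and 1 play the role of two hypothetical conformer minima.
ref = [0.0, 1.0, 5.0, 8.0, 12.0]

# Hypothetical model A: small errors overall, but of opposite sign at the two minima.
model_a = [0.3, 0.7, 5.0, 8.0, 12.0]

# Hypothetical model B: a larger but uniform shift, which cancels in energy differences.
model_b = [0.5, 1.5, 5.5, 8.5, 12.5]

for name, pred in (("A", model_a), ("B", model_b)):
    print(
        f"model {name}: RMSE = {rmse(pred, ref):.3f}, "
        f"relative-energy error = {relative_energy_error(pred, ref, 0, 1):+.3f}"
    )
```

In this constructed case model A has the smaller RMSE but the larger error in the conformer energy difference. Real MLP errors are not uniform shifts, so the example only indicates why the two measures need not rank models in the same order.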