Data-driven machine learning force fields (MLFs) are more and more popular in atomistic simulations and exploit machine learning methods to predict energies and forces for unknown structures based on the knowledge learned from an existing reference database. The latter usually comes from density functional theory calculations. One main drawback of MLFs is that physical laws are not incorporated in the machine learning models, and instead, MLFs are designed to be very flexible to simulate complex quantum chemistry potential energy surface (PES). In general, MLFs have poor transferability, and hence, a very large trainset is required to span all the target feature space to get a reliable MLF. This procedure becomes more troublesome when the PES is complicated, with a large number of degrees of freedom, in which building a large database is inevitable and very expensive, especially when accurate but costly exchange-correlation functionals have to be used. In this manuscript, we exploit a high-dimensional neural network potential (HDNNP) on Pt clusters of sizes from 6 to 20 as one example. Our standard level of energy calculation is DFT GGA (PBE) using a plane wave basis set. We introduce an approximate but fast level with the PBE functional and a minimal atomic orbital basis set, and then, a more accurate but expensive level, using a hybrid functional or nonlocal vdW functional and a plane wave basis set, is reliably predicted by learning the difference with HDNNP. The results show that such a differential approach (named ΔHDNNP) can deliver very accurate predictions (error <10 meV/atom) in reference to converged basis set energies as well as more accurate but expensive xc functionals. The overall speedup can be as large as 900 for a 20 atom Pt cluster. More importantly, ΔHDNNP shows much better transferability due to the intrinsic smoothness of the delta potential energy surface, and accordingly, one can use much smaller trainset data to obtain better accuracy than the conventional HDNNP. A multilayer ΔHDNNP is thus proposed to obtain very accurate predictions versus expensive nonlocal vdW functional calculations in which the required trainset is further reduced. The approach can be easily generalized to any other machine learning methods and opens a path to study the structure and dynamics of Pt clusters and nanoparticles.