To meet the demands of many customers, cloud data centers often have to provision a large number of virtual machines (VMs) simultaneously. Provisioning is usually time-consuming because the virtual machine image (VMI) files that must be transferred over the network are large. To address this issue, researchers have attempted to leverage the content similarity among different VMI files to reduce the volume of transferred data. Another important issue in VM provisioning is the VM packing problem, which seeks to minimize the number of physical machines (PMs) used. In this paper, our goal is to find a solution that packs VMs onto as few PMs as possible while also significantly reducing the total amount of transferred data. We formally define the joint problem of VM packing and minimizing data transfer in VM provisioning, which we name RTVD-VA. We first propose an approximation algorithm that minimizes the amount of transferred VMI data when provisioning K VMs of the same size to a single PM. We then extend the algorithm to the scenario of multiple PMs, subject to using the minimum number of PMs. Based on these two approximation algorithms, we propose a heuristic algorithm, namely Balance-Placement, to solve the problem in the general case. Our simulation results show that Balance-Placement outperforms existing solutions such as PSO and Greedy-Cache, achieving the least amount of transferred data and the minimum number of used PMs in most scenarios.
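
As a rough illustration of the data-transfer objective, the sketch below assumes that each VMI can be modeled as a set of chunk identifiers and that a PM needs to receive each distinct chunk only once. This chunk model, the function name `transferred_chunks`, and the toy images are hypothetical and are not taken from the paper; the code is not Balance-Placement itself.

```python
from typing import Dict, List, Set


def transferred_chunks(placement: Dict[str, List[str]],
                       images: Dict[str, Set[str]]) -> int:
    """Count chunks sent over the network under the assumed model:
    each PM receives the union of the chunks of the VMIs placed on it."""
    total = 0
    for vms in placement.values():
        needed: Set[str] = set()
        for vm in vms:
            needed |= images[vm]  # chunks shared by co-located VMIs count once
        total += len(needed)
    return total


# Hypothetical toy images: vm_a/vm_b are similar, and so are vm_c/vm_d.
images = {
    "vm_a": {"c1", "c2", "c3"},
    "vm_b": {"c1", "c2", "c4"},
    "vm_c": {"c5", "c6", "c7"},
    "vm_d": {"c5", "c6", "c8"},
}

# Two placements that both use exactly two PMs.
similar_together = {"pm1": ["vm_a", "vm_b"], "pm2": ["vm_c", "vm_d"]}
similar_apart = {"pm1": ["vm_a", "vm_c"], "pm2": ["vm_b", "vm_d"]}

print(transferred_chunks(similar_together, images))
print(transferred_chunks(similar_apart, images))
```

Both placements use the same number of PMs, yet under this assumed model co-locating similar images transfers fewer chunks, which is the kind of trade-off the joint packing and transfer problem has to account for.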