For zoom micro-vision systems containing an electrically tunable lens (ETL), traditional calibration methods are usually ineffective or inefficient because of the limited field of view (FOV) and shallow depth of field (DOF). This article proposes a novel calibration method that overcomes these problems. First, a geometric imaging model of the zoom micro-vision system that accounts for the input focal power is established. Next, a dedicated image acquisition scheme and a corresponding feature extraction algorithm developed in the frequency domain are presented, which greatly reduce the calibration workload. Methods for computing the initial values of the imaging-model parameters and for optimizing these parameters globally are then described. Finally, calibration experiments, including comparisons with the traditional multi-focus and fitting methods, are designed and carried out. The experimental results show that the proposed method achieves good consistency and accuracy when calibrating at different focal intervals, with an average reprojection error of less than 0.25 pixels. The comparative experiments show that the traditional multi-focus and fitting methods not only require more calibration images and an auxiliary high-precision positioning stage, but also have poor applicability to zoom micro-vision systems in practice.
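To make the reported metric concrete, the following minimal sketch shows how an average reprojection error could be computed for an imaging model that depends on the input focal power. It is an illustration under stated assumptions, not the model established in the article: the affine relation between focal length and focal power, the parameter names `f0`, `f1`, `cx`, `cy`, and the functions `project` and `mean_reprojection_error` are all hypothetical.

```python
import numpy as np


def project(points_3d, focal_power, params):
    """Project camera-frame 3D points to pixels with a pinhole model.

    Assumption (not from the article): the focal length in pixels varies
    affinely with the ETL focal power, f = f0 + f1 * focal_power.
    Lens distortion is ignored in this sketch.
    """
    f = params["f0"] + params["f1"] * focal_power
    x = points_3d[:, 0] / points_3d[:, 2]  # normalized image coordinates
    y = points_3d[:, 1] / points_3d[:, 2]
    return np.stack([f * x + params["cx"], f * y + params["cy"]], axis=1)


def mean_reprojection_error(points_3d, observed_px, focal_power, params):
    """Mean Euclidean distance (pixels) between predicted and observed features."""
    predicted = project(points_3d, focal_power, params)
    return float(np.mean(np.linalg.norm(predicted - observed_px, axis=1)))


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the functions.
    params = {"f0": 5000.0, "f1": 100.0, "cx": 640.0, "cy": 480.0}
    points = np.array([[0.1, 0.2, 10.0], [-0.3, 0.1, 12.0]])
    observed = project(points, 2.0, params) + np.array([0.3, 0.4])
    print(mean_reprojection_error(points, observed, 2.0, params))
```

In a calibration pipeline, a quantity of this kind would typically be averaged over all feature points and all focal power settings, and would also serve as the cost minimized during global optimization of the parameters.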