Monitoring the static and dynamic displacements of large engineering structures, such as buildings and bridges, provides quantitative information for structural safety evaluation and maintenance. Camera calibration is a key step in vision-based sensor systems for remote displacement measurement. Because large structures require a large field of view, conventional calibration methods that rely on precise calibration boards are difficult to apply. To simplify the calibration process, a modified calibration method for a binocular stereo vision system based on the epipolar constraint is proposed. Because reference points are usually absent in outdoor applications, an unmanned aerial vehicle carrying a reference marker is adopted. During its flight in the field, sequential images are captured simultaneously by the left and right imaging stations. An alternative method for determining the scale factor is also proposed, which provides adequate precision for camera calibration. Two important issues are discussed: the number of reference points and their selection with regard to the depth of view. The experimental results show that the proposed method is convenient to apply in outdoor situations and can achieve high accuracy in displacement measurement.
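As a minimal illustration of the epipolar constraint that underlies this kind of calibration, the sketch below estimates an essential matrix from matched marker observations using the standard linear eight-point algorithm. It is not the modified method proposed here. The function names (`estimate_essential`, `epipolar_residuals`, `synthetic_pair`) and the synthetic geometry (baseline, rotation angle, depth range) are hypothetical and chosen only for illustration, and the image points are assumed to be already expressed in normalized coordinates.

```python
import numpy as np


def skew(t):
    """Return the 3x3 cross-product matrix [t]_x."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])


def to_homogeneous(x):
    """Append a column of ones to an (N, 2) array of image points."""
    return np.column_stack([x, np.ones(len(x))])


def estimate_essential(xl, xr):
    """Linear eight-point estimate of E from N >= 8 normalized point pairs."""
    xl_h, xr_h = to_homogeneous(xl), to_homogeneous(xr)
    # Each pair gives one linear equation xr^T E xl = 0 in the 9 entries of E.
    A = np.einsum("ni,nj->nij", xr_h, xl_h).reshape(len(xl), 9)
    _, _, vt = np.linalg.svd(A)
    E = vt[-1].reshape(3, 3)
    # Enforce the essential-matrix structure: singular values (1, 1, 0).
    u, _, vt = np.linalg.svd(E)
    return u @ np.diag([1.0, 1.0, 0.0]) @ vt


def epipolar_residuals(E, xl, xr):
    """Return xr^T E xl for every pair; zero for perfect correspondences."""
    return np.einsum("ni,ij,nj->n", to_homogeneous(xr), E, to_homogeneous(xl))


def synthetic_pair(n=20, seed=0):
    """Simulate marker positions seen by two stations (assumed geometry)."""
    rng = np.random.default_rng(seed)
    # Marker positions in the left camera frame, spread over a range of depths.
    X = rng.uniform(low=[-10.0, -5.0, 20.0], high=[10.0, 5.0, 60.0], size=(n, 3))
    a = np.deg2rad(8.0)  # assumed relative rotation about the vertical axis
    R = np.array([[np.cos(a), 0.0, np.sin(a)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(a), 0.0, np.cos(a)]])
    t = np.array([-5.0, 0.0, 0.3])  # assumed baseline between the stations
    Xr = X @ R.T + t
    xl = X[:, :2] / X[:, 2:3]
    xr = Xr[:, :2] / Xr[:, 2:3]
    return xl, xr, skew(t) @ R


xl, xr, _ = synthetic_pair()
E = estimate_essential(xl, xr)
print("max |xr^T E xl| =", np.abs(epipolar_residuals(E, xl, xr)).max())
```

The essential matrix fixes the relative translation only up to scale, which is why a separate scale factor is required. The linear estimate also degenerates when all reference points lie in a single plane, which is one reason the depth distribution of the reference points matters.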