Analysis and decision-making based on high-resolution spatiotemporal imagery from unmanned aerial vehicles (UAVs) have recently gained significant attention in smart agriculture. Constructing a spatiotemporal dataset requires multiple UAV image mosaics captured at different times. Because weather conditions and UAV flight trajectories vary between acquisitions, the mosaics are typically misaligned with one another. This paper proposes a two-step approach, consisting of global and local alignment, for the spatiotemporal alignment of two wide-area, high-resolution UAV mosaics. The first step, global alignment, finds a projection matrix that provides an initial mapping from keypoints in the source mosaic to their matched counterparts in the target mosaic. The second step, local alignment, refines the result of the global alignment: the proposed method splits the input mosaics into patches and applies an individual transformation to each patch to reduce the remaining local misalignments at the patch level. Such independent local alignments may introduce new artifacts at patch boundaries, so the proposed method uses a simple yet effective technique to suppress those artifacts without sacrificing the benefit of the local alignment. Extensive experiments on several datasets covering highland fields and plains in South Korea validate the proposed method. Compared with a recent work, the proposed method improves alignment accuracy by up to 13.21% on these datasets.
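
As a rough illustration of the two-step idea described above, the following Python sketch estimates a global projection matrix from matched keypoints and then computes one residual correction per patch. It is not the paper's implementation. The function names (`estimate_homography`, `apply_homography`, `patch_translations`), the plain direct linear transform without outlier rejection or coordinate normalization, and the translation-only patch model are all assumptions made for illustration. The boundary-artifact suppression step is omitted because its details are not given here.

```python
import numpy as np


def estimate_homography(src, dst):
    """Global step (sketch): direct linear transform for dst ~ H @ src.

    src, dst: (N, 2) arrays of matched keypoints, N >= 4.
    Assumes clean matches; a real pipeline would add outlier rejection.
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def apply_homography(h, pts):
    """Map (N, 2) points through the 3x3 projection matrix h."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ h.T
    return mapped[:, :2] / mapped[:, 2:3]


def patch_translations(src, dst, h, extent, patch):
    """Local step (simplified assumption): one mean residual shift per patch.

    extent: (width, height) of the target mosaic; patch: patch side length.
    Returns an array of shape (rows, cols, 2); patches without keypoints get 0.
    """
    warped = apply_homography(h, src)
    residual = dst - warped  # misalignment left over after the global step
    nx = int(np.ceil(extent[0] / patch))
    ny = int(np.ceil(extent[1] / patch))
    col = np.clip((warped[:, 0] // patch).astype(int), 0, nx - 1)
    row = np.clip((warped[:, 1] // patch).astype(int), 0, ny - 1)
    shifts = np.zeros((ny, nx, 2))
    for r in range(ny):
        for c in range(nx):
            mask = (row == r) & (col == c)
            if mask.any():
                shifts[r, c] = residual[mask].mean(axis=0)
    return shifts
```

In this sketch, each patch would be shifted by its own entry of the returned array after the global warp. Because neighboring patches can receive different shifts, seams may appear at their borders, which is the kind of boundary artifact the abstract says the proposed method suppresses.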