To promote the development of deep learning for feature matching, image registration, and three-dimensional reconstruction, we propose a method of constructing a deep learning benchmark dataset for affine-invariant feature matching. Existing images often exhibit large viewpoint differences and weakly textured regions, which cause difficulties for image matching such as few matches, uneven spatial distribution, and homogeneous matching texture. To solve this problem, we designed an algorithm for the automatic production of a benchmark dataset for affine-invariant feature matching. It combines two complementary algorithms, ASIFT (Affine-SIFT) and LoFTR (Local Feature Transformer), to significantly increase the variety of matching patches and the number of matching features and to generate quasi-dense matches. Optimized matches with a uniform spatial distribution are obtained through the hybrid constraints of a neighborhood distance threshold and maximum information entropy. We applied this algorithm to automatically construct a dataset containing 20,000 images: 10,000 ground-based close-range images, 6,000 satellite images, and 4,000 aerial images. Each image has a resolution of 1024 × 1024 pixels and is composed of 128 pairs of corresponding patches, each of 64 × 64 pixels. Finally, we trained and tested the affine-invariant deep learning model AffNet separately on our dataset and the Brown dataset. The experimental results show that the AffNet model trained on our dataset yields more matching points, a higher matching correctness rate, and a more uniform spatial distribution of matches on stereo images with large viewpoint differences and weak texture. These results verify the effectiveness of the proposed algorithm and the superiority of our dataset. In the future, our dataset will continue to expand, with the goal of becoming the most widely used international benchmark dataset for deep learning-based wide-baseline image matching.
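The hybrid match-filtering constraint mentioned above can be illustrated with a minimal sketch: candidate matches are rejected if they fall within a neighborhood distance threshold of an already selected match, and the remaining candidates are kept only when they do not decrease the Shannon entropy of the match distribution over a grid of image cells (i.e., empty cells are filled first). All function names, the grid size, and the thresholds here are illustrative assumptions, not the paper's implementation.

```python
import math

def cell_entropy(counts):
    """Shannon entropy of the match distribution over occupied grid cells."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def select_uniform_matches(points, img_size=1024, grid=16,
                           min_dist=32.0, max_keep=128):
    """Greedy selection sketch (hypothetical parameters): enforce a minimum
    distance between kept matches, and accept a candidate only if it does
    not lower the entropy of the per-cell occupancy histogram."""
    cell = img_size // grid
    kept, counts = [], {}
    for x, y in points:
        # Neighborhood-distance constraint: skip points too close to a kept one.
        if any((x - kx) ** 2 + (y - ky) ** 2 < min_dist ** 2 for kx, ky in kept):
            continue
        key = (int(x) // cell, int(y) // cell)
        trial = dict(counts)
        trial[key] = trial.get(key, 0) + 1
        # Maximum-entropy constraint: occupancy entropy must not decrease.
        if cell_entropy(trial) >= cell_entropy(counts):
            kept.append((x, y))
            counts = trial
        if len(kept) >= max_keep:
            break
    return kept
```

With 256 candidates placed one per 64 × 64 cell, this greedy pass keeps 128 matches in 128 distinct cells, matching the patch-pair count per image described above.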