Owing to radiometric and geometric distortions between images, mismatches are inevitable, so a mismatch removal step is required to improve matching accuracy. Although deep learning methods have been shown to outperform handcrafted methods in specific scenarios, including image identification and point cloud classification, most learning methods are supervised and thus susceptible to labeling errors, and labeling data is time-consuming. This paper takes advantage of deep reinforcement learning (DRL) and proposes a framework named unsupervised learning for mismatch removal (ULMR). Using DRL, ULMR first scores each state–action pair, guided by the output of a classification network; it then calculates the policy gradient of the expected reward; finally, by maximizing the expected reward over state–action pairs, the optimal network is obtained. Compared to supervised learning methods (e.g., NM-Net and LFGC), unsupervised learning methods (e.g., ULCM), and handcrafted methods (e.g., RANSAC and GMS), ULMR achieves higher precision, retains more correct matches, and leaves fewer false matches in testing experiments. Moreover, ULMR shows greater stability, better accuracy, and higher quality in application experiments, and requires fewer sampling iterations while remaining compatible with other classification networks in ablation experiments, indicating its great potential for further use.
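To make the training procedure concrete, below is a minimal REINFORCE-style sketch of the policy-gradient step the abstract describes, written in PyTorch. The network architecture (`MatchClassifier`), the 4-D match encoding, and the `compute_reward` function are illustrative assumptions, not the paper's actual components; in ULMR the reward is guided by the classification network's output, whereas the dummy geometric term here merely lets the sketch run end-to-end.

```python
# Minimal sketch (not the paper's implementation): a classification network
# scores each putative match, keep/discard actions are sampled per match,
# and the expected reward is maximized via the REINFORCE policy gradient.
import torch
import torch.nn as nn

class MatchClassifier(nn.Module):
    """Toy per-match scorer: maps an (x1, y1, x2, y2) correspondence
    to a keep-probability. Architecture is a placeholder."""
    def __init__(self, in_dim: int = 4, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, matches: torch.Tensor) -> torch.Tensor:
        # matches: (N, 4) putative correspondences -> (N,) keep-probabilities
        return torch.sigmoid(self.net(matches)).squeeze(-1)

def compute_reward(matches: torch.Tensor, actions: torch.Tensor) -> torch.Tensor:
    """Placeholder reward. A real reward would score the geometric consistency
    (e.g., epipolar residuals) of the kept subset; here we reward low variance
    of the displacement vectors among kept matches as a dummy stand-in."""
    kept = matches[actions.bool()]
    if kept.shape[0] < 2:               # variance undefined for < 2 samples
        return torch.tensor(0.0)
    disp = kept[:, 2:] - kept[:, :2]    # per-match displacement vectors
    return -disp.var(dim=0).sum()

def reinforce_step(model, optimizer, matches, n_samples: int = 8):
    """One policy-gradient update: sample keep/discard actions, score the
    sampled state-action pairs, and ascend the REINFORCE estimate of the
    expected reward (negated, since the optimizer minimizes)."""
    probs = model(matches)                        # (N,) per-match scores
    dist = torch.distributions.Bernoulli(probs)   # keep/discard policy
    losses = []
    for _ in range(n_samples):
        actions = dist.sample()                   # (N,) binary actions
        reward = compute_reward(matches, actions) # no grad path to model
        log_prob = dist.log_prob(actions).sum()   # log-likelihood of subset
        losses.append(-reward * log_prob)         # negate to maximize reward
    loss = torch.stack(losses).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    torch.manual_seed(0)
    model = MatchClassifier()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    fake_matches = torch.randn(100, 4)            # stand-in for putative matches
    for step in range(5):
        print(f"step {step}: loss = {reinforce_step(model, opt, fake_matches):.4f}")
```

Averaging the surrogate loss over several sampled action sets per update reduces the variance of the gradient estimate, which is consistent with the abstract's observation that the method can operate with fewer sampling iterations; a learned or running-mean baseline subtracted from the reward would reduce variance further, though it is omitted here for brevity.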