The performance of matching and object recognition methods based on interest points depends on both the properties of the underlying interest points and the choice of associated image descriptors. This paper demonstrates advantages of using generalized scale-space interest point detectors in this context for selecting a sparse set of points for computing image descriptors for image-based matching. For detecting interest points at any given scale, we make use of the Laplacian $$\nabla ^2_{norm} L$$, the determinant of the Hessian $$\det {\mathcal {H}}_{norm} L$$ and four new unsigned or signed Hessian feature strength measures $${\mathcal {D}}_{1,norm} L$$, $$\tilde{\mathcal {D}}_{1,norm} L$$, $${\mathcal {D}}_{2,norm} L$$ and $$\tilde{\mathcal {D}}_{2,norm} L$$, which are defined by generalizing the definitions of the Harris and Shi-and-Tomasi operators from the second moment matrix to the Hessian matrix. Then, feature selection over different scales is performed either by scale selection from local extrema over scale of scale-normalized derivatives or by linking features over scale into feature trajectories and computing a significance measure from an integrated measure of normalized feature strength over scale. A theoretical analysis is presented of the robustness of the differential entities underlying these interest points under image deformations, in terms of invariance properties under affine image deformations or approximations thereof.
Disregarding the effect of the rotationally symmetric scale-space smoothing operation, the determinant of the Hessian $$\det {\mathcal {H}}_{norm} L$$ is a truly affine covariant differential entity, and the Hessian feature strength measures $${\mathcal {D}}_{1,norm} L$$ and $$\tilde{\mathcal {D}}_{1,norm} L$$ have a major contribution from the affine covariant determinant of the Hessian. This implies that local extrema of these differential entities will be more robust under affine image deformations than local extrema of the Laplacian operator or the Hessian feature strength measures $${\mathcal {D}}_{2,norm} L$$ and $$\tilde{\mathcal {D}}_{2,norm} L$$. It is shown how these generalized scale-space interest points allow for a higher ratio of correct matches and a lower ratio of false matches compared to previously known interest point detectors within the same class. The best results are obtained using interest points computed with scale linking, with the new Hessian feature strength measures $${\mathcal {D}}_{1,norm} L$$, $$\tilde{\mathcal {D}}_{1,norm} L$$ and the determinant of the Hessian $$\det {\mathcal {H}}_{norm} L$$ being the differential entities that lead to the best matching performance under perspective image transformations with significant foreshortening, better than the more commonly used Laplacian operator, its difference-of-Gaussians approximation or the Harris-Laplace operator. We propose that these generalized scale-space interest points, when accompanied by associated local scale-invariant image descriptors, should allow for better performance of interest point based methods for image-based matching, object recognition and related visual tasks.
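
As a rough illustration of the differential entities named above, the following Python sketch evaluates them at a single scale $$t = \sigma ^2$$ from Gaussian derivatives. It is a minimal sketch under stated assumptions, not the implementation used in the paper: the function name `hessian_measures`, the use of SciPy, the normalization by $$t$$ per derivative order pair, and the value `k = 0.04` are all illustrative choices. In particular, the forms used for the unsigned measures (a Harris-like $$\det - k\,\mathrm{trace}^2$$ clipped at zero, and a Shi-and-Tomasi-like smallest absolute eigenvalue) are inferred from the description that these measures generalize the Harris and Shi-and-Tomasi operators to the Hessian matrix; the signed variants, scale selection and scale linking are omitted.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def hessian_measures(image, sigma, k=0.04):
    """Scale-normalized Hessian-based measures at scale t = sigma**2.

    Hypothetical helper for illustration only; k is an assumed constant.
    """
    image = np.asarray(image, dtype=float)
    t = sigma ** 2

    # Second-order Gaussian derivatives; `order` is given per axis as (rows, cols).
    Lxx = gaussian_filter(image, sigma, order=(0, 2))
    Lyy = gaussian_filter(image, sigma, order=(2, 0))
    Lxy = gaussian_filter(image, sigma, order=(1, 1))

    # Scale-normalized Laplacian and determinant of the Hessian.
    laplacian = t * (Lxx + Lyy)
    det_hessian = t ** 2 * (Lxx * Lyy - Lxy ** 2)

    # Assumed Harris-like measure on the Hessian, clipped at zero (unsigned).
    d1 = np.maximum(det_hessian - k * laplacian ** 2, 0.0)

    # Assumed Shi-and-Tomasi-like measure: smallest absolute eigenvalue
    # of the scale-normalized Hessian.
    half_trace = 0.5 * laplacian
    root = np.sqrt(np.maximum(half_trace ** 2 - det_hessian, 0.0))
    d2 = np.minimum(np.abs(half_trace + root), np.abs(half_trace - root))

    return laplacian, det_hessian, d1, d2
```

Interest point candidates at a fixed scale would then be taken as local spatial extrema of one of these arrays; repeating the computation over a range of `sigma` values gives the stack over which scale selection or scale linking operates.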