Recently, the Isomap procedure [10] was proposed as a new way to recover a low-dimensional parametrization of data lying on a low-dimensional submanifold in high-dimensional space. The method assumes that the submanifold, viewed as a Riemannian submanifold of the ambient high-dimensional space, is isometric to a convex subset of Euclidean space. This naturally raises the question: what datasets can reasonably be modeled by this condition? In this paper, we consider a special kind of image data: families of images generated by articulation of one or several objects in a scene, for example, images of a black disk on a white background with its center placed at a range of locations. The collection of all images in such an articulation family, as the parameters of the articulation vary, makes up an articulation manifold, a submanifold of L2. We study the properties of such articulation manifolds, in particular their lack of differentiability when the images have edges. Under these conditions, we show that there exists a natural renormalization of geodesic distance which yields a well-defined metric. We exhibit a list of articulation models where the corresponding manifold equipped with this new metric is indeed isometric to a convex subset of Euclidean space. Examples include translations of a symmetric object, rotations of a closed set, articulations of a horizon, and expressions of a cartoon face. The theoretical predictions from our study are borne out by empirical experiments with published Isomap code. We also note that in the case where several components of the image articulate independently, isometry may fail; for example, with several disks in an image avoiding contact, the underlying Riemannian manifold is locally isometric to an open, connected, but not convex subset of Euclidean space. Such a situation matches the assumptions of our recently proposed Hessian Eigenmaps procedure, but not those of the original Isomap procedure.
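The lack of differentiability mentioned above can be checked numerically on the disk example. The following is a minimal sketch, not the authors' code: the grid resolution `N`, the disk radius `RADIUS`, and the helper names `disk_image` and `l2_distance` are all assumptions chosen for illustration. It translates a disk by a few whole pixels and measures how the L2 distance between images grows with the shift. On a differentiable manifold the distance would grow roughly linearly for small shifts (log-log slope near 1); for an image with a sharp edge it grows like the square root of the shift (slope near 1/2), which is the kind of behavior that makes some renormalization of distance necessary.

```python
import math

N = 256          # grid resolution (assumption, for illustration only)
RADIUS = 0.25    # disk radius in units of the image side (assumption)

def disk_image(cx, cy, radius=RADIUS, n=N):
    """Binary image of a disk on the unit square: 1.0 inside, 0.0 outside."""
    coords = [(i + 0.5) / n for i in range(n)]  # pixel centers
    return [
        [1.0 if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 else 0.0 for x in coords]
        for y in coords
    ]

def l2_distance(a, b):
    """Riemann-sum approximation of the L2 distance between two images."""
    total = sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return math.sqrt(total / (len(a) * len(a[0])))

base = disk_image(0.5, 0.5)
shifts = [k / N for k in (2, 4, 8, 16)]  # whole-pixel horizontal translations
dists = [l2_distance(base, disk_image(0.5 + s, 0.5)) for s in shifts]

# Least-squares slope of log(distance) against log(shift).
lx = [math.log(s) for s in shifts]
ly = [math.log(d) for d in dists]
mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
slope = sum((u - mx) * (v - my) for u, v in zip(lx, ly)) / sum((u - mx) ** 2 for u in lx)

for s, d in zip(shifts, dists):
    print(f"shift={s:.4f}  L2 distance={d:.4f}  distance/sqrt(shift)={d / math.sqrt(s):.4f}")
print(f"log-log slope = {slope:.3f}")
```

The squared L2 distance here is the area of the symmetric difference of the two disks, which for a small shift is approximately four times the radius times the shift, so the ratio of distance to the square root of the shift should stay roughly constant across the rows printed.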