We present a system that computes an integrated description of an object from multiple range images. The object description takes the form of a boundary representation (B-rep), a result that has not previously been achieved by the computer vision community. To do so, we emphasize the inherent difficulties and ambiguities in low- to mid-level vision and present novel techniques for resolving them. In this system, each view of the object is represented as an attributed graph, where nodes correspond to the surfaces (vertices) and links represent the relationships between surfaces. The main issue in surface extraction is contour closure, which is formulated as a dynamic network. The underlying principles of this network are weak smoothness and geometric cohesion, modeled as the interaction between long-term and short-term variables. Long-term variables represent the initial boundary grouping computed from the low-level surface features, and short-term variables represent the competing hypotheses that cooperate with the long-term variables. The matching problem involves matching visible surfaces and vertices, and provides the necessary basis for volumetric reconstruction from multiple views. The matching strategy is a two-step process in which each step uses a Hopfield network. At each step, we specify a set of local, adjacency, and global constraints and define an appropriate energy function to be minimized. At the first level of this hierarchy, surface patches are matched and the rigid transformation is computed. At the second level, the mapping is refined by matching the corresponding vertices, and the transformation is verified. The multiple-view reconstruction consists of two steps. First, we build a composite graph that contains the bounding surfaces and their corresponding attributes; second, we intersect these surfaces so that the edges and vertices corresponding to the B-rep description are identified.
We present results on objects with planar as well as quadratically curved surfaces.
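As an illustration of the kind of energy minimization the surface-matching step describes, the following is a minimal sketch of a discrete Hopfield-style network that matches surfaces between two views. It is not the formulation used in this work: the function names (`adjacency_compat`, `hopfield_match`), the area-based local similarity, the pairwise weight, the uniqueness penalty, and the toy data are all assumptions chosen for illustration. A unit `V[i, k] = 1` means surface `i` of view A is matched to surface `k` of view B; the local term rewards similar attributes, the adjacency term rewards pairs of matches that preserve surface adjacency, and the penalty term discourages one surface from taking several partners.

```python
import numpy as np

def adjacency_compat(adj_a, adj_b):
    """W[i, k, j, l] = 1 when surfaces i, j are adjacent in view A and
    surfaces k, l are adjacent in view B (assumes zero diagonals)."""
    return np.einsum("ij,kl->ikjl", adj_a, adj_b).astype(float)

def hopfield_match(unary, pairwise, pair_weight=0.5, penalty=2.0, sweeps=50):
    """Asynchronous binary updates that never increase the energy
    E = -sum(unary * V) - (pair_weight / 2) * sum(pairwise * V x V)
        + (penalty / 2) * (number of row/column conflicts),
    assuming pairwise is symmetric with zero self-terms."""
    n, m = unary.shape
    V = np.zeros((n, m), dtype=int)
    for _ in range(sweeps):
        changed = False
        for i in range(n):
            for k in range(m):
                # Support from the local attribute and from active neighbours.
                support = unary[i, k] + pair_weight * np.sum(pairwise[i, k] * V)
                # Other active units in the same row or column.
                conflict = V[i].sum() + V[:, k].sum() - 2 * V[i, k]
                new_state = 1 if support - penalty * conflict > 0 else 0
                if new_state != V[i, k]:
                    V[i, k] = new_state
                    changed = True
        if not changed:  # reached a stable state
            break
    return V

# Toy data (hypothetical): three surfaces per view, described by area only.
area_a = np.array([1.0, 2.0, 3.0])
area_b = np.array([3.0, 1.0, 2.0])
adj_a = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
adj_b = np.array([[0, 0, 1], [0, 0, 1], [1, 1, 0]])

unary = 1.0 - np.abs(area_a[:, None] - area_b[None, :])
match = hopfield_match(unary, adjacency_compat(adj_a, adj_b))
print(match.argmax(axis=1))  # index of the view-B surface chosen for each view-A surface
```

In this toy setting the network settles on a one-to-one assignment. In a two-level scheme such as the one described above, a stable assignment of this kind would then supply the correspondences from which a transformation between views could be estimated.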