Abstract

We analyze the amount of data needed to carry out various model-based recognition tasks in the context of a probabilistic data collection model. We focus on objects that may be described as semi-algebraic subsets of a Euclidean space. This is a very rich class that includes polynomially described bodies, as well as polygonal objects, as special cases. The class of object transformations considered is wide, and includes perspective and affine transformations of 2D objects, and perspective projections of 3D objects. We derive upper bounds on the number of data features (associated with non-zero spatial error) that provably suffice for drawing reliable conclusions. Our bounds are based on a quantitative analysis of the complexity of the hypothesis class that one has to choose from. Our central tool is the VC-dimension, a well-studied parameter measuring the combinatorial complexity of families of sets. It turns out that these bounds grow linearly with the task complexity, measured via the VC-dimension of the class of objects one deals with. We show that this VC-dimension is at most logarithmic in the algebraic complexity of the objects and in the cardinality of the model library. Our approach borrows from computational learning theory. Both learning and recognition use evidence to infer hypotheses, but as far as we know, their similarity was not exploited previously. We draw close relations between recognition tasks and a certain learnability framework and then apply basic techniques of learnability theory to derive our sample size upper bounds. We believe that other relations between learning procedures and visual tasks exist and hope that this work will trigger further fruitful study along these lines.
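To make the scaling behavior described above concrete, the following is a minimal sketch of the classic PAC-style sample-size bound from learnability theory (in the style of Blumer et al.), which grows linearly in the VC-dimension. The function names, the constant factors, and the assumed logarithmic form of the VC-dimension in terms of library size and algebraic complexity are illustrative assumptions, not the paper's exact derivation.

```python
import math

def pac_sample_bound(vc_dim, epsilon, delta):
    """Classic PAC upper bound on sample size:
    m = (4/eps) * (vc_dim * log2(12/eps) + log2(2/delta))
    samples suffice for (epsilon, delta)-reliable selection from a
    hypothesis class of VC-dimension vc_dim.  Note the linear
    dependence on vc_dim."""
    return math.ceil((4.0 / epsilon) *
                     (vc_dim * math.log2(12.0 / epsilon) +
                      math.log2(2.0 / delta)))

def bound_for_library(library_size, algebraic_complexity,
                      epsilon=0.1, delta=0.05):
    """Hypothetical composition: if (as the abstract states) the
    VC-dimension is at most logarithmic in the algebraic complexity
    and the library cardinality, the overall sample bound grows only
    logarithmically in those quantities.  The unit constant here is
    an assumption for illustration."""
    vc_dim = math.ceil(math.log2(library_size * algebraic_complexity))
    return pac_sample_bound(vc_dim, epsilon, delta)
```

Doubling the model library, under this sketch, increases the required number of data features only by an additive term proportional to `1/epsilon`, rather than multiplicatively.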
