Abstract

Author(s): Wu, Ying N; Zhu, Song-Chun; Guo, Cheng-en | Abstract: Computer vision can be considered a highly specialized data collection and data analysis problem. We need to understand the special properties of image data in order to construct statistical models for representing the wide variety of image patterns. One special property that distinguishes image data from other sensory data, such as speech, is that distance or scale plays a profound role. More specifically, visual objects and patterns can appear at a wide range of distances or scales, and the same visual pattern appearing at different distances or scales produces different image data with different statistical properties, and thus calls for different regimes of statistical models. In particular, we show that the entropy rate of the image data changes with the viewing distance (as well as the camera resolution). Moreover, the inferential uncertainty changes with viewing distance too. We call these changes information scaling. From this perspective, we examine both empirically and theoretically two prominent and yet largely isolated research themes in the image modeling literature, namely, wavelet sparse coding and Markov random fields. Our results indicate that the two models are appropriate for two different entropy regimes: sparse coding targets the low entropy regime, whereas random fields are suitable for the high entropy regime. Because of information scaling, both models are necessary for representing and interpreting image intensity patterns over the whole entropy range, and information scaling triggers transitions between these two regimes of models. This motivates us to propose a full-zoom primal sketch model that integrates both sparse coding and Markov random fields. In this model, local image intensity patterns are classified into a “sketchable regime” and a “non-sketchable regime” by a sketchability criterion.
In the sketchable regime, the image data are represented deterministically by highly parametrized sketch primitives. In the non-sketchable regime, the image data are characterized by Markov random fields whose sufficient statistics summarize computational results from failed attempts at sparse coding. The contribution of our work is twofold. First, information scaling provides a dimension to chart the space of natural images. Second, the full-zoom modeling scheme provides a natural integration of sparse coding and Markov random fields, and thus enables us to develop a new and richer class of statistical models.
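
As a rough illustration of the information scaling idea described above, the following sketch measures how a per-pixel entropy changes when a sparse pattern is viewed from farther away. It is a toy under stated assumptions, not the authors' experiment: the synthetic pattern, the block-summing model of zooming out, the function names, and the use of the marginal histogram entropy as a crude stand-in for the entropy rate are all choices made here for illustration.

```python
import math
from collections import Counter


def make_sparse_image(size=64):
    """Toy binary image: isolated bright pixels on a dark background.

    The lattice rule below is a hypothetical pattern chosen only so that
    the example is deterministic.
    """
    return [[1 if (7 * i + 13 * j) % 31 == 0 else 0 for j in range(size)]
            for i in range(size)]


def zoom_out(image, s):
    """Mimic a larger viewing distance by summing non-overlapping s x s blocks.

    Summing is equivalent to averaging up to a constant factor, so the
    histogram of values (and hence its entropy) is the same either way.
    """
    n = len(image) // s
    return [[sum(image[s * bi + di][s * bj + dj]
                 for di in range(s) for dj in range(s))
             for bj in range(n)]
            for bi in range(n)]


def marginal_entropy(image):
    """Shannon entropy, in bits per pixel, of the pixel-value histogram."""
    counts = Counter(v for row in image for v in row)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


image = make_sparse_image()
entropy_by_scale = {s: marginal_entropy(zoom_out(image, s)) for s in (1, 2, 4)}
for s, h in entropy_by_scale.items():
    print(f"scale {s}: {h:.3f} bits per pixel")
```

In this toy setting the per-pixel entropy rises as the scale factor grows, because each coarse pixel pools several sparse events and becomes less predictable. This is consistent in spirit with the shift from a low entropy regime to a high entropy regime described in the abstract, although a real entropy rate would also account for dependence between neighboring pixels.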

