Abstract

We present the LHI dataset, a large-scale ground-truth image dataset, together with a top-down/bottom-up scheme for scheduling the inference processes in stochastic image grammar (SIG). Developing stochastic image grammars requires ground-truth image data for diverse training and evaluation purposes, which can only be collected by manually annotating thousands of images across a variety of object categories. This task is too time-consuming for each research lab to undertake independently, so a centralized, general-purpose ground-truth dataset is much needed. In response to this need, the Lotus Hill Institute (LHI), an independent non-profit research institute in China, was founded in the summer of 2005. It has a full-time annotation team that parses image structures and a development team that builds the annotation tools and the database. Each image or object is parsed, semi-automatically, into a parse graph in which the relations between components are specified and objects are named following the WordNet standard. The Lotus Hill Institute has now parsed over 500,000 images (or video frames), covering 280 object categories. On the computing side, we present a method for scheduling bottom-up and top-down processes in image parsing with an and-or graph (AoG), improving performance and speeding up on-line computation. For each node in an AoG, two types of bottom-up computing processes and one type of top-down computing process are identified.
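To make the annotation format concrete, below is a minimal sketch of a parse-graph node as the abstract describes it: each object or part carries a WordNet-style name, decomposes into child nodes, and records the relations among its components. All class and field names here are illustrative assumptions; the actual LHI annotation schema is not specified in the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical parse-graph node (names are illustrative, not the LHI schema).
# An image is decomposed hierarchically: each node names an object or part
# with a WordNet synset and lists the relations holding among its children.
@dataclass
class ParseNode:
    synset: str                                    # WordNet name, e.g. "car.n.01"
    children: List["ParseNode"] = field(default_factory=list)
    # Relations among children as (subject, relation, object) triples.
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

# A toy parse graph for a car image.
car = ParseNode(
    synset="car.n.01",
    children=[ParseNode("wheel.n.01"), ParseNode("car_window.n.01")],
    relations=[("wheel.n.01", "attached_to", "car.n.01")],
)
```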
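The abstract does not detail how the two bottom-up process types and the one top-down type are coordinated, so the following is only a plausible scheduling sketch: a best-first agenda that repeatedly runs the pending process with the highest estimated gain per unit cost, where each run may enqueue follow-up processes at other AoG nodes. The process labels and the gain/cost scoring are assumptions for illustration, not the authors' method.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical pending computation at one AoG node. Per the abstract, each
# node has two bottom-up process types (here: detecting the node directly
# from image features, or binding it from already-computed children) and one
# top-down type (predicting missing children from an accepted parent).
@dataclass(order=True)
class Process:
    priority: float                           # negative gain/cost: heap pops best first
    node: str = field(compare=False)
    kind: str = field(compare=False)          # "detect" | "bind" | "predict"
    run: Callable[[], List["Process"]] = field(compare=False, default=lambda: [])

def schedule(initial: List[Process]) -> None:
    """Best-first agenda over bottom-up and top-down processes."""
    agenda = list(initial)
    heapq.heapify(agenda)
    while agenda:
        proc = heapq.heappop(agenda)          # lowest priority = highest gain/cost
        for follow_up in proc.run():          # running a process may spawn others
            heapq.heappush(agenda, follow_up)

# Example: a bottom-up detection that, once run, triggers a top-down
# prediction at the parent node (all numbers are made up).
predict = Process(priority=-0.5, node="car", kind="predict")
detect = Process(priority=-2.0, node="wheel", kind="detect", run=lambda: [predict])
schedule([detect])
```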
