Abstract
To overcome the shortcomings of conventional hand-crafted features, we propose a learning-based feature extraction method for visual smoke recognition. We first slide a 3D sampling window through the scale space of each image to densely compute 3D local differences across scales, and learn a projection matrix from the 3D local differences of all training images. The projection matrix is then used to transform the 3D local differences of each image into feature maps, which naturally contain both local and holistic information. To generate more robust descriptors, we encode these feature maps in two ways: within-map and between-map encoding. Within-map encoding generates a Local Binary Pattern (LBP) map for each feature map, while between-map encoding combines pixel-wise values across different feature maps to generate a single LBP map, denoted as a cross-sign map, for every eight feature maps. To balance the contributions of the two encodings, we weight the histograms of cross-sign maps and LBP maps with different coefficients. Each computational layer comprises the calculation of 3D local differences, the learning of projection matrices, and the encoding of projected features. Several computational layers can be stacked to form a hierarchical structure that extracts multi-order, high-level features. Subsequent layers carry higher-order information on variations of local pixel values, which is more discriminative but also more sensitive to noise. To trade off discriminative ability against noise suppression, we propose Taylor-like coefficients to weight the histograms computed from different computational layers. Experimental results demonstrate that the proposed method outperforms most existing hand-crafted and learning-based feature extraction methods on both smoke recognition and texture classification.
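The two encodings and the layer weighting can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, the sign threshold at zero and the 3x3 LBP neighbourhood are assumptions, and the 1/k! layer weights are only a guess suggested by the name "Taylor-like", since the abstract does not state the actual coefficients.

```python
import math
import numpy as np


def lbp_map(feature_map):
    """Within-map encoding (assumed basic 3x3 LBP).

    Each interior pixel is compared with its eight neighbours; a neighbour
    that is >= the centre contributes one bit to an 8-bit code.
    """
    fm = np.asarray(feature_map, dtype=np.float64)
    h, w = fm.shape
    center = fm[1:-1, 1:-1]
    # Clockwise neighbour offsets starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = fm[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        code += (neighbour >= center).astype(np.int64) << bit
    return code


def cross_sign_map(feature_maps):
    """Between-map encoding: one 8-bit code per pixel from eight maps.

    Assumption: bit i is set where feature map i is non-negative.
    """
    maps = np.asarray(feature_maps, dtype=np.float64)
    if maps.ndim != 3 or maps.shape[0] != 8:
        raise ValueError("expected eight feature maps of shape (8, H, W)")
    bits = (maps >= 0).astype(np.int64)
    weights = 2 ** np.arange(8)  # 1, 2, 4, ..., 128
    return np.tensordot(weights, bits, axes=1)  # shape (H, W)


def taylor_like_weights(n_layers):
    """Hypothetical layer weights 1/k! for layers k = 1..n_layers."""
    return [1.0 / math.factorial(k) for k in range(1, n_layers + 1)]
```

Under these assumptions, a descriptor for one layer would be built by computing a 256-bin histogram of each code map (for example with `np.bincount(codes.ravel(), minlength=256)`), scaling the histograms by the chosen coefficients, and concatenating them. Decreasing weights give later, noisier layers less influence.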