Mid-level Visual Elements Research Articles

In this paper, we propose a new mid-level visual elements discovery method and apply it to the fine-grained classification. We present the duality between image patches and features extracted by the convolutional winner-take-all autoencoder (CONV-WTA-AE). The sparsity constraints used by CONV-WTA-AE make a group of objects sharing the same feature components. Hence, the image patches could be clustered by their sharing feature components and the feature components can be clustered by their co-occurrence in the image patches. We propose formulating the mid-level visual elements mining as a bipartite graph partitioning problem. The spectral partitioning algorithm is employed to co-cluster image patches and feature components. The CONV-WTA-AE is an unsupervised feature learning method. Hence, it avoids using expensive annotations. Our experiments demonstrate that the spectral partitioning method is very efficient but only the confident instances in a cluster are well discriminated. The similarity metric used by this algorithm is not accurate enough. Hence, we propose training a group of linear support vector machine (SVM) to refine the clustering results. These SVMs will be trained on the initial confident instances and provide a better discriminative similarity. Then we can re-assign instances to each clusters. To avoid overfitting, this process is iterated on many data subsets. We conduct a series of experiments on the MNIST dataset to verify our algorithm. The experimental results show that our method can discover meaningful image patch clusters. In the fine-grained classification task, visual elements are input into an ensemble of convolutional neural networks. The experiments on the CompCars dataset illustrate that our method can achieve the state-of-the-art performance.

Land-use classification using remote sensing images covers a wide range of applications. With more detailed spatial and textural information provided in very high resolution (VHR) remote sensing images, a greater range of objects and spatial patterns can be observed than ever before. This offers us a new opportunity for advancing the performance of land-use classification. In this paper, we first introduce an effective midlevel visual elementsoriented land-use classification method based on “partlets,” which are a library of pretrained part detectors used for midlevel visual elements discovery. Taking advantage of midlevel visual elements rather than low-level image features, a partlets-based method represents images by computing their responses to a large number of part detectors. As the number of part detectors grows, a main obstacle to the broader application of this method is its computational cost. To address this problem, we next propose a novel framework to train coarse-to-fine shared intermediate representations, which are termed “sparselets,” from a large number of pretrained part detectors. This is achieved by building a single-hidden-layer autoencoder and a single-hidden-layer neural network with an L0-norm sparsity constraint, respectively. Comprehensive evaluations on a publicly available 21-class VHR landuse data set and comparisons with state-of-the-art approaches demonstrate the effectiveness and superiority of this paper.

Mid-level Visual Elements Research Articles

Related Topics

Articles published on Mid-level Visual Elements

Deep sparse representation-based mid-level visual elements discovery in fine-grained classification

Learning discriminative visual elements using part-based convolutional neural network

Scene Recognition From Optical Remote Sensing Images Using Mid-Level Deep Feature Mining

Mining Mid-level Visual Patterns with Deep CNN Activations

Effective and Efficient Midlevel Visual Elements-Oriented Land-Use Classification Using VHR Remote Sensing Images

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Mid-level Visual Elements Research Articles

Related Topics

Articles published on Mid-level Visual Elements

Deep sparse representation-based mid-level visual elements discovery in fine-grained classification

Learning discriminative visual elements using part-based convolutional neural network

Scene Recognition From Optical Remote Sensing Images Using Mid-Level Deep Feature Mining

Mining Mid-level Visual Patterns with Deep CNN Activations

Effective and Efficient Midlevel Visual Elements-Oriented Land-Use Classification Using VHR Remote Sensing Images