Mid-level Patches Research Articles

Effectively extracting the human body parts and manipulated objects involved in an action is the key problem in fine-grained action recognition. However, most existing methods require intensive manual annotation to train detectors for these interaction components. In this paper, we represent videos by mid-level patches to avoid manual annotation, where each patch corresponds to an action-related interaction component. To capture mid-level patches more accurately and quickly, candidate motion regions are extracted by motion saliency. First, the motion regions containing interaction components are segmented using a threshold computed adaptively from the histogram of the motion saliency map. Second, we introduce a mid-level patch mining algorithm for interaction component detection, consisting of object proposal generation and mid-level patch detection. The object proposal generation algorithm, inspired by the Huffman algorithm, produces multi-granularity object proposals. Based on these proposals, the mid-level patch detectors are trained with K-means clustering and SVMs. Finally, we build a fine-grained action recognition model that uses a graph structure to describe the relationships between mid-level patches. To recognize actions, the model computes the appearance and motion features of the mid-level patches and the binary motion cooperation relationships between adjacent patches in the graph. Extensive experiments on the MPII cooking database demonstrate that the proposed method improves fine-grained action recognition results.
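
The first two steps of this pipeline can be illustrated with a rough sketch. The abstract does not say how the threshold is derived from the saliency histogram or how regions are weighted during merging, so everything below is an assumption: Otsu's method stands in for the histogram-based threshold, box area stands in for the merge weight, and the function names `otsu_threshold` and `huffman_proposals` are hypothetical. It is meant only to show the general shape of the idea, not the authors' implementation.

```python
import heapq
from itertools import count


def otsu_threshold(saliency, bins=32):
    """Pick a threshold from the histogram of saliency values in [0, 1].

    Assumption: Otsu's method is used as a stand-in; the abstract does
    not state which histogram rule the authors apply.
    """
    hist = [0] * bins
    for s in saliency:
        hist[min(int(s * bins), bins - 1)] += 1

    total = len(saliency)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_bin, best_var = 0, -1.0
    w0 = sum0 = 0
    for i, h in enumerate(hist):
        w0 += h                      # pixels at or below bin i
        sum0 += i * h
        w1 = total - w0              # pixels above bin i
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:   # keep the first maximiser on ties
            best_var, best_bin = var_between, i
    # Values falling in bins above best_bin are treated as salient.
    return (best_bin + 1) / bins


def huffman_proposals(boxes):
    """Merge boxes bottom-up, Huffman style, keeping every tree node.

    boxes: list of (x0, y0, x1, y1) tuples.
    Assumption: the merge weight is the box area, and the two lightest
    nodes are merged regardless of spatial adjacency.
    """
    def area(b):
        return (b[2] - b[0]) * (b[3] - b[1])

    tie = count()  # unique tie-breaker for equal areas
    heap = [(area(b), next(tie), b) for b in boxes]
    heapq.heapify(heap)

    proposals = list(boxes)          # leaves are the finest granularity
    while len(heap) > 1:
        _, _, a = heapq.heappop(heap)
        _, _, b = heapq.heappop(heap)
        merged = (min(a[0], b[0]), min(a[1], b[1]),
                  max(a[2], b[2]), max(a[3], b[3]))
        proposals.append(merged)     # each internal node is a coarser proposal
        heapq.heappush(heap, (area(merged), next(tie), merged))
    return proposals
```

With `n` initial regions, the merge loop yields `2n - 1` proposals in total, running from the original fine-grained boxes up to a single box that covers all of them. This is one way to read "multi-granularity" in the abstract.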


We present a semi-supervised co-analysis method for learning 3D shape styles from projected feature lines, achieving style patch localization with only weak supervision. Given a collection of 3D shapes spanning multiple object categories and styles, we perform style co-analysis over the projected feature lines of each 3D shape and then back-project the learned style features onto the 3D shapes. Our core analysis pipeline starts with mid-level patch sampling and pre-selection of candidate style patches. Projective features are then encoded via patch convolution. Multi-view feature integration and style clustering are carried out under the framework of partially shared latent factor (PSLF) learning, a multi-view feature learning scheme. PSLF achieves effective multi-view feature fusion by distilling and exploiting consistent and complementary feature information from multiple views, while also selecting style patches from the candidates. Our style analysis approach supports both unsupervised and semi-supervised analysis. For the latter, our method accepts both user-specified shape labels and style-ranked triplets as clustering constraints. We demonstrate results from 3D shape style analysis and patch localization as well as improvements over state-of-the-art methods. We also present several applications enabled by our style analysis.


Articles published on Mid-level Patches

Fine-Grained Action Recognition by Motion Saliency and Mid-Level Patches

Semi-Supervised Co-Analysis of 3D Shape Styles from Projected Lines

Crowd Tracking by Group Structure Evolution

Dancelets Mining for Video Recommendation Based on Dance Styles

On Branded Handbag Recognition

Contour Detection-Based Discovery of Mid-Level Discriminative Patches for Scene Classification

Learning place-dependant features for long-term vision-based localisation

Scene Text Identification by Leveraging Mid-level Patches and Context Information
