Abstract
Object detection from images is generally achieved through supervised learning. However, in many real applications, providing instance-level labels remains costly. Weakly supervised approaches have therefore been proposed, and the task is naturally cast as a Multiple Instance Learning (MIL) problem. Traditional MIL methods typically learn discriminative classifiers from positive and negative training bags. Instead, we propose to select more discriminative instances for learning classifiers, in order to further improve detection accuracy. With the candidate set of positive instances, we can also train a Smoothing Latent Support Vector Machine (SLSVM) to finally detect objects from a bag of instances. We observe that object instances of a common category are visually similar and, when characterized by high-dimensional feature representations, approximately lie in a low-dimensional subspace. We therefore propose a formulation that optimizes a labeling variable for each positive image and learns the subspace model by minimizing the rank (via a convex surrogate function) of the coefficient matrix associated with the subspace model. To improve discriminative power, we also promote incoherence between the subspace model and some hard negative instances by using an ε-insensitive loss. For this non-convex problem, we resort to block coordinate descent and the Alternating Direction Method of Multipliers (ADMM) to obtain locally optimal solutions. Promising empirical studies on real data sets demonstrate that our proposed method is superior to state-of-the-art weakly supervised object detection approaches.
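The two building blocks named above, a convex surrogate for rank and an ε-insensitive loss, can be illustrated with a minimal sketch. It assumes the surrogate is the nuclear norm (the abstract does not say which surrogate is used), whose proximal operator is singular value thresholding, a step that ADMM solvers for such problems commonly rely on. The function names, the synthetic data, and the threshold `tau` are hypothetical and are not taken from the paper.

```python
import numpy as np

def singular_value_threshold(M, tau):
    """Proximal operator of tau * (nuclear norm): shrink each singular value by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # small singular values become exactly zero
    return (U * s_shrunk) @ Vt            # rebuild the matrix from the shrunk spectrum

def eps_insensitive(values, eps):
    """Epsilon-insensitive loss: zero inside [-eps, eps], linear outside."""
    return np.maximum(np.abs(values) - eps, 0.0)

# Hypothetical data: 50 feature vectors of dimension 20 lying near a 3-dimensional subspace.
rng = np.random.default_rng(0)
basis = rng.standard_normal((20, 3))
X = basis @ rng.standard_normal((3, 50)) + 0.01 * rng.standard_normal((20, 50))

Z = singular_value_threshold(X, tau=1.0)
rank_X = np.linalg.matrix_rank(X)   # noise makes X numerically full rank
rank_Z = np.linalg.matrix_rank(Z)   # thresholding removes the noise directions

# Inner products with a hypothetical negative instance are penalized only beyond eps.
losses = eps_insensitive(np.array([-0.5, 0.05, 0.3]), eps=0.1)
```

Thresholding the singular values is what lets a convex penalty produce a genuinely low-rank estimate, while the ε-insensitive loss tolerates small correlations with negative instances and penalizes only larger ones.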