Abstract

Due to factors such as fast motion, cluttered backgrounds, and arbitrary variations in object appearance and shape, an effective target representation plays a key role in robust visual tracking. Existing methods often employ bounding boxes for target representation, which are easily polluted by cluttered background noise and may cause drifting when the target undergoes large-scale non-rigid or articulated motions. To address this issue, in this paper, motivated by the spatio-temporal nonlocality of target appearance reoccurrence in a video, we exploit nonlocal information to accurately represent and segment the target, yielding an object likelihood map that regularizes a correlation filter (CF) for visual tracking. Specifically, given a set of tracked target bounding boxes, we first generate a set of superpixels to represent the foreground and background, and then update the appearance of each superpixel with its long-term spatio-temporally nonlocal counterparts. With the updated appearances, we formulate a spatio-temporal graphical model comprising superpixel label consistency potentials. We then generate the segmentation by optimizing the graphical model, iteratively updating the appearance model and estimating the labels. Finally, from the segmentation mask we obtain an object likelihood map that adaptively regularizes CF learning, suppressing cluttered background noise while making full use of long-term stable target appearance information. Extensive evaluations on the OTB50, SegTrack, and YouTube-Objects datasets demonstrate the effectiveness of the proposed method, which performs favorably against state-of-the-art methods.
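
As an illustration of the final step, the sketch below shows one way an object likelihood map could regularize CF learning, in the spirit of spatially regularized correlation filters. The weight mapping w, the trade-off parameters lambda_1 and lambda_2, and the per-pixel likelihood m used here are assumptions for exposition, not the paper's exact objective.

```latex
% Illustrative sketch only (assumed form, not the authors' exact formulation):
% a CF with D feature channels f^d, training sample channels x^d, desired
% response y, and an object likelihood map m(i) in [0,1]. Pixels with low
% likelihood (background clutter) receive a large penalty weight w(i),
% suppressing filter energy outside the segmented target.
\begin{equation*}
  E(f) \;=\; \Big\| \sum_{d=1}^{D} x^{d} \ast f^{d} - y \Big\|^{2}
  \;+\; \sum_{d=1}^{D} \big\| \, w \odot f^{d} \, \big\|^{2},
  \qquad
  w(i) \;=\; \lambda_{1} + \lambda_{2}\,\big(1 - m(i)\big).
\end{equation*}
```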
