Abstract

Detecting saliency in videos is a fundamental step in many computer vision systems. Saliency refers to the significant target(s) in a video; the object of interest is further analyzed for high-level applications. Saliency can be segregated from the background if the two exhibit different visual cues, so saliency detection is often formulated as background subtraction. However, saliency detection is challenging. For instance, a dynamic background can result in false positive errors, while camouflage results in false negative errors. With moving cameras, the captured scenes are even more complicated to handle. We propose a new framework, called saliency detection via background model completion (SD-BMC), that comprises a background modeler and a deep learning background/foreground segmentation network. The background modeler generates an initial clean background image from a short image sequence. Based on the idea of video completion, a good background frame can be synthesized even in the presence of a changing background and moving objects. We adopt a background/foreground segmenter that, although pre-trained on a specific video dataset, can also detect saliency in unseen videos. The background modeler can adjust the background image dynamically when the output of the background/foreground segmenter deteriorates while processing a long video. To the best of our knowledge, our framework is the first to adopt video completion for background modeling and saliency detection in videos captured by moving cameras. The F-measure results obtained on pan-tilt-zoom (PTZ) videos show that our proposed framework outperforms some deep learning-based background subtraction models by 11% or more. On more challenging videos, our framework also outperforms many high-ranking background subtraction methods by more than 3%.
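The abstract describes a feedback loop between the background modeler and the segmenter: an initial background is completed from a short sequence, each frame is segmented against it, and the background is re-completed when the segmentation deteriorates. The sketch below illustrates only that control flow; `complete_background`, `segment`, and the deterioration test are deliberately simple stand-ins (temporal median, thresholded differencing, and a mask-size check), not the video-completion modeler or the deep segmenter used in the paper.

```python
import numpy as np

def complete_background(frames):
    """Toy background modeler: per-pixel temporal median over a short
    image sequence, which suppresses briefly-moving objects."""
    return np.median(np.stack(frames), axis=0)

def segment(frame, background, thresh=25):
    """Toy pixelwise background subtraction: pixels whose absolute
    difference from the background exceeds a threshold are foreground."""
    diff = np.abs(frame.astype(np.float32) - background.astype(np.float32))
    return (diff.max(axis=-1) > thresh).astype(np.uint8)

def sd_bmc_loop(frames, init_len=30, max_fg_ratio=0.5):
    """Sketch of the SD-BMC feedback loop: build an initial background,
    segment each frame, and re-complete the background when the output
    deteriorates (here: an implausibly large foreground mask)."""
    history = list(frames[:init_len])
    background = complete_background(history)
    masks = []
    for frame in frames:
        mask = segment(frame, background)
        history = (history + [frame])[-init_len:]
        # Crude deterioration test standing in for the paper's criterion.
        if mask.mean() > max_fg_ratio:
            background = complete_background(history)
            mask = segment(frame, background)
        masks.append(mask)
    return masks
```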

Highlights

  • Although BSUV-Net 2.0 is enhanced with more training videos of moving cameras, we find that the model still produces many false positive (FP) and false negative (FN) errors

  • In order to evaluate the performance of our framework, we compute eight quantitative measures: Recall, Specificity, False Positive Rate (FPR), False Negative Rate (FNR), Percentage of Wrong Classifications (PWC), F-Measure, Precision, and Matthews Correlation Coefficient (MCC); a computation sketch follows this list

  • We propose a new framework, saliency detection via background model completion (SD-BMC), for the detection of salient regions in each video frame
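All eight measures in the second highlight can be derived from the per-pixel confusion counts (TP, FP, TN, FN) accumulated over a video. The helper below uses the standard definitions (following the CDnet convention for PWC); it is an illustrative aid rather than the authors' evaluation code, and it does not guard against zero denominators.

```python
import math

def evaluation_measures(tp, fp, tn, fn):
    """Compute the eight measures from pixel-level confusion counts."""
    recall      = tp / (tp + fn)                      # a.k.a. sensitivity
    specificity = tn / (tn + fp)
    fpr         = fp / (fp + tn)
    fnr         = fn / (fn + tp)
    pwc         = 100.0 * (fp + fn) / (tp + fp + tn + fn)
    precision   = tp / (tp + fp)
    f_measure   = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"Recall": recall, "Specificity": specificity, "FPR": fpr,
            "FNR": fnr, "PWC": pwc, "F-Measure": f_measure,
            "Precision": precision, "MCC": mcc}
```

For example, `evaluation_measures(tp=900, fp=50, tn=9000, fn=100)` returns all eight values for a single hypothetical video.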



Introduction

With the estimated background scene model, the foreground (i.e., saliency) is segmented by a pixelwise background subtraction algorithm. Background modeling algorithms can estimate a clean background frame even when the image sequence contains moving objects. However, if the foreground objects stay too long, phenomena such as ghosts appear in the background image, and the problem becomes more complicated with video captured by a moving camera. [28] proposed a variational autoencoder (VAE) framework for background estimation from videos recorded by a fixed camera. In our experimentation on videos of moving cameras, we find that existing models still produce many false positive and false negative errors. Our background modeler generates the initial completed background frame; in the last part (d), the modeler fixes the remaining missing pixels with a number of inpainting iterations until there is no missing pixel.
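The last step above, repeatedly inpainting until no missing pixel remains, can be illustrated with a toy diffusion-style filler. The routine below is only one reading of that control flow under simple assumptions (a boolean missing-pixel mask, 4-neighbour averaging, wrap-around borders via `np.roll`); the paper's modeler relies on video completion and is far more capable.

```python
import numpy as np

def fill_missing_pixels(background, missing):
    """Iteratively fill holes in a background image until no missing
    pixel remains. Each pass fills missing pixels that touch at least
    one known pixel with the mean of their known 4-neighbours.

    background: HxWx3 image with holes.
    missing:    HxW boolean mask, True where the pixel is unknown.
    """
    bg = background.astype(np.float32)
    missing = missing.astype(bool).copy()
    while missing.any():
        known = ~missing
        nsum = np.zeros_like(bg)                    # sum of known neighbours
        ncnt = np.zeros(missing.shape, np.float32)  # count of known neighbours
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            k = np.roll(known, (dy, dx), axis=(0, 1))  # wraps at borders (toy)
            v = np.roll(bg, (dy, dx), axis=(0, 1))
            nsum += v * k[..., None]
            ncnt += k
        fillable = missing & (ncnt > 0)
        if not fillable.any():   # no known pixel to propagate from
            break
        bg[fillable] = nsum[fillable] / ncnt[fillable][:, None]
        missing &= ~fillable     # filled pixels become known next pass
    return bg.astype(background.dtype)
```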

