Abstract
In this paper we consider the motion segmentation problem on sparse and unstructured datasets involving rigid motions, motivated by multibody structure from motion. In particular, we assume only two-frame correspondences as input, without prior knowledge about trajectories. Inspired by the success of synchronization methods, we address this problem with a two-stage approach: first, motion segmentation is performed on image pairs independently; then, the two-frame results are combined in a robust way to compute the final multi-frame segmentation. Our synthetic and real experiments demonstrate that the proposed approach is very effective at reducing the errors in two-frame results and can cope with a large number of mismatches. Moreover, our method can be profitably used to build a multibody structure from motion pipeline.
Highlights
Motion segmentation is a fundamental topic in the Computer Vision and Robotics communities (Mattheus et al 2020), which is relevant in a variety of applications ranging from 3D reconstruction (Saputra et al 2018) to autonomous driving (Sabzevari and Scaramuzza 2016).
Trajectories can eventually be computed after motion segmentation: in this way we can focus on each moving object separately, exploiting single-body tools, resulting in more precise trajectories. This scenario will be analyzed in Section 6.5.2, where we show how to apply our framework to multibody structure from motion.
The considerations made for the Hopkins datasets apply well to the MTPV62 benchmark: it is worth noting that our approach works under weaker assumptions than the best performing methods, being designed for motion segmentation with two-frame correspondences.
Summary
Motion segmentation is a fundamental topic in the Computer Vision and Robotics communities (Mattheus et al 2020), which is relevant in a variety of applications ranging from 3D reconstruction (Saputra et al 2018) to autonomous driving (Sabzevari and Scaramuzza 2016). Some methods produce a dense segmentation (Keuper et al 2015; Bideau and Learned-Miller 2016; Keuper 2017; Bideau et al 2018; Keuper et al 2020); other methods, instead, work with a sparse input (e.g., sparse key-points) and produce a sparse segmentation as output (Vidal et al 2005; Li et al 2013; Ji et al 2014; Xu et al 2018; Arrigoni and Pajdla 2019a). The former are referred to as “video object segmentation” by some authors as they make use of temporal continuity between consecutive frames within a video, and they will be discussed in Sect.