Abstract

This paper proposes a weakly supervised approach to three-dimensional object detection that makes it possible to train a strong three-dimensional detector from position-level annotations (i.e. annotations of object centres). To compensate for the information lost in going from box annotations to object centres, the proposed method, named Corporeality (abbreviated as BR in figures and tables), uses synthetic three-dimensional shapes to convert the weak labels into fully annotated virtual scenes that serve as stronger supervision, and in turn uses these perfect virtual labels to complement and refine the original labels. Three-dimensional shapes are first assembled into physically reasonable virtual scenes according to the coarse scene layout extracted from the position-level annotations. We then go back to reality by applying a virtual-to-real domain adaptation function, which refines the weak labels and additionally supervises the training of the three-dimensional detector with the virtual scenes. This paper further proposes a more challenging benchmark for three-dimensional object detection, with more diverse object sizes, to better demonstrate the potential of Corporeality. With only 5% of the labelling labour, Corporeality achieves performance competitive with some popular fully supervised approaches on the widely used ScanNet dataset. Code is available at: https://github.com/mondalbidisha/Corporeality-BR.
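As a rough illustration of the scene-assembly step described above, the following minimal sketch places one template shape at each annotated centre and reads off a full box label for it. It is not the released implementation: the names `CentreLabel`, `Box3D`, `SHAPE_SIZES`, `build_virtual_scene` and `overlaps_xy` are hypothetical, the template sizes are made-up values, and "physically reasonable" is reduced here to resting each shape on an assumed flat floor and checking footprints for overlap.

```python
from dataclasses import dataclass


@dataclass
class CentreLabel:
    """A weak, position-level annotation: category plus object centre."""
    category: str
    x: float
    y: float
    z: float


@dataclass
class Box3D:
    """A full axis-aligned box label: centre plus width, depth, height."""
    category: str
    cx: float
    cy: float
    cz: float
    w: float
    d: float
    h: float


# Hypothetical template sizes (width, depth, height) in metres.
# A real shape library would supply meshes; only their extents are used here.
SHAPE_SIZES = {
    "chair": (0.5, 0.5, 0.9),
    "table": (1.2, 0.8, 0.75),
}


def build_virtual_scene(labels, floor_z=0.0):
    """Place one template shape per centre label and return its box label.

    The horizontal position comes from the weak label; the vertical position
    is chosen so that the shape rests on the floor (an assumed simplification).
    """
    boxes = []
    for lab in labels:
        w, d, h = SHAPE_SIZES[lab.category]
        boxes.append(Box3D(lab.category, lab.x, lab.y, floor_z + h / 2, w, d, h))
    return boxes


def overlaps_xy(a, b):
    """Return True if the floor footprints of two boxes intersect."""
    return (abs(a.cx - b.cx) < (a.w + b.w) / 2
            and abs(a.cy - b.cy) < (a.d + b.d) / 2)


labels = [CentreLabel("chair", 1.0, 2.0, 0.4), CentreLabel("table", 3.0, 2.0, 0.4)]
scene = build_virtual_scene(labels)
collision_free = not overlaps_xy(scene[0], scene[1])
```

Because every virtual object is placed by construction, its box label is known exactly, which is what lets the virtual scenes act as stronger supervision than the centre-only labels they were built from.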
