MVDet: multi-view multi-class object detection without ground plane assumption

Sola Park, Seungjin Yang, Hyuk-Jae Lee

doi:10.1007/s10044-023-01168-6

Open Access

https://doi.org/10.1007/s10044-023-01168-6

Journal: Pattern Analysis and Applications	Publication Date: Jun 13, 2023
Citations: 1	License type: CC BY 4.0

Affiliation: Seoul National University

Abstract

Although many state-of-the-art methods for single-image object detection have achieved great success in recent years, they still suffer from false positives in crowded scenes in real-world applications such as automatic checkout. To address the limitations of single-view object detection in complex scenes, we propose MVDet, an end-to-end learnable approach that detects and re-identifies multi-class objects in multiple images captured by multiple cameras (multi-view). Our approach is based on the premise that, when multi-view images are available, incorrect detections in one view can be eliminated using precise cues from the other views. Unlike most existing multi-view detection algorithms, which assume that objects belong to a single class and lie on the ground plane, our approach classifies multi-class objects without such assumptions and is thus more practical. To classify multi-class objects, we propose an integrated architecture for region proposal, re-identification, and classification. Additionally, we use the epipolar geometry constraint to devise a novel re-identification algorithm that does not require a ground plane assumption. Our model demonstrates competitive performance compared to several baselines on the challenging MessyTable dataset.
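
To give a feel for the epipolar geometry constraint mentioned in the abstract, the following is a minimal sketch of cross-view matching by point-to-epipolar-line distance. It is not the authors' re-identification algorithm: the function names `epipolar_distance` and `match_by_epipolar_distance`, the greedy nearest-line matching, the pixel threshold `max_dist`, and the toy fundamental matrix `F` (two identical cameras separated by a purely horizontal translation, so epipolar lines are horizontal) are all assumptions made for illustration.

```python
import numpy as np


def epipolar_distance(F, p1, p2):
    """Distance from point p2 (view 2) to the epipolar line of p1 (view 1).

    F is a 3x3 fundamental matrix satisfying x2^T F x1 = 0 for true matches.
    """
    x1 = np.array([p1[0], p1[1], 1.0])
    a, b, c = F @ x1  # epipolar line a*x + b*y + c = 0 in view 2
    return abs(a * p2[0] + b * p2[1] + c) / np.hypot(a, b)


def match_by_epipolar_distance(F, centers1, centers2, max_dist=5.0):
    """Hypothetical greedy matcher: pair each view-1 center with the view-2
    center closest to its epipolar line, if within max_dist pixels."""
    matches = []
    if len(centers2) == 0:
        return matches
    for i, p1 in enumerate(centers1):
        dists = [epipolar_distance(F, p1, p2) for p2 in centers2]
        j = int(np.argmin(dists))
        if dists[j] <= max_dist:
            matches.append((i, j))
    return matches


# Toy fundamental matrix (assumption): identity intrinsics, no rotation,
# translation along the x axis, so matching points share the same y.
F = np.array([
    [0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
])

centers_view1 = [(10.0, 20.0), (30.0, 80.0)]
centers_view2 = [(60.0, 81.0), (55.0, 19.0)]
matches = match_by_epipolar_distance(F, centers_view1, centers_view2)
```

Because the constraint only restricts a match to a line, it needs no ground plane, but it can also be ambiguous when several objects lie near the same epipolar line; a practical system would combine it with appearance cues.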

Full Text