Abstract

In this paper, we propose Vanishing Point TRansformer (VPTR), a novel transformer-based, end-to-end, real-time vanishing point detection method. The method directly regresses the locations of vanishing points from a given image. To achieve this, we pose vanishing point detection as a point object detection task on the Gaussian hemisphere with region division. Because low-level features generally provide more geometric information, which contributes to accurate vanishing point prediction, we propose a straightforward architecture in which the vanishing point queries in the decoder directly gather multi-level features from the CNN backbone through deformable attention. Our method relies on neither line detection nor the Manhattan world assumption, which makes it more flexible to use. VPTR runs at an inference speed of 140 FPS on a single NVIDIA 3090 GPU. Experimental results on synthetic and real-world datasets demonstrate that our method applies to both natural and structural scenes and achieves a better balance of accuracy and efficiency than other state-of-the-art methods.
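To illustrate the Gaussian hemisphere representation mentioned above, the following is a minimal sketch of the standard pinhole mapping between an image point and a unit vector on the upper hemisphere. The abstract does not specify VPTR's exact parameterization or its region division scheme, so the function names and the parameters `cx`, `cy`, and `focal` (principal point and focal length in pixels) are assumptions made for illustration only.

```python
import math


def direction_to_hemisphere(d):
    """Normalize a 3D direction and flip it onto the upper hemisphere (z >= 0)."""
    dx, dy, dz = d
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    if norm == 0.0:
        raise ValueError("zero-length direction")
    v = (dx / norm, dy / norm, dz / norm)
    # A direction and its negation describe the same vanishing point,
    # so keeping only z >= 0 removes the ambiguity.
    if v[2] < 0:
        v = (-v[0], -v[1], -v[2])
    return v


def image_point_to_hemisphere(x, y, cx, cy, focal):
    """Map an image point to the Gaussian hemisphere (assumed pinhole camera)."""
    return direction_to_hemisphere((x - cx, y - cy, focal))


def hemisphere_to_image_point(v, cx, cy, focal, eps=1e-9):
    """Project a hemisphere point back to the image; None if it lies at infinity."""
    vx, vy, vz = v
    if abs(vz) < eps:
        return None  # equator of the hemisphere: vanishing point at infinity
    return (cx + focal * vx / vz, cy + focal * vy / vz)


if __name__ == "__main__":
    v = image_point_to_hemisphere(900.0, 240.0, cx=320.0, cy=240.0, focal=500.0)
    print(v)
    print(hemisphere_to_image_point(v, cx=320.0, cy=240.0, focal=500.0))
```

One reason this representation suits a regression target is that vanishing points far outside the image, including those at infinity, map to bounded coordinates near the equator of the hemisphere rather than to unbounded pixel positions.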
