LEOD-Net: Learning Line-Encoded Bounding Boxes for Real-Time Object Detection

Hatem Ibrahem, Ahmed Salem, Hyun-Soo Kang

doi:10.3390/s22103699

Open Access

Journal: Sensors
Publication Date: May 12, 2022
Citations: 3
License type: CC BY 4.0

Affiliation: Chungbuk National University, Assiut University

Abstract

This paper proposes a learnable line-encoding technique for the bounding boxes commonly used in object detection. Each bounding box is encoded using two main points, its top-left and bottom-right corners. A lightweight convolutional neural network (CNN) then learns the lines and predicts a high-resolution line mask for each object category using a pixel-shuffle operation. Post-processing based on a progressive probabilistic Hough transform filters the predicted line masks and estimates clean lines from them. The proposed method was trained and evaluated on two common object detection benchmarks: Pascal VOC2007 and MS-COCO2017. The proposed model attains high mean average precision (mAP) values (78.8% for VOC2007 and 48.1% for COCO2017) while processing each frame in tens of milliseconds (37 ms for Pascal VOC and 47 ms for COCO). The strength of the proposed method lies in its simplicity and ease of implementation, in contrast to recent state-of-the-art object detection methods, which rely on complex processing pipelines.
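The line-encoding idea can be illustrated with a small round trip. The sketch below is not the authors' implementation. It assumes the encoded line is the straight segment joining the top-left and bottom-right corners, and it substitutes a simple extreme-point read for the paper's Hough-based post-processing, which is only reasonable for a clean mask containing a single line. The function names `encode_box_as_line` and `decode_line_to_box` are hypothetical.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in integer pixel coordinates.
Box = Tuple[int, int, int, int]


def encode_box_as_line(box: Box, height: int, width: int) -> List[List[int]]:
    """Rasterize the corner-to-corner segment of a box into a binary mask.

    Assumption: the line target is the segment from the top-left corner
    to the bottom-right corner, drawn one pixel wide.
    """
    x0, y0, x1, y1 = box
    mask = [[0] * width for _ in range(height)]
    # Sample once per pixel along the longer axis so the line has no gaps.
    steps = max(x1 - x0, y1 - y0, 1)
    for i in range(steps + 1):
        t = i / steps
        x = round(x0 + t * (x1 - x0))
        y = round(y0 + t * (y1 - y0))
        if 0 <= x < width and 0 <= y < height:
            mask[y][x] = 1
    return mask


def decode_line_to_box(mask: List[List[int]]) -> Box:
    """Recover a box from a mask that holds exactly one clean line.

    This reads the extreme coordinates of the set pixels. It stands in for
    the Hough-based line extraction described in the abstract and would not
    cope with noisy masks or several lines in one mask.
    """
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not points:
        raise ValueError("mask contains no line pixels")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    box = (2, 1, 9, 6)
    mask = encode_box_as_line(box, height=8, width=12)
    for row in mask:
        print("".join("#" if v else "." for v in row))
    print(decode_line_to_box(mask))
```

Because the two segment endpoints are always rasterized exactly, decoding a clean single-line mask returns the original corners. Predicted masks from a network are noisy, which is why a robust line detector is needed in practice.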

Full Text