Teacher–Student Model Using Grounding DINO and You Only Look Once for Multi-Sensor-Based Object Detection

Jinhwan Son, Heechul Jung

doi:10.3390/app14062232

Abstract

Object detection is a crucial research topic in computer vision and artificial intelligence, involving the identification and classification of objects within images. Recent deep learning models such as YOLO (You Only Look Once), Faster R-CNN, and SSD (Single Shot Detector) have demonstrated high performance in object detection. This study uses the YOLOv8 model for real-time object detection in environments that require fast inference, specifically CCTV and automotive dashcam scenarios. Experiments were conducted on the ‘Multi-Image Identical Situation and Object Identification Data’ provided by AI Hub, which consists of three types of multi-image datasets captured in identical situations using CCTV, dashcams, and smartphones. Despite the utility of YOLO, its performance on the AI Hub dataset leaves room for improvement. To address this, Grounding DINO, a zero-shot object detector with high mAP performance, is employed. Grounding DINO makes efficient auto-labeling possible, but its processing speed is slower than that of YOLO, which makes it unsuitable for real-time object detection. This study therefore conducts object detection experiments using the publicly available labels and also uses Grounding DINO as a teacher model for auto-labeling. The generated labels are then used to train YOLO as a student model, and the resulting performance is compared and analyzed. Experimental results demonstrate that using auto-generated labels for object detection does not degrade performance, and that combining auto-labeling with manual labeling significantly enhances performance. Additionally, an analysis of datasets containing data from various devices, including CCTV, dashcams, and smartphones, reveals how the device type of the data affects recognition accuracy on each device. Through Grounding DINO, this study demonstrates that auto-labeling contributes to both efficiency and performance in object detection and has practical applicability.
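
The label hand-off from teacher to student described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the teacher's predictions are already available as pixel-space corner boxes with a class index and a confidence score, and it converts them into the normalized `class x_center y_center width height` text lines that YOLO training expects. The `Detection` type, the `to_yolo_lines` helper, and the 0.35 confidence threshold are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """One teacher prediction (hypothetical container, not from the paper).

    Box corners are in pixels: (x1, y1) top-left, (x2, y2) bottom-right.
    """
    x1: float
    y1: float
    x2: float
    y2: float
    class_id: int
    score: float


def to_yolo_lines(dets: List[Detection], img_w: int, img_h: int,
                  min_score: float = 0.35) -> List[str]:
    """Convert teacher detections into YOLO-format label lines.

    min_score is an assumed threshold chosen for illustration only.
    """
    lines = []
    for d in dets:
        # Drop low-confidence teacher predictions.
        if d.score < min_score:
            continue
        # Clamp the box to the image bounds.
        x1 = min(max(d.x1, 0.0), float(img_w))
        y1 = min(max(d.y1, 0.0), float(img_h))
        x2 = min(max(d.x2, 0.0), float(img_w))
        y2 = min(max(d.y2, 0.0), float(img_h))
        # Skip boxes that are empty after clamping.
        if x2 <= x1 or y2 <= y1:
            continue
        # Normalize the center and size to the [0, 1] range.
        cx = (x1 + x2) / 2.0 / img_w
        cy = (y1 + y2) / 2.0 / img_h
        w = (x2 - x1) / img_w
        h = (y2 - y1) / img_h
        lines.append(f"{d.class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines


if __name__ == "__main__":
    example = [
        Detection(100, 50, 300, 250, class_id=0, score=0.9),
        Detection(0, 0, 10, 10, class_id=1, score=0.1),  # filtered out
    ]
    for line in to_yolo_lines(example, img_w=400, img_h=500):
        print(line)
```

In a pipeline of this kind, the lines for each image would be written to a text file with the same base name as the image, and the student model would be trained on those files in place of, or alongside, the manual labels.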


Journal: Applied Sciences
Publication Date: Mar 7, 2024
License type: CC BY 4.0

