SeeSSD: Computational Storage for Energy-Efficient Real-Time Object Detection
In this work, we present our intelligent SSD, SeeSSD, an energy-efficient computational SSD for a real-time object detection system. SeeSSD embeds an FPGA-based CNN processing engine and firmware that performs the convolutional operations on the target image. SeeSSD processes image data at the storage device before sending it to the host, which reduces the amount of data transferred, lowers the data movement overhead, and thus cuts transfer time and saves power. Using our SeeSSD system and YOLO_Embed, an object detection neural network model, we outperform YOLO-Lite, the fastest YOLO model for embedded controllers, in performance, accuracy, and energy efficiency. YOLO (You Only Look Once) models are a family of one-stage object detection networks that have become very popular due to their speed and high accuracy. The contributions of this work include designing and implementing the SeeSSD system with a lightweight object detection model, YOLO_Embed, to reduce data movement overhead, perform real-time inference, and lower overall power consumption. We implemented the entire software stack associated with the SeeSSD system: an on-device CNN acceleration engine implemented on FPGA, an object identification interface for SeeSSD using YOLO_Embed, and an embedded software layer in SeeSSD for on-device convolutional processing. We evaluated YOLO_Embed's accuracy on object detection benchmarks such as PASCAL VOC 2012, where it achieved 38.1% mAP (mean Average Precision). Our system performed inference in 0.21 seconds while reducing power consumption by approximately 1.2× and 1.4× relative to CPU-only and CPU+GPU systems, respectively. We also reduced the data movement overhead by 24× for a single target image.
- Research Article
- 10.11591/ijece.v16i1.pp450-462
- Feb 1, 2026
- International Journal of Electrical and Computer Engineering (IJECE)
Object detection in images or videos faces several challenges because the detection must be accurate, efficient, and fast. The you only look once (YOLO) algorithm was invented to meet these criteria, but with the creation of several versions of this algorithm (from V1 to V11), it has become difficult for researchers to choose the best one. The main objective of this review is to present and compare the eleven versions of the YOLO algorithm in order to determine which one is appropriate for a given study. The methodology used for this work is aligned with the preferred reporting items for systematic reviews and meta-analyses (PRISMA) principles, and the results demonstrate that the choice of the best version mainly depends on the priorities of the study. If the study prioritizes accuracy and detection of small objects, it should use YOLO V4, YOLO V5, YOLO V6, YOLO V7, YOLO V8, YOLO V9, YOLO V10, or YOLO V11. Studies that prioritize detection speed should use YOLO V5, YOLO V6, YOLO V7, YOLO V8, YOLO V10, or YOLO V11. In complex environments, researchers should avoid YOLO V1, YOLO V2, YOLO V3, YOLO V5, YOLO V7, and YOLO V9. Researchers who are looking for good accuracy and speed with a reduced number of parameters should use YOLO V10 or YOLO V11.
- Research Article
- 10.52167/1609-1817-2024-134-5-239-246
- Oct 1, 2024
- Вестник КазАТК
In this study, we implemented YOLO (You Only Look Once) for real-time object detection and evaluated its performance based on key metrics such as processing speed, frame rate, and object detection accuracy. Our approach emphasizes both the precision and efficiency of YOLO, focusing on its ability to detect objects in real-world scenarios while maintaining a low computational cost. To identify and count objects, the YOLO algorithm was applied to analyze three images. It divided each image into a grid, and each cell predicted bounding boxes and confidence scores for potential objects. After these predictions were filtered with non-max suppression to remove duplicates, each image yielded an accurate count of the detected objects. The model achieved a total processing time of 17.68 seconds, with an average of 0.25 seconds per frame, demonstrating the system's capability for rapid object detection in near real-time applications. On average, 1.32 objects were detected per frame, with a maximum of 1.67 objects in a single frame and a minimum of 1 object per frame, indicating consistent detection across the dataset. The standard deviation of objects per frame (0.113) shows low variability in object detection rates, reflecting the robustness of the model in handling diverse input frames. The achieved frame rate of 4.2 FPS demonstrates the model's potential for real-time applications, particularly in environments where processing speed is critical. The scientific novelty of this work lies in demonstrating YOLO's adaptability for efficient object detection while maintaining high detection rates and consistent performance across varying scenarios. This study contributes to the field by showcasing YOLO's applicability in real-time systems, where object detection speed and accuracy are paramount.
Our findings provide a foundation for further optimization in high-performance, low-latency object detection tasks, as well as its scalability for more complex detection systems. The results underscore YOLO's potential in both academic and industrial settings.
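The pipeline this abstract describes, grid-cell predictions filtered by confidence and then de-duplicated with non-max suppression, can be sketched in plain Python. The box format, confidence threshold, and IoU threshold below are illustrative assumptions, not the study's settings.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def non_max_suppression(detections, conf_thresh=0.5, iou_thresh=0.45):
    """detections: list of (box, confidence) pairs. Visit boxes in order
    of descending confidence, dropping any that overlap a kept box."""
    candidates = sorted((d for d in detections if d[1] >= conf_thresh),
                        key=lambda d: d[1], reverse=True)
    kept = []
    for box, conf in candidates:
        if all(iou(box, k) < iou_thresh for k, _ in kept):
            kept.append((box, conf))
    return kept

# Two overlapping predictions of one object, plus one separate object:
dets = [((10, 10, 50, 50), 0.9),
        ((12, 12, 52, 52), 0.8),
        ((100, 100, 140, 140), 0.7)]
print(non_max_suppression(dets))  # the 0.8 duplicate is suppressed
```

Counting the surviving boxes per frame then gives the per-frame object counts the study reports.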
- Research Article
- 10.1007/s11042-021-11480-0
- Sep 18, 2021
- Multimedia Tools and Applications
You only look once (YOLO) is the most popular object detection software in many intelligent video applications due to its ease of use and high object detection precision. In addition, in recent years, various intelligent vision systems based on high-performance embedded systems have been developed. Nevertheless, YOLO still requires high-end hardware for successful real-time object detection. In this paper, we first discuss real-time object detection service with YOLO on AI embedded systems with resource constraints. In particular, we point out the problems related to real-time processing in YOLO object detection associated with network cameras, and then propose a novel YOLO architecture with adaptive frame control (AFC) that can efficiently cope with these problems. Through various experiments, we show that the proposed AFC can maintain the high precision and convenience of YOLO while providing real-time object detection service by minimizing the total service delay, which remains a limitation of pure YOLO.
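The delay problem motivating AFC, frames from a network camera arriving faster than the detector can consume them, is often mitigated by serving only the newest frame. The sketch below shows that generic latest-frame pattern as a point of reference; it is not the paper's AFC design.

```python
from collections import deque

class LatestFrameBuffer:
    """Bounded buffer that always serves the newest frame, so a slow
    detector never works through a stale backlog."""
    def __init__(self):
        self.buf = deque(maxlen=1)  # older frames are silently dropped

    def push(self, frame):
        self.buf.append(frame)

    def pop_latest(self):
        return self.buf.pop() if self.buf else None

buf = LatestFrameBuffer()
for i in range(5):       # the camera produces 5 frames before the detector runs
    buf.push(f"frame-{i}")
print(buf.pop_latest())  # prints "frame-4": only the newest survives
```

Dropping frames this way bounds end-to-end latency at the cost of skipping intermediate frames, which is acceptable when detections are only needed for the current scene.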
- Research Article
- 10.55529/jipirs.34.27.35
- Jul 29, 2023
- Journal of Image Processing and Intelligent Remote Sensing
Object detection algorithms are essential in fields such as artificial intelligence and robotics. In this study, YOLO and its different versions are examined to identify the advantages and limitations of each model, as well as the similarities and differences between versions. Research on improving YOLO (You Only Look Once) and CNNs (Convolutional Neural Networks) for object detection is ongoing. In this paper, each YOLO version is discussed in detail with its advantages, limitations, and performance. The versions YOLO v1, YOLO v2, YOLO v3, YOLO v4, YOLO v5, and YOLO v7 are studied, and YOLO v7 is shown to outperform the other versions of the YOLO algorithm.
- Research Article
- 10.52783/jisem.v10i25s.3927
- Mar 27, 2025
- Journal of Information Systems Engineering and Management
Object detection is a complicated and pivotal task in computer vision, and it has seen tremendous progress with the emergence of deep learning in recent years. Researchers have drastically boosted the effectiveness of object detection and its associated tasks, including classification, localization, and segmentation, by harnessing deep learning models. Object detectors are normally categorized into two groups: two-stage detectors, which employ elaborate architectures to concentrate on selective region proposals, and single-stage detectors, which use simpler architectures to consider all spatial locations for potential objects in a single pass. The evaluation of object detectors predominantly revolves around detection accuracy and inference time. Although two-stage detectors frequently achieve superior accuracy, single-stage detectors like YOLO (You Only Look Once) offer faster inference speeds. The detection accuracy of YOLO has seen huge improvements through various architectural refinements, sometimes even surpassing that of two-stage detectors. YOLO models are widely embraced mainly because of their rapid inference capabilities. For example, while YOLO and Fast-RCNN exhibit detection accuracies of 63.4 and 70, respectively, YOLO's inference time is approximately 300 times faster. The YOLOv5 architecture, incorporating CSPDarknet53 as the backbone, PANet for feature aggregation, and a detection head, enriches feature extraction and fusion, rendering it highly efficient for real-time applications. YOLOv5 introduces numerous enhancements, including improved anchor boxes, advanced data augmentation techniques, and automatic mixed precision training, aimed at optimizing performance. Training typically involves large datasets like COCO or PASCAL VOC, with evaluation carried out using metrics such as mean Average Precision (mAP). 
YOLO's applications are diverse, spanning autonomous vehicles, surveillance, healthcare, and retail, underscoring its versatility. As research progresses, the integration of advanced methodologies and expansion into more intricate environments will further increase YOLO's capabilities, cementing its pivotal role in advancing real-time object detection. In this investigation, we propose an innovative methodology for object detection using YOLOv5 as the backbone algorithm, with a specific focus on real-time car detection, demonstrating a significant contribution to the field of autonomous driving and achieving noteworthy results with a validation mAP of 0.91. Object detection, an essential task in computer vision, plays an important role in numerous domains, including autonomous driving systems, surveillance, and monitoring.
- Research Article
- 10.62762/tetai.2025.654854
- Oct 23, 2025
- ICCK Transactions on Emerging Topics in Artificial Intelligence
Object detection is a fundamental problem in computer vision, with applications spanning self-driving cars, surveillance systems, medical imaging, robotics, and smart cities. Among the myriad of algorithms developed for this task, the You Only Look Once (YOLO) family stands out for its ability to perform real-time and accurate object detection. This article provides a comprehensive analysis of the YOLO algorithm series, from YOLOv1 to YOLOv8, evaluating them across key performance metrics, including precision, recall, mean Average Precision (mAP), frames per second (FPS), and overall effectiveness. Unlike traditional two-stage detectors such as R-CNN, YOLO formulates object detection as a single regression problem: a single pass over the image simultaneously predicts bounding boxes and class probabilities. This end-to-end design enables YOLO to achieve high speed while maintaining a competitive accuracy-efficiency trade-off. We examine architectural innovations across YOLO versions, including batch normalization, anchor boxes, residual blocks, feature pyramid networks, and attention mechanisms, and discuss their impact on performance. Lightweight models (e.g., YOLOv5-Nano, YOLOv8-Small) are explored with a focus on their suitability for mobile and embedded systems, highlighting YOLO’s adaptability to resource-constrained environments. Challenges such as small object detection, occlusion, and domain-specific tuning are also addressed. This article serves as a practical guide for researchers, developers, and practitioners aiming to leverage YOLO for real-world object detection tasks.
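A minimal sketch can make the metrics named above concrete: a prediction counts as a true positive when its IoU with an unmatched ground-truth box reaches a chosen threshold (0.5 here, a common convention rather than a value taken from this article), and precision and recall then follow from the match counts.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def precision_recall(preds, truths, iou_thresh=0.5):
    """Greedy matching: each prediction may claim at most one ground-truth
    box whose IoU meets the threshold; matched boxes are consumed."""
    matched, tp = set(), 0
    for p in preds:
        best, best_iou = None, iou_thresh
        for i, t in enumerate(truths):
            if i not in matched and iou(p, t) >= best_iou:
                best, best_iou = i, iou(p, t)
        if best is not None:
            matched.add(best)
            tp += 1
    return (tp / len(preds) if preds else 0.0,
            tp / len(truths) if truths else 0.0)

# One prediction overlaps a ground-truth box well, one misses entirely:
preds = [(1, 1, 10, 10), (50, 50, 60, 60)]
truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
print(precision_recall(preds, truths))  # (0.5, 0.5)
```

mAP extends this idea by averaging precision over recall levels and IoU thresholds, per class; benchmark suites define the exact protocol.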
- Research Article
- 10.3390/app14062232
- Mar 7, 2024
- Applied Sciences
Object detection is a crucial research topic in the fields of computer vision and artificial intelligence, involving the identification and classification of objects within images. Recent advancements in deep learning technologies, such as YOLO (You Only Look Once), Faster-R-CNN, and SSDs (Single Shot Detectors), have demonstrated high performance in object detection. This study utilizes the YOLOv8 model for real-time object detection in environments requiring fast inference speeds, specifically in CCTV and automotive dashcam scenarios. Experiments were conducted using the 'Multi-Image Identical Situation and Object Identification Data' provided by AI Hub, consisting of multi-image datasets captured in identical situations using CCTV, dashcams, and smartphones. Object detection experiments were performed on three types of multi-image datasets captured in identical situations. Despite the utility of YOLO, its performance on the AI Hub dataset needs improvement. Grounding DINO, a zero-shot object detector with high mAP performance, is therefore employed. While efficient auto-labeling is possible with Grounding DINO, its processing speed is slower than YOLO's, making it unsuitable for real-time object detection scenarios. This study conducts object detection experiments using publicly available labels and utilizes Grounding DINO as a teacher model for auto-labeling. The generated labels are then used to train YOLO as a student model, and performance is compared and analyzed. Experimental results demonstrate that using auto-generated labels for object detection does not degrade performance, and that combining auto-labeling with manual labeling significantly enhances it. Additionally, an analysis of datasets captured by various devices, including CCTV, dashcams, and smartphones, reveals how device type affects recognition accuracy.
Through Grounding DINO, this study proves the efficacy of auto-labeling technology in contributing to efficiency and performance enhancement in the field of object detection, presenting practical applicability.
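The teacher-student auto-labeling loop described above can be sketched abstractly. Here `teacher_predict` is a hypothetical stand-in for Grounding DINO inference, and the confidence threshold is an illustrative assumption; this is a schematic of the idea, not the authors' pipeline.

```python
def auto_label(images, teacher_predict, conf_thresh=0.3):
    """Run the slow zero-shot teacher once, offline, and keep only its
    confident detections as training labels for the fast student model."""
    labels = {}
    for img in images:
        dets = teacher_predict(img)
        labels[img] = [d for d in dets if d["conf"] >= conf_thresh]
    return labels

# Hypothetical teacher that returns one confident and one weak detection:
fake_teacher = lambda img: [{"box": (0, 0, 10, 10), "conf": 0.9},
                            {"box": (5, 5, 8, 8), "conf": 0.1}]
labels = auto_label(["a.jpg", "b.jpg"], fake_teacher)
print(sum(len(v) for v in labels.values()))  # prints 2: weak boxes dropped
```

The resulting label set would then be converted to the student detector's annotation format and used for ordinary supervised training.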
- Research Article
- 10.1007/s10462-025-11253-3
- Jun 11, 2025
- Artificial Intelligence Review
This review systematically examines the progression of the You Only Look Once (YOLO) object detection algorithms from YOLOv1 to the recently unveiled YOLOv12. Employing a reverse chronological analysis, this study traces the advancements introduced by YOLO algorithms, beginning with YOLOv12 and progressing through YOLO11 (or YOLOv11), YOLOv10, YOLOv9, YOLOv8, and earlier versions, to explore each version's contributions to speed, detection accuracy, and computational efficiency in real-time object detection. Additionally, this study reviews alternative versions derived from the YOLO architecture: YOLO-NAS, YOLO-X, YOLO-R, DAMO-YOLO, and Gold-YOLO. Moreover, the study highlights the transformative impact of YOLO models across five critical application areas: autonomous vehicles and traffic safety, healthcare and medical imaging, industrial manufacturing, surveillance and security, and agriculture. By detailing the incremental technological advancements in successive YOLO versions, this review chronicles the evolution of YOLO and discusses the challenges and limitations of each of the earlier versions. The evolution signifies a path towards integrating YOLO with multimodal, context-aware, and Artificial General Intelligence (AGI) systems in the next YOLO decade, promising significant implications for future developments in AI-driven applications.
- Research Article
- 10.47526/2024-3/2524-0080.10
- Sep 28, 2024
- Q A Iasaýı atyndaǵy Halyqaralyq qazaq-túrіk ýnıversıtetіnіń habarlary (fızıka matematıka ınformatıka serııasy)
In this article, we compare and analyze the YOLO (You Only Look Once) method, widely employed for object detection in digital image processing, with the Haar feature-based cascade classifier method implemented using the OpenCV library. YOLO, a deep learning–based approach, excels in real-time object detection and recognition applications. In contrast, the Haar method utilizes a traditional approach to rapidly identify features. However, significant performance differences exist between the two methods. Experimental results and performance analyses demonstrate that YOLO provides high accuracy rates and real-time processing speeds in object detection tasks. The code implementations presented in this study will be valuable to researchers new to digital image processing. Additionally, YOLO has shown high performance on large and complex datasets by leveraging GPU capabilities. Experiments with various YOLO versions (e.g., YOLOv4, YOLOv5, YOLOv7) have established it as one of the most suitable options for real-time applications, particularly due to its low latency and high accuracy.
- Research Article
- 10.11591/eei.v13i2.5698
- Apr 1, 2024
- Bulletin of Electrical Engineering and Informatics
Medical image examination with a deep learning approach is greatly beneficial in the healthcare industry for faster diagnosis and disease monitoring. You only look once (YOLO), a popular deep learning algorithm developed for object detection, is a successful state-of-the-art algorithm in real-time object detection systems. Although YOLO is continuously improving in the object detection area, there are still questions about how different YOLO versions compare in terms of performance. We utilize eight YOLO versions to classify acute myeloid leukaemia (AML) blood cells in image examinations. We acquired the publicly available AML dataset from the cancer imaging archive (TCIA), which consists of expert-labeled single-cell images. Data augmentation techniques are additionally applied to enhance and balance the training images in the dataset. The overall results indicate that all eight YOLO approaches achieve outstanding performance of more than 90% in precision and sensitivity. In comparison, YOLOv4-tiny performs more reliably than the other seven approaches and also achieves the highest AUC score. Therefore, this work can potentially provide a beneficial rapid digital tool for the screening and evaluation of numerous haematological disorders.
- Research Article
- 10.62527/comien.2.3.66
- Sep 27, 2025
- International Journal on Computational Engineering
Face detection is a biometric technology used to identify individuals based on their facial features. However, this technology faces challenges when detecting faces from various angles, which can affect the accuracy and speed of detection. To address this issue, algorithms such as You Only Look Once (YOLO) and the Haar Cascade Classifier have been used for object detection. YOLO is a real-time object detection algorithm, while the Haar Cascade Classifier is a simpler method that uses Haar-like features to detect objects. Several previous studies have tested both algorithms for object detection such as vehicles and crowd counting, with results showing that YOLO offers higher accuracy. This study aims to analyze the performance of YOLO and the Haar Cascade Classifier in detecting faces from various angles. The test results show that YOLO can consistently detect faces with 100% accuracy in all conditions. Meanwhile, the Haar Cascade Classifier also shows high accuracy, but experiences a significant drop at extreme angles of -90°. When the face is smiling in normal lighting, its accuracy is 36.96% for testing on images and 42.81% for testing on videos. Although the Haar Cascade Classifier has faster detection times, YOLO still excels in detection accuracy and consistency. Therefore, the algorithm selection can be tailored to the system's needs—whether prioritizing processing speed or detection accuracy.
- Research Article
- 10.55041/isjem03657
- May 16, 2025
- International Scientific Journal of Engineering and Management
With the rapid growth of urban populations and the increasing number of vehicles on the road, traffic congestion has become a significant problem in metropolitan areas. Conventional traffic control systems, which rely on pre-set signal timers, often fail to address real-time traffic conditions effectively, leading to inefficient traffic flow, increased fuel consumption, and environmental degradation. The paper “Smart Traffic Control System Based on Traffic Density Using YOLO” introduces a novel approach to address this issue by integrating artificial intelligence and computer vision into traffic management. The proposed system utilizes the YOLO (You Only Look Once) object detection algorithm to detect and classify vehicles in real-time from live video feeds captured at intersections. YOLO, known for its speed and accuracy in object detection tasks, calculates the vehicle density in each direction. Based on the detected density, the system dynamically adjusts the green light duration for each lane, ensuring that the direction with higher traffic receives longer green light periods. This method aims to optimize signal switching and minimize overall vehicle wait time. The paper outlines the complete architecture of the system, which includes modules for video input processing, vehicle detection using YOLO, traffic density computation, and adaptive signal timing control. By leveraging deep learning, the system enhances decision-making capabilities without relying on costly hardware like inductive loops or infrared sensors. This review evaluates the effectiveness and innovation of the proposed system, comparing it with traditional and sensor-based models. It highlights the advantages of using YOLO, such as real-time processing, high accuracy, and scalability.
Additionally, the review discusses the system’s limitations, such as dependency on video clarity and environmental conditions, and proposes areas for future enhancement, including integration with cloud computing and edge AI for broader deployment.
Keywords: Smart Traffic Control System, YOLO Object Detection, Traffic Density Estimation, Real-Time Traffic Monitoring, Adaptive Traffic Signal Control, Vehicle Classification (Cars, Bikes, Buses, Trucks, Rickshaws), Intelligent Transportation System.
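The density-to-timing idea in this abstract can be illustrated with a minimal sketch: per-lane vehicle counts, as a YOLO detector would produce, are mapped to green-light durations proportional to each lane's share of the signal cycle, clamped between minimum and maximum durations. All constants here are hypothetical, not values from the paper.

```python
def green_times(counts, cycle=120, min_green=10, max_green=60):
    """Split a fixed signal cycle (seconds) across lanes in proportion to
    their detected vehicle counts, clamped to [min_green, max_green]."""
    total = sum(counts)
    n = len(counts)
    times = []
    for c in counts:
        share = c / total if total else 1 / n  # equal split when no traffic
        times.append(max(min_green, min(max_green, round(cycle * share))))
    return times

# Four approaches with YOLO-style vehicle counts per direction:
print(green_times([12, 3, 20, 5]))  # prints [36, 10, 60, 15]
```

The clamps guarantee that a nearly empty lane still gets a usable minimum green window and that a congested lane cannot monopolize the cycle.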
- Research Article
- 10.55041/ijsrem51169
- Jul 3, 2025
- INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT
Object detection plays a vital role in enabling autonomous vehicles to perceive and interpret their surroundings accurately and in real time. This paper presents the application of the YOLO (You Only Look Once) object detection algorithm within the perception system of self-driving cars. YOLO is a deep learning-based approach that performs object detection as a single regression problem, allowing for high-speed and accurate identification of multiple objects, such as vehicles, pedestrians, traffic lights, and road signs, in a single frame. By integrating YOLO into the vehicle’s vision system, the autonomous system can make timely decisions to ensure safe navigation and obstacle avoidance. This study discusses the architecture of YOLO, its real-time performance capabilities, and its effectiveness in dynamic driving environments. Experimental results demonstrate that YOLO provides a practical solution for real-time object detection in autonomous driving systems, offering a balanced trade-off between speed and accuracy necessary for safe and efficient operation.
- Research Article
- 10.1016/j.compeleceng.2019.05.009
- Jul 1, 2019
- Computers & Electrical Engineering
A real-time object detection algorithm for video
- Research Article
- 10.1016/j.compag.2024.109090
- May 31, 2024
- Computers and Electronics in Agriculture
Agricultural object detection with You Only Look Once (YOLO) Algorithm: A bibliometric and systematic literature review