Deep learning for object recognition: A comprehensive review of models and algorithms


Similar Papers
  • Research Article
  • Citations: 1
  • 10.3390/app151810255
Early Fire and Smoke Detection Using Deep Learning: A Comprehensive Review of Models, Datasets, and Challenges
  • Sep 20, 2025
  • Applied Sciences
  • Abdussalam Elhanashi + 3 more

The early detection of fire and smoke is essential for mitigating human casualties, property damage, and environmental impact. Traditional sensor-based and vision-based detection systems frequently exhibit high false alarm rates, delayed response times, and limited adaptability in complex or dynamic environments. Recent advances in deep learning and computer vision have enabled more accurate, real-time detection through the automated analysis of flame and smoke patterns. This paper presents a comprehensive review of deep learning techniques for fire and smoke detection, with a particular focus on convolutional neural networks (CNNs), object detection frameworks such as YOLO and Faster R-CNN, and spatiotemporal models for video-based analysis. We examine the benefits of these approaches in terms of improved accuracy, robustness, and deployment feasibility on resource-constrained platforms. Furthermore, we discuss current limitations, including the scarcity and diversity of annotated datasets, susceptibility to false alarms, and challenges in generalization across varying scenarios. Finally, we outline promising research directions, including multimodal sensor fusion, lightweight edge AI implementations, and the development of explainable deep learning models. By synthesizing recent advancements and identifying persistent challenges, this review provides a structured foundation for the design of next-generation intelligent fire detection systems.

  • Supplementary Content
  • 10.1016/j.omtn.2025.102691
Advancing CRISPR with deep learning: A comprehensive review of models and databases
  • Aug 20, 2025
  • Molecular Therapy. Nucleic Acids
  • Roghayyeh Alipanahi + 2 more


  • Supplementary Content
  • Citations: 1
  • 10.1016/j.csbj.2025.12.033
Machine learning for drug-target interaction prediction: A comprehensive review of models, challenges, and computational strategies
  • Jan 1, 2026
  • Computational and Structural Biotechnology Journal
  • Bilal Ahmad + 2 more


  • Research Article
  • Citations: 78
  • 10.1561/2000000071
Deep Learning in Object Recognition, Detection, and Segmentation
  • Jan 1, 2016
  • Foundations and Trends® in Signal Processing
  • Xiaogang Wang

As a major breakthrough in artificial intelligence, deep learning has achieved very impressive success in solving grand challenges in many fields including speech recognition, natural language processing, computer vision, image and video processing, and multimedia. This article provides a historical overview of deep learning and focuses on its applications in object recognition, detection, and segmentation, which are key challenges of computer vision and have numerous applications to images and videos. The discussed research topics on object recognition include image classification on ImageNet, face recognition, and video classification. The detection part covers general object detection on ImageNet, pedestrian detection, face landmark detection (face alignment), and human landmark detection (pose estimation). On the segmentation side, the article discusses the most recent progress on scene labeling, semantic segmentation, face parsing, human parsing, and saliency detection. Object recognition is considered as whole-image classification, while detection and segmentation are pixelwise classification tasks. Their fundamental differences will be discussed in this article. Fully convolutional neural networks and highly efficient forward and backward propagation algorithms specially designed for pixelwise classification tasks will be introduced. The covered application domains are also highly diverse. Human and face images have regular structures, while general object and scene images have much more complex variations in geometric structures and layout. Videos include the temporal dimension. Therefore, they need to be processed with different deep models. All the selected domain applications have received tremendous attention in the computer vision and multimedia communities. Through concrete examples of these applications, we explain the key points which make deep learning outperform conventional computer vision systems.
(1) Different from traditional pattern recognition systems, which heavily rely on manually designed features, deep learning automatically learns hierarchical feature representations from massive training data and disentangles hidden factors of input data through multi-level nonlinear mappings. (2) Different from existing pattern recognition systems, which sequentially design or train their key components, deep learning is able to jointly optimize all the components and create synergy through close interactions among them. (3) While most machine learning models can be approximated with neural networks with shallow structures, for some tasks, the expressive power of deep models increases exponentially as their architectures go deep. Deep models are especially good at learning global contextual feature representations with their deep structures. (4) Benefiting from the large learning capacity of deep models, some classical computer vision challenges can be recast as high-dimensional data transform problems and can be solved from new perspectives. Finally, some open questions and future work regarding deep learning in object recognition, detection, and segmentation will be discussed.

  • Research Article
  • 10.47941/ijce.3153
Effectiveness of Deep Learning in Object Recognition for Autonomous Vehicles in Japan
  • Aug 14, 2024
  • International Journal of Computing and Engineering
  • Emi Watanabe

Purpose: The aim of the study was to analyze the effectiveness of deep learning in object recognition for autonomous vehicles in Japan. Methodology: This study adopted a desk methodology. A desk study research design is commonly known as secondary data collection, that is, collecting data from existing resources, which is preferred for its low cost compared with field research. The study examined already published studies and reports, as the data were easily accessed through online journals and libraries. Findings: Deep learning has proven effective in object recognition for autonomous vehicles in Japan, particularly through advanced models like CNNs and YOLO. These technologies show high accuracy in detecting pedestrians, vehicles, and other objects, even in complex environments. Integrating sensor fusion (LiDAR, radar, cameras) enhances reliability in crowded urban areas. However, challenges remain, including data annotation, real-world conditions like narrow streets, and regulatory concerns. Despite these, ongoing advancements and collaborations suggest promising prospects for the future of autonomous vehicles in Japan. Unique Contribution to Theory, Practice and Policy: The technology acceptance model (TAM), diffusion of innovations (DOI) theory, and systems theory may be used to anchor future studies on the effectiveness of deep learning in object recognition for autonomous vehicles in Japan. Practitioners should focus on improving the quality of labeled data for training purposes and employing transfer learning techniques to make models more adaptable to various situations. From a policy perspective, governments should establish clear safety standards and guidelines for the deployment of deep learning-based object recognition systems in autonomous vehicles.

  • Research Article
  • Citations: 1
  • 10.1016/j.atech.2025.101400
Advances in UAV-based deep learning for cassava disease monitoring and detection: A comprehensive review of models, imaging techniques, and agricultural applications
  • Dec 1, 2025
  • Smart Agricultural Technology
  • Wasiu Akande Ahmed + 3 more


  • Research Article
  • 10.1016/j.cie.2025.111725
Deep learning approaches for weld defect detection: A comprehensive review of models, applications, and future directions
  • Feb 1, 2026
  • Computers & Industrial Engineering
  • Berkay Eren


  • Research Article
  • Citations: 2
  • 10.70937/jnes.v1i01.41
Machine Learning And Artificial Intelligence in Diabetes Prediction And Management: A Comprehensive Review of Models
  • Dec 17, 2024
  • Innovatech Engineering Journal
  • Md Ashraful Alam + 3 more

Diabetes mellitus is a chronic metabolic disorder with significant global prevalence and associated healthcare burdens, necessitating early detection and effective management strategies. The integration of Machine Learning (ML) and Artificial Intelligence (AI) has revolutionized diabetes care, offering innovative approaches to prediction, monitoring, and personalized management. This study conducted a systematic review of 82 high-quality peer-reviewed articles, following the PRISMA guidelines, to provide a comprehensive evaluation of ML and AI applications in diabetes prediction and management. The review highlights the widespread adoption of supervised learning models, such as Random Forest and Support Vector Machines (SVM), which consistently demonstrate high accuracy and reliability in predicting diabetes risk. Ensemble learning methods, particularly Gradient Boosting, emerged as superior techniques for predictive performance, while deep learning models, including Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), proved effective in analyzing unstructured data such as medical images and time-series glucose data. The integration of AI into wearable devices and mobile health applications has further enhanced real-time monitoring and glycemic control, bridging the gap between technological advancements and practical healthcare solutions. Despite these advancements, challenges such as data imbalance, limited external validation, and the need for explainable AI frameworks persist, underscoring the necessity for methodological rigor and standardization. This review provides critical insights into the current state, limitations, and opportunities of ML and AI in diabetes care, emphasizing their transformative potential in addressing this global health challenge.

  • Research Article
  • Citations: 62
  • 10.1016/j.psyneuen.2016.11.019
Rapid effects of dorsal hippocampal G-protein coupled estrogen receptor on learning in female mice
  • Nov 24, 2016
  • Psychoneuroendocrinology
  • Jennifer Lymer + 3 more


  • Research Article
  • Citations: 147
  • 10.1016/s0042-6989(99)00134-0
Perceptual learning in object recognition: object specificity and size invariance
  • Jan 27, 2000
  • Vision Research
  • Christopher S Furmanski + 1 more


  • Research Article
  • Citations: 30
  • 10.1007/s00521-019-04200-1
Granulated deep learning and Z-numbers in motion detection and object recognition
  • May 2, 2019
  • Neural Computing and Applications
  • Sankar K Pal + 2 more

The article deals with the problems of motion detection, object recognition, and scene description using deep learning in the framework of granular computing and Z-numbers. Since deep learning is computationally intensive, whereas granular computing, on the other hand, leads to computation gain, a judicious integration of their merits is made so as to make the learning mechanism computationally efficient. Further, it is shown how the concept of z-numbers can be used to quantify the abstraction of semantic information in interpreting a scene, where subjectivity is of major concern, through recognition of its constituting objects. The system, thus developed, involves recognition of both static objects in the background and moving objects in foreground separately. Rough set theoretic granular computing is adopted where rough lower and upper approximations are used in defining object and background models. During deep learning, instead of scanning the entire image pixel by pixel in the convolution layer, we scan only the representative pixel of each granule. This results in a significant gain in computation time. Arbitrary-shaped and sized granules, as expected, perform better than regular-shaped rectangular granules or fixed-sized granules. The method of tracking is able to deal efficiently with various challenging cases, e.g., tracking partially overlapped objects and suddenly appeared objects. Overall, the granulated system shows a balanced trade-off between speed and accuracy as compared to pixel level learning in tracking and recognition. The concept of using Z-numbers, in providing a granulated linguistic description of a scene, is unique. This gives a more natural interpretation of object recognition in terms of certainty toward scene understanding.

  • Research Article
  • 10.32815/jpm.v5i1.1379
Seminar and Workshop on Object Recognition using Deep Learning at Sam Ratulangi University Manado
  • Jun 6, 2024
  • Jurnal Pengabdian Masyarakat
  • Christine Dewi

Purpose: This seminar and workshop aim to address the lack of understanding among students regarding object recognition with deep learning. By exploring the concepts and applications of deep learning in object detection and recognition, participants will gain insights into this crucial aspect of computer vision. Method: The event will feature lectures, practical demonstrations, and hands-on workshops conducted by experts in the field. Participants will engage in interactive sessions to deepen their understanding of convolutional neural networks and other deep learning techniques for object recognition. Practical Applications: The knowledge gained from this seminar and workshop will have practical implications across various industries, including autonomous vehicles, healthcare, security systems, and robotics. Participants will learn how to apply deep learning algorithms to solve real-world problems related to object detection and recognition. Conclusion: By the end of the seminar and workshop, participants are expected to have acquired a deeper understanding of object recognition with deep learning and its practical applications. This will contribute to bridging the gap between theoretical knowledge and real-world implementation in the field of computer vision.

  • Book Chapter
  • 10.1007/978-3-030-67148-8_7
Recognition of the Flue Pipe Type Using Deep Learning
  • Jan 1, 2021
  • Damian Węgrzyn + 2 more

This paper presents the usage of deep learning in flue pipe type recognition. The main thesis is the possibility of recognizing the type of labium based on the sound generated by the flue pipe. For the purpose of our work, we prepared a large data set of high-quality recordings, carried out in an organbuilder’s workshop. Very high accuracy has been achieved in our experiments on these data using Artificial Neural Networks (ANN), trained to recognize the details of the pipe mouth construction. The organbuilders claim that they can distinguish the pipe mouth type only by hearing it, and this is why we decided to verify if it is possible to train ANN to recognize the details of the organ pipe, as this confirms a possibility that a human sense of hearing may be trained as well. In the future, the usage of deep learning in the recognition of pipe sound parameters may be used in the voicing of the pipe organ and the selection of appropriate parameters of pipes to obtain the desired timbre.

Keywords: Flue pipe, Deep learning, Labium recognition

  • Research Article
  • Citations: 6
  • 10.1016/j.engappai.2024.109565
The docking control system of an autonomous underwater vehicle combining intelligent object recognition and deep reinforcement learning
  • Nov 1, 2024
  • Engineering Applications of Artificial Intelligence
  • Chao-Ming Yu + 1 more


  • Conference Article
  • Citations: 2
  • 10.1109/csci.2016.0160
Integrated Learning System for Object Recognition from Images Based on Convolutional Neural Network
  • Dec 1, 2016
  • Hyeok-June Jeong + 2 more

There has been an increase in the use of image processing for object recognition. However, traditional methods are not suitable for real-time systems because they cannot match human performance. Recently, deep learning with Convolutional Neural Networks has become known as a solution for image recognition. In fact, there are many strong results with deep learning in object recognition. However, it needs a large number of images to learn. In other words, it is necessary to manage images and categories. This paper proposes an integrated object recognition system which manages and learns images. This system collects images automatically in classified categories and learns images with high accuracy. Multiple on-board computers can share the proposed learning system.
