Deep learning for object recognition: A comprehensive review of models and algorithms
- Research Article
1
- 10.3390/app151810255
- Sep 20, 2025
- Applied Sciences
The early detection of fire and smoke is essential for mitigating human casualties, property damage, and environmental impact. Traditional sensor-based and vision-based detection systems frequently exhibit high false alarm rates, delayed response times, and limited adaptability in complex or dynamic environments. Recent advances in deep learning and computer vision have enabled more accurate, real-time detection through the automated analysis of flame and smoke patterns. This paper presents a comprehensive review of deep learning techniques for fire and smoke detection, with a particular focus on convolutional neural networks (CNNs), object detection frameworks such as YOLO and Faster R-CNN, and spatiotemporal models for video-based analysis. We examine the benefits of these approaches in terms of improved accuracy, robustness, and deployment feasibility on resource-constrained platforms. Furthermore, we discuss current limitations, including the scarcity and diversity of annotated datasets, susceptibility to false alarms, and challenges in generalization across varying scenarios. Finally, we outline promising research directions, including multimodal sensor fusion, lightweight edge AI implementations, and the development of explainable deep learning models. By synthesizing recent advancements and identifying persistent challenges, this review provides a structured foundation for the design of next-generation intelligent fire detection systems.
- Supplementary Content
- 10.1016/j.omtn.2025.102691
- Aug 20, 2025
- Molecular Therapy. Nucleic Acids
Advancing CRISPR with deep learning: A comprehensive review of models and databases
- Supplementary Content
1
- 10.1016/j.csbj.2025.12.033
- Jan 1, 2026
- Computational and Structural Biotechnology Journal
Machine learning for drug-target interaction prediction: A comprehensive review of models, challenges, and computational strategies
- Research Article
78
- 10.1561/2000000071
- Jan 1, 2016
- Foundations and Trends® in Signal Processing
As a major breakthrough in artificial intelligence, deep learning has achieved very impressive success in solving grand challenges in many fields including speech recognition, natural language processing, computer vision, image and video processing, and multimedia. This article provides a historical overview of deep learning and focuses on its applications in object recognition, detection, and segmentation, which are key challenges of computer vision and have numerous applications to images and videos. The discussed research topics on object recognition include image classification on ImageNet, face recognition, and video classification. The detection part covers general object detection on ImageNet, pedestrian detection, face landmark detection (face alignment), and human landmark detection (pose estimation). On the segmentation side, the article discusses the most recent progress on scene labeling, semantic segmentation, face parsing, human parsing, and saliency detection. Object recognition is considered as whole-image classification, while detection and segmentation are pixelwise classification tasks. Their fundamental differences will be discussed in this article. Fully convolutional neural networks and highly efficient forward and backward propagation algorithms specially designed for pixelwise classification tasks will be introduced. The covered application domains are also highly diverse. Human and face images have regular structures, while general object and scene images have much more complex variations in geometric structure and layout. Videos include the temporal dimension. Therefore, they need to be processed with different deep models. All the selected domain applications have received tremendous attention in the computer vision and multimedia communities. Through concrete examples of these applications, we explain the key points which make deep learning outperform conventional computer vision systems.
(1) Unlike traditional pattern recognition systems, which heavily rely on manually designed features, deep learning automatically learns hierarchical feature representations from massive training data and disentangles hidden factors of input data through multi-level nonlinear mappings. (2) Unlike existing pattern recognition systems, which sequentially design or train their key components, deep learning is able to jointly optimize all the components and create synergy through close interactions among them. (3) While most machine learning models can be approximated with neural networks with shallow structures, for some tasks, the expressive power of deep models increases exponentially as their architectures go deep. Deep models are especially good at learning global contextual feature representations with their deep structures. (4) Benefitting from the large learning capacity of deep models, some classical computer vision challenges can be recast as high-dimensional data transform problems and can be solved from new perspectives. Finally, some open questions and future work regarding deep learning in object recognition, detection, and segmentation will be discussed.
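The distinction the abstract draws between whole-image classification and pixelwise classification can be illustrated with a toy sketch. The code below is not taken from the article: the feature map, the weight matrix, and the function names `classify_image` and `classify_pixels` are hypothetical, and a single linear layer stands in for a deep model. It only shows the difference in output shape: one label per image versus one label per pixel, the latter computed as a 1x1 convolution in the fully convolutional style.

```python
import numpy as np

def classify_image(features, weights):
    """Whole-image classification: pool over all pixels, then predict one label."""
    pooled = features.mean(axis=(0, 1))      # shape (C,)
    scores = pooled @ weights                # shape (K,)
    return int(np.argmax(scores))

def classify_pixels(features, weights):
    """Pixelwise classification: apply the same linear map at every location
    (equivalent to a 1x1 convolution), then predict one label per pixel."""
    scores = features @ weights              # shape (H, W, K)
    return np.argmax(scores, axis=-1)        # shape (H, W)

# Hypothetical inputs: a 4x6 feature map with 3 channels and 5 classes.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 6, 3))
weights = rng.normal(size=(3, 5))

image_label = classify_image(features, weights)    # a single class index
pixel_labels = classify_pixels(features, weights)  # a 4x6 map of class indices
```

In this sketch both tasks share the same weights; the only change is whether spatial pooling happens before the classifier, which is why the pixelwise case keeps the spatial layout of the input.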
- Research Article
- 10.47941/ijce.3153
- Aug 14, 2024
- International Journal of Computing and Engineering
Purpose: The aim of the study was to analyze the effectiveness of deep learning in object recognition for autonomous vehicles in Japan. Methodology: This study adopted a desk methodology. A desk study research design is commonly known as secondary data collection. It involves collecting data from existing resources, and is preferred because of its low cost compared with field research. Our current study looked into already published studies and reports, as the data was easily accessed through online journals and libraries. Findings: Deep learning has proven effective in object recognition for autonomous vehicles in Japan, particularly through advanced models like CNNs and YOLO. These technologies show high accuracy in detecting pedestrians, vehicles, and other objects, even in complex environments. Integrating sensor fusion (LiDAR, radar, cameras) enhances reliability in crowded urban areas. However, challenges remain, including data annotation, real-world conditions like narrow streets, and regulatory concerns. Despite these, ongoing advancements and collaborations suggest promising prospects for the future of autonomous vehicles in Japan. Unique Contribution to Theory, Practice and Policy: The technology acceptance model (TAM), diffusion of innovations (DOI) theory, and systems theory may be used to anchor future studies on the effectiveness of deep learning in object recognition for autonomous vehicles in Japan. Practitioners should focus on improving the quality of labeled data for training purposes and employing transfer learning techniques to make models more adaptable to various situations. From a policy perspective, governments should establish clear safety standards and guidelines for the deployment of deep learning-based object recognition systems in autonomous vehicles.
- Research Article
1
- 10.1016/j.atech.2025.101400
- Dec 1, 2025
- Smart Agricultural Technology
Advances in UAV-based deep learning for cassava disease monitoring and detection: A comprehensive review of models, imaging techniques, and agricultural applications
- Research Article
- 10.1016/j.cie.2025.111725
- Feb 1, 2026
- Computers & Industrial Engineering
Deep learning approaches for weld defect detection: A comprehensive review of models, applications, and future directions
- Research Article
2
- 10.70937/jnes.v1i01.41
- Dec 17, 2024
- Innovatech Engineering Journal
Diabetes mellitus is a chronic metabolic disorder with significant global prevalence and associated healthcare burdens, necessitating early detection and effective management strategies. The integration of Machine Learning (ML) and Artificial Intelligence (AI) has revolutionized diabetes care, offering innovative approaches to prediction, monitoring, and personalized management. This study conducted a systematic review of 82 high-quality peer-reviewed articles, following the PRISMA guidelines, to provide a comprehensive evaluation of ML and AI applications in diabetes prediction and management. The review highlights the widespread adoption of supervised learning models, such as Random Forest and Support Vector Machines (SVM), which consistently demonstrate high accuracy and reliability in predicting diabetes risk. Ensemble learning methods, particularly Gradient Boosting, emerged as superior techniques for predictive performance, while deep learning models, including Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), proved effective in analyzing unstructured data such as medical images and time-series glucose data. The integration of AI into wearable devices and mobile health applications has further enhanced real-time monitoring and glycemic control, bridging the gap between technological advancements and practical healthcare solutions. Despite these advancements, challenges such as data imbalance, limited external validation, and the need for explainable AI frameworks persist, underscoring the necessity for methodological rigor and standardization. This review provides critical insights into the current state, limitations, and opportunities of ML and AI in diabetes care, emphasizing their transformative potential in addressing this global health challenge.
- Research Article
62
- 10.1016/j.psyneuen.2016.11.019
- Nov 24, 2016
- Psychoneuroendocrinology
Rapid effects of dorsal hippocampal G-protein coupled estrogen receptor on learning in female mice
- Research Article
147
- 10.1016/s0042-6989(99)00134-0
- Jan 27, 2000
- Vision Research
Perceptual learning in object recognition: object specificity and size invariance
- Research Article
30
- 10.1007/s00521-019-04200-1
- May 2, 2019
- Neural Computing and Applications
The article deals with the problems of motion detection, object recognition, and scene description using deep learning in the framework of granular computing and Z-numbers. Since deep learning is computationally intensive, whereas granular computing leads to computation gain, a judicious integration of their merits is made so as to make the learning mechanism computationally efficient. Further, it is shown how the concept of Z-numbers can be used to quantify the abstraction of semantic information in interpreting a scene, where subjectivity is of major concern, through recognition of its constituent objects. The system thus developed involves recognition of both static objects in the background and moving objects in the foreground separately. Rough set theoretic granular computing is adopted, where rough lower and upper approximations are used in defining object and background models. During deep learning, instead of scanning the entire image pixel by pixel in the convolution layer, we scan only the representative pixel of each granule. This results in a significant gain in computation time. Arbitrary-shaped and sized granules, as expected, perform better than regular-shaped rectangular granules or fixed-sized granules. The method of tracking is able to deal efficiently with various challenging cases, e.g., tracking partially overlapped objects and suddenly appearing objects. Overall, the granulated system shows a balanced trade-off between speed and accuracy as compared to pixel-level learning in tracking and recognition. The concept of using Z-numbers in providing a granulated linguistic description of a scene is unique. This gives a more natural interpretation of object recognition in terms of certainty toward scene understanding.
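The idea of scanning only one representative pixel per granule, rather than every pixel, can be sketched as follows. This is an illustration under loud assumptions, not the authors' method: the abstract describes rough-set granules of arbitrary shape and size, whereas this sketch uses fixed square granules and takes the centre pixel as the representative. The function name `granular_responses` and the `granule` parameter are hypothetical.

```python
import numpy as np

def granular_responses(image, kernel, granule=4):
    """Evaluate an odd-sized square kernel only at one representative pixel
    per granule (assumed here: the centre of each fixed square block)."""
    k = kernel.shape[0]
    padded = np.pad(image, k // 2)           # zero padding keeps borders valid
    h, w = image.shape
    responses = {}
    for top in range(0, h, granule):
        for left in range(0, w, granule):
            # Representative pixel of this granule, clipped to the image.
            row = min(top + granule // 2, h - 1)
            col = min(left + granule // 2, w - 1)
            patch = padded[row:row + k, col:col + k]
            responses[(row, col)] = float((patch * kernel).sum())
    return responses

# Hypothetical input: an 8x8 image of ones and a 3x3 kernel of ones.
image = np.ones((8, 8))
kernel = np.ones((3, 3))
responses = granular_responses(image, kernel, granule=4)
```

With these inputs the kernel is evaluated at 4 locations instead of 64, which is the kind of computation gain the abstract attributes to granulation; the accuracy trade-off it mentions depends on how well each representative pixel summarises its granule.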
- Research Article
- 10.32815/jpm.v5i1.1379
- Jun 6, 2024
- Jurnal Pengabdian Masyarakat
Purpose: This seminar and workshop aim to address the lack of understanding among students regarding object recognition with deep learning. By exploring the concepts and applications of deep learning in object detection and recognition, participants will gain insights into this crucial aspect of computer vision. Method: The event will feature lectures, practical demonstrations, and hands-on workshops conducted by experts in the field. Participants will engage in interactive sessions to deepen their understanding of convolutional neural networks and other deep learning techniques for object recognition. Practical Applications: The knowledge gained from this seminar and workshop will have practical implications across various industries, including autonomous vehicles, healthcare, security systems, and robotics. Participants will learn how to apply deep learning algorithms to solve real-world problems related to object detection and recognition. Conclusion: By the end of the seminar and workshop, participants are expected to have acquired a deeper understanding of object recognition with deep learning and its practical applications. This will contribute to bridging the gap between theoretical knowledge and real-world implementation in the field of computer vision.
- Book Chapter
- 10.1007/978-3-030-67148-8_7
- Jan 1, 2021
This paper presents the usage of deep learning in flue pipe type recognition. The main thesis is the possibility of recognizing the type of labium based on the sound generated by the flue pipe. For the purpose of our work, we prepared a large data set of high-quality recordings, carried out in an organbuilder’s workshop. Very high accuracy has been achieved in our experiments on these data using Artificial Neural Networks (ANN), trained to recognize the details of the pipe mouth construction. The organbuilders claim that they can distinguish the pipe mouth type only by hearing it, and this is why we decided to verify if it is possible to train ANN to recognize the details of the organ pipe, as this confirms a possibility that a human sense of hearing may be trained as well. In the future, the usage of deep learning in the recognition of pipe sound parameters may be used in the voicing of the pipe organ and the selection of appropriate parameters of pipes to obtain the desired timbre. Keywords: Flue pipe, Deep learning, Labium recognition
- Research Article
6
- 10.1016/j.engappai.2024.109565
- Nov 1, 2024
- Engineering Applications of Artificial Intelligence
The docking control system of an autonomous underwater vehicle combining intelligent object recognition and deep reinforcement learning
- Conference Article
2
- 10.1109/csci.2016.0160
- Dec 1, 2016
There has been an increase in the use of image processing for object recognition. However, traditional methods are not suitable for real-time systems because they cannot match human performance. Recently, deep learning with convolutional neural networks has become known as a solution for image recognition. In fact, there are many strong results with deep learning in object recognition. However, it needs a large number of images to learn. In other words, it is necessary to manage images and categories. This paper proposes an integrated object recognition system that manages and learns images. The system collects images automatically into classified categories and learns from them with high accuracy. Multiple on-board computers can share the proposed learning system.