Open-world object detection: A solution based on reselection mechanism and feature disentanglement

Abstract

Traditional object detection algorithms operate under a closed-set assumption, yet their training data may not cover every object encountered in the real world. Open-world object detection has therefore attracted significant attention. It faces two major challenges: “neglecting unknown objects” and “misclassifying unknown objects as known ones.” We address these challenges by using the outputs of the Region Proposal Network (RPN) to identify potential unknown objects: proposals with high objectness scores that do not overlap with ground-truth annotations. We introduce a reselection mechanism that separates unknown objects from the background. We then employ the simulated annealing algorithm to disentangle the features of unknown and known classes, guiding the detector’s learning process. Our method improves multiple evaluation metrics, including U-mAP, U-recall, and UDP, and substantially alleviates both challenges of open-world object detection.
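
The candidate-selection step described in the abstract (high-scoring RPN proposals that do not overlap any ground-truth box) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `[x1, y1, x2, y2]` box format, and the `score_thresh`, `iou_thresh`, and `top_k` parameters are all assumptions introduced here, and the reselection and simulated annealing stages are not covered.

```python
import numpy as np

def iou_matrix(boxes_a, boxes_b):
    """Pairwise IoU between two sets of [x1, y1, x2, y2] boxes with positive area."""
    a = boxes_a[:, None, :]  # shape (N, 1, 4)
    b = boxes_b[None, :, :]  # shape (1, M, 4)
    inter_w = np.clip(np.minimum(a[..., 2], b[..., 2]) - np.maximum(a[..., 0], b[..., 0]), 0, None)
    inter_h = np.clip(np.minimum(a[..., 3], b[..., 3]) - np.maximum(a[..., 1], b[..., 1]), 0, None)
    inter = inter_w * inter_h
    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])
    area_b = (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area_a + area_b - inter)

def select_unknown_candidates(proposals, scores, gt_boxes,
                              score_thresh=0.9, iou_thresh=0.0, top_k=5):
    """Return indices of proposals treated as potential unknown objects.

    Hypothetical rule: keep proposals whose objectness score is at least
    score_thresh and whose best IoU with any ground-truth box is at most
    iou_thresh, then keep the top_k highest-scoring ones.
    """
    if len(gt_boxes) == 0:
        max_iou = np.zeros(len(proposals))
    else:
        max_iou = iou_matrix(proposals, gt_boxes).max(axis=1)
    keep = np.flatnonzero((scores >= score_thresh) & (max_iou <= iou_thresh))
    order = np.argsort(-scores[keep])  # highest objectness first
    return keep[order][:top_k]

proposals = np.array([[0, 0, 10, 10], [20, 20, 30, 30],
                      [5, 5, 15, 15], [40, 40, 50, 50]], dtype=float)
scores = np.array([0.95, 0.92, 0.99, 0.50])
gt_boxes = np.array([[0, 0, 10, 10]], dtype=float)
print(select_unknown_candidates(proposals, scores, gt_boxes))  # -> [1]
```

In this toy input, proposal 0 coincides with an annotated object, proposal 2 partially overlaps it, and proposal 3 has a low score, so only proposal 1 remains as a candidate unknown.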

Similar Papers
  • Conference Article
  • 10.1109/iros47612.2022.9981886
New Objects on the Road? No Problem, We'll Learn Them Too
  • Oct 23, 2022
  • Deepak Kumar Singh + 7 more

Object detection plays an essential role in providing localization, path planning, and decision-making capabilities in autonomous navigation systems. However, existing object detection models are trained and tested on a fixed number of known classes. This setting makes it difficult for the model to generalize to real-world road scenarios in which it encounters unknown objects. We address this problem by introducing a framework that detects unknown objects and updates the model when labels for those objects become available. Our solution includes three major components that address the inherent problems present in road scene datasets: a) Feature-Mix, which improves unknown object detection by widening the gap between known and unknown classes in the latent feature space, b) a focal regression loss, which addresses small object detection and intra-class scale variation, and c) curriculum learning, which further enhances the detection of small objects. We use the Indian Driving Dataset (IDD) and the Berkeley Deep Drive (BDD) dataset for evaluation. Our solution provides state-of-the-art performance on open-world evaluation metrics. We hope this work will create new directions for open-world object detection in road scenes, leading to more reliable and robust autonomous systems.

  • Research Article
  • Cited by 1
  • 10.37256/aie.4220233058
A Framework for Open World Object Detection
  • Aug 22, 2023
  • Artificial Intelligence Evolution
  • Khadija Shaheen + 3 more

Open World Object Detection (OWOD) is a computer vision task that focuses on real-world scenarios where object detection algorithms need to not only detect known and labeled objects but also handle novel and unknown objects that were not seen during training. This distinguishes OWOD from traditional object detection benchmarks, where the scope is limited to detecting only known object classes. The main challenge in OWOD lies in detecting and classifying unknown objects, which were not part of the training data. In standard object detection, objects not overlapping with labeled objects are automatically classified as background. However, these approaches are not suitable for OWOD, as unknown objects may be wrongly predicted as background due to the lack of specific supervision for distinguishing unknown objects from the background. The paper proposes a novel framework for Open World Object Detection called Open World Object Detection based on Non-Parametric classification (OWOD-NP). This method aims to address the challenges of identifying unknown objects and extending the knowledge base by incrementally introducing new object categories. OWOD-NP incorporates a non-parametric learning approach based on mean prototypes and rejection criteria into a standard detector model. The non-parametric learning model allows the system to detect whether the perceived region contains an unknown object and perform incremental learning in an end-to-end manner. The extensive experiments conducted on the benchmark dataset of Pascal Visual Object Classes (VOC) validate the effectiveness of OWOD-NP. Compared to the standard Faster R-CNN model, OWOD-NP achieves approximately 14% higher mean Average Precision (mAP) in class incremental scenarios. This improvement showcases the capability of OWOD-NP to handle open-world object detection tasks more efficiently.
By combining non-parametric learning with object detection, OWOD-NP provides a promising solution for open-world scenarios, where the environment is dynamic and new objects may appear over time. The ability to detect and classify both known and unknown objects makes OWOD-NP a valuable approach for real-world applications in robotics, autonomous systems, and other computer vision tasks. It allows for continuous adaptation and learning, enabling the system to extend its knowledge and cope with ever-changing environments effectively.
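
The idea of classifying with mean prototypes and a rejection criterion can be illustrated with a small nearest-class-mean sketch. It is only a generic example of the technique named in the abstract, not OWOD-NP itself: the function names, the use of Euclidean distance, the fixed `reject_dist` threshold, and the `-1` label for "unknown" are assumptions made here.

```python
import numpy as np

def class_prototypes(features, labels):
    """Compute one mean feature vector (prototype) per known class."""
    classes = np.unique(labels)
    protos = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, protos

def classify_with_rejection(x, classes, protos, reject_dist):
    """Assign x to the nearest prototype, or return -1 (unknown) if it is too far."""
    dists = np.linalg.norm(protos - x, axis=1)
    nearest = int(np.argmin(dists))
    if dists[nearest] > reject_dist:
        return -1  # hypothetical rejection rule: no known class is close enough
    return int(classes[nearest])

features = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]])
labels = np.array([0, 0, 1, 1])
classes, protos = class_prototypes(features, labels)

print(classify_with_rejection(np.array([0.0, 1.5]), classes, protos, reject_dist=3.0))  # -> 0
print(classify_with_rejection(np.array([5.0, 6.0]), classes, protos, reject_dist=3.0))  # -> -1
```

One appeal of this kind of classifier for incremental settings is that a new class can be added by computing one more prototype, without retraining the existing ones.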

  • Research Article
  • 10.3390/app132312806
A Parallel Open-World Object Detection Framework with Uncertainty Mitigation for Campus Monitoring
  • Nov 29, 2023
  • Applied Sciences
  • Jian Dong + 7 more

The recent advancements in artificial intelligence have brought about significant changes in education. In the context of intelligent campus development, target detection technology plays a pivotal role in applications such as campus environment monitoring and the facilitation of classroom behavior surveillance. However, traditional object detection methods face challenges in open and dynamic campus scenarios where unexpected objects and behaviors arise. Open-World Object Detection (OWOD) addresses this issue by enabling detectors to gradually learn and recognize unknown objects. Nevertheless, existing OWOD methods introduce two major uncertainties that limit the detection performance: the unknown discovery uncertainty from the manual generation of pseudo-labels for unknown objects and the known discrimination uncertainty from perturbations that unknown training introduces to the known class features. In this paper, we introduce a Parallel OWOD Framework with Uncertainty Mitigation to alleviate the unknown discovery uncertainty and the known discrimination uncertainty within the OWOD task. To address the unknown discovery uncertainty, we propose an objectness-driven discovery module to focus on capturing the generalized objectness shared among various known classes, driving the framework to discover more potential objects that are distinct from the background, including unknown objects. To mitigate the discrimination uncertainty, we decouple the learning processes for known and unknown classes through a parallel structure to reduce the mutual influence at the feature level and design a collaborative open-world classifier to achieve high-performance collaborative detection of both known and unknown classes. Our framework provides educators with a powerful tool for effective campus monitoring and classroom management. 
Experimental results on standard benchmarks demonstrate the framework’s superior performance compared to state-of-the-art methods, showcasing its transformative potential in intelligent educational environments.

  • Research Article
  • Cited by 2
  • 10.3390/app13105896
A Few-Shot Defect Detection Method for Transmission Lines Based on Meta-Attention and Feature Reconstruction
  • May 10, 2023
  • Applied Sciences
  • Yundong Shi + 3 more

In transmission line defect detection, traditional object detection algorithms are ineffective because few training samples of defective components are available. Meta-learning uses multi-task learning and fine-tuning to learn features common to different tasks; it adapts quickly to new tasks, performs well in few-shot object detection, and generalizes well to new tasks. For this reason, we propose a few-shot defect detection method for transmission lines (Meta PowerNet), based on meta-learning, with a Meta-Attention RPN and a Feature Reconstruction Module. First, in the region proposal stage, a new region proposal network (Meta-Attention Region Proposal Network, MA-RPN) is designed to fuse the support set features and the query set features to filter the noise in anchor boxes. In addition, it can focus on the subtle texture features of smaller objects by fusing low-level features from the query set. Second, in the meta-feature construction stage, we design a meta-learner with the defect feature reconstruction module at its core to capture and focus on the defect-related feature channels. The experimental results show that, with only 30 training objects for the various types of component defects, the method achieves 72.5% detection accuracy for component defects, a significant improvement over other mainstream few-shot object detection methods. Meanwhile, the MA-RPN designed in this paper can be used in other meta-learning object detection models.

  • Research Article
  • Cited by 392
  • 10.1007/s11042-020-08976-6
A review of object detection based on deep learning
  • Jun 12, 2020
  • Multimedia Tools and Applications
  • Youzi Xiao + 6 more

With the rapid development of deep learning techniques, deep convolutional neural networks (DCNNs) have become more important for object detection. Compared with traditional handcrafted feature-based methods, the deep learning-based object detection methods can learn both low-level and high-level image features. The image features learned through deep learning techniques are more representative than the handcrafted features. Therefore, this review paper focuses on the object detection algorithms based on deep convolutional neural networks, while the traditional object detection algorithms will be simply introduced as well. Through the review and analysis of deep learning-based object detection techniques in recent years, this work includes the following parts: backbone networks, loss functions and training strategies, classical object detection architectures, complex problems, datasets and evaluation metrics, applications and future development directions. We hope this review paper will be helpful for researchers in the field of object detection.

  • Conference Article
  • Cited by 1
  • 10.1117/12.2302964
Fast object detection algorithm based on HOG and CNN
  • Apr 10, 2018
  • Yanduo Zhang + 2 more

In computer vision, object classification and object detection are widely used in many fields. Traditional object detection has two main problems: first, the sliding-window region selection strategy has high time complexity and produces redundant windows; second, the features are not robust. To solve these problems, a Region Proposal Network (RPN) is used to select candidate regions instead of the selective search algorithm. Compared with traditional algorithms and selective search, the RPN has higher efficiency and accuracy. We combine HOG features with a convolutional neural network (CNN) to extract features, and we use an SVM for classification. For TorontoNet, our algorithm's mAP is 1.6 percentage points higher. For OxfordNet, our algorithm's mAP is 1.3 percentage points higher.

  • Book Chapter
  • Cited by 1
  • 10.1007/978-3-031-21244-4_12
Uncertainty-Aware Deep Open-Set Object Detection
  • Jan 1, 2022
  • Qi Hang + 3 more

Open-set object detection better simulates the real world compared with closed-set object detection. Besides the classes of interest, it also pays attention to unknown objects in the environment. We extend the previous concept of open-set object detection, aiming to detect both known and unknown objects. Because unknown objects have different textural features from known classes and the background, we assume that detecting unknown instances will generate high uncertainty. Therefore, in this paper, we propose an uncertainty-aware open-set object detection framework based on Faster R-CNN. We introduce evidential deep learning to the field of object detection to estimate the uncertainty of the predictions and perform more accurate classification in open-set conditions. The obtained uncertainty is used to pseudo-label unknown instances in the training data. We also introduce a contrastive clustering module to separate the feature representations of each class during the training phase. We set an uncertainty-based unknown identifier at the inference phase to enhance the generalization of the detector. We conduct experiments on three different data splits, and our method outperforms the recent SOTA method. We also demonstrate in our ablation studies that each component of our method is effective and indispensable.

Keywords: Uncertainty; Evidential deep learning; Open-set object detection
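
As a rough illustration of how evidential deep learning turns per-class evidence into an uncertainty value, the sketch below uses the subjective-logic formulation commonly associated with evidential classifiers: non-negative evidence e_k gives Dirichlet parameters alpha_k = e_k + 1, belief masses b_k = e_k / S, and uncertainty u = K / S with S the sum of the alpha_k. The abstract does not state which formulation the paper uses, so this choice and the function name `evidential_uncertainty` are assumptions.

```python
import numpy as np

def evidential_uncertainty(evidence):
    """Map non-negative per-class evidence to belief masses and an uncertainty score.

    Assumed formulation: alpha = evidence + 1, S = sum(alpha),
    belief = evidence / S, uncertainty = K / S, so beliefs and
    uncertainty together sum to 1.
    """
    evidence = np.asarray(evidence, dtype=float)
    alpha = evidence + 1.0
    strength = alpha.sum(axis=-1, keepdims=True)
    belief = evidence / strength
    uncertainty = evidence.shape[-1] / strength[..., 0]
    return belief, uncertainty

# No evidence for any of three classes: maximal uncertainty.
print(evidential_uncertainty([0.0, 0.0, 0.0])[1])  # -> 1.0
# Strong evidence for the first class: low uncertainty.
print(evidential_uncertainty([9.0, 0.0, 0.0])[1])  # -> 0.25
```

Under this formulation, a detection with little evidence for every known class receives an uncertainty close to 1, which is the kind of signal that could be thresholded to flag unknown instances.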

  • Research Article
  • Cited by 2
  • 10.62762/tetai.2024.320179
Real-Time Object Detection Using a Lightweight Two-Stage Detection Network with Efficient Data Representation
  • Apr 20, 2024
  • IECE Transactions on Emerging Topics in Artificial Intelligence
  • Shaohuang Wang

In this paper, a novel fast object detection framework is introduced, designed to meet the needs of real-time applications such as autonomous driving and robot navigation. Traditional processing methods often trade off between accuracy and processing speed. To address this issue, a hybrid data representation method is proposed that combines the computational efficiency of voxelization with the detail capture capability of direct data processing to optimize overall performance. The detection framework comprises two main components: a Rapid Region Proposal Network (RPN) and a Refinement Detection Network (RefinerNet). The RPN is used to generate high-quality candidate regions, while the RefinerNet performs detailed analysis on these regions to improve detection accuracy. Additionally, a variety of network optimization techniques have been implemented, including lightweight network layers, network pruning, and model quantization, to increase processing speed and reduce computational resource consumption. Extensive testing on the KITTI and NEXET datasets has proven the effectiveness of this method in enhancing the accuracy of object detection and real-time processing speed. The experimental results show that, compared to existing technologies, this method performs exceptionally well across multiple evaluation metrics, especially in meeting the stringent requirements of real-time applications in terms of processing speed.

  • Research Article
  • Cited by 66
  • 10.3390/rs14030536
Real-Time Detection of Full-Scale Forest Fire Smoke Based on Deep Convolution Neural Network
  • Jan 23, 2022
  • Remote Sensing
  • Xin Zheng + 4 more

To reduce the loss induced by forest fires, it is very important to detect forest fire smoke in real time so that early and timely warnings can be issued. Machine vision and image processing technology is widely used for detecting forest fire smoke. However, most traditional image detection algorithms require manual extraction of image features and, thus, are not real-time. This paper evaluates the effectiveness of using deep convolutional neural networks to detect forest fire smoke in real time. The evaluated detection algorithms include EfficientDet (EfficientDet: Scalable and Efficient Object Detection), Faster R-CNN (Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks), YOLOv3 (You Only Look Once V3), and SSD (Single Shot MultiBox Detector), all advanced convolutional neural network (CNN) models. YOLOv3 showed a detection speed of up to 27 FPS, indicating it is a real-time smoke detector. A comparison of these algorithms with existing forest fire smoke detection algorithms shows that the deep convolutional neural network algorithms achieve better smoke detection accuracy. In particular, the EfficientDet algorithm achieves an average detection accuracy of 95.7%, which is the best real-time forest fire smoke detection among the evaluated algorithms.

  • Conference Article
  • Cited by 13
  • 10.1109/itsc.2017.8317756
Saliency-guided region proposal network for CNN based object detection
  • Oct 1, 2017
  • Ann-Katrin Fattal + 3 more

Robust sensing of the environment is fundamental for driver assistance systems performing safe maneuvers. Although approaches to object detection have improved tremendously since region proposal and convolutional neural networks were combined in one framework, the detection of distant objects occupying just a few pixels in images can still be challenging. The convolutional and pooling layers reduce the image information to feature maps; yet, relevant information may be lost through pooling and convolution for small objects. In order to address this challenge, a new approach to proposing regions is presented that extends the architecture of a region proposal network by incorporating priors to guide the proposals towards regions containing potential target objects. Moreover, inspired by the concept of saliency, a saliency-based prior is chosen to guide the RPN towards important regions in order to make efficient use of differences between objects and background in an unsupervised fashion. This allows the network not only to consider local information provided by the convolutional layers, but also to take into account global information provided by the saliency priors. Experimental results based on a distant vehicle dataset and different configurations including three priors show that the incorporation of saliency-inspired priors into a region proposal network can improve its performance significantly.

  • Conference Article
  • Cited by 16
  • 10.1109/wacvw54805.2022.00030
Uncertainty Aware Proposal Segmentation for Unknown Object Detection
  • Jan 1, 2022
  • Yimeng Li + 1 more

Recent efforts in deploying Deep Neural Networks for object detection in real-world applications, such as autonomous driving, assume that all relevant object classes have been observed during training. Quantifying the performance of these models in settings where the test data is not represented in the training set has mostly focused on pixel-level uncertainty estimation techniques for models trained for semantic segmentation. This paper proposes exploiting additional predictions of semantic segmentation models and quantifying their confidences, followed by classification of object hypotheses as known vs. unknown (out-of-distribution) objects. We use object proposals generated by a Region Proposal Network (RPN) and adapt distance-aware uncertainty estimation of semantic segmentation using Radial Basis Function Networks (RBFN) for class-agnostic object mask prediction. The augmented object proposals are then used to train a classifier for known vs. unknown object categories. Experimental results demonstrate that the proposed method achieves performance on par with state-of-the-art methods for unknown object detection and can also be used effectively for reducing object detectors’ false positive rate. Our method is well suited for applications where prediction of non-object background categories obtained by semantic segmentation is reliable.

  • Research Article
  • Cited by 25
  • 10.1016/j.neunet.2021.12.003
Feature Correlation-Steered Capsule Network for object detection
  • Dec 11, 2021
  • Neural Networks
  • Zhongqi Lin + 3 more

  • Conference Article
  • Cited by 133
  • 10.1109/cvpr52688.2022.00902
OW-DETR: Open-world Detection Transformer
  • Jun 1, 2022
  • Akshita Gupta + 5 more

Open-world object detection (OWOD) is a challenging computer vision problem, where the task is to detect a known set of object categories while simultaneously identifying unknown objects. Additionally, the model must incrementally learn new classes that become known in the next training episodes. Distinct from standard object detection, the OWOD setting poses significant challenges for generating quality candidate proposals on potentially unknown objects, separating the unknown objects from the background and detecting diverse unknown objects. Here, we introduce a novel end-to-end transformer-based framework, OW-DETR, for open-world object detection. The proposed OW-DETR comprises three dedicated components namely, attention-driven pseudo-labeling, novelty classification and objectness scoring to explicitly address the aforementioned OWOD challenges. Our OW-DETR explicitly encodes multi-scale contextual information, possesses less inductive bias, enables knowledge transfer from known classes to the unknown class and can better discriminate between unknown objects and background. Comprehensive experiments are performed on two benchmarks: MS-COCO and PASCAL VOC. The extensive ablations reveal the merits of our proposed contributions. Further, our model out-performs the recently introduced OWOD approach, ORE, with absolute gains ranging from 1.8% to 3.3% in terms of unknown recall on MS-COCO. In the case of incremental object detection, OW-DETR outperforms the state-of-the-art for all settings on PASCAL VOC. Our code is available at https://github.com/akshitac8/OW-DEtr.

  • Research Article
  • Cited by 3
  • 10.1145/3554923
Roadside Unit-based Unknown Object Detection in Adverse Weather Conditions for Smart Internet of Vehicles
  • Dec 31, 2022
  • ACM Transactions on Management Information Systems
  • Yu-Chia Chen + 2 more

For Internet of Vehicles applications, reliable autonomous driving systems usually perform the majority of their computations on the cloud due to the limited computing power of edge devices. The communication delay between cloud platforms and edge devices, however, can cause dangerous consequences, particularly for latency-sensitive object detection tasks. Object detection tasks are also vulnerable to significantly degraded model performance caused by unknown objects, which creates unsafe driving conditions. To address these problems, this study develops an orchestrated system that allows real-time object detection and incrementally learns unknown objects in a complex and dynamic environment. A you-only-look-once–based object detection model in edge computing mode uses thermal images to detect objects accurately in poor lighting conditions. In addition, an attention mechanism improves the system’s performance without significantly increasing model complexity. An unknown object detector automatically classifies and labels unknown objects without direct supervision on edge devices, while a roadside unit (RSU)-based mechanism is developed to update classes and ensure a secure driving experience for autonomous vehicles. Moreover, the interactions between edge devices, RSU servers, and the cloud are designed to allow efficient collaboration. The experimental results indicate that the proposed system learns uncategorized objects dynamically and detects instances accurately.

  • Research Article
  • Cited by 30175
  • 10.1109/tpami.2016.2577031
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.
  • Jun 6, 2016
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Shaoqing Ren + 3 more

State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet [1] and Fast R-CNN [2] have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features: using the recently popular terminology of neural networks with 'attention' mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model [3], our detection system has a frame rate of 5 fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.

More from: AI Communications
  • Research Article
  • 10.3233/aic-230270
Open-world object detection: A solution based on reselection mechanism and feature disentanglement
  • Sep 18, 2024
  • AI Communications
  • Tian Lin + 3 more

