Abstract

Current two-stage object detectors extract the local visual features of Regions of Interest (RoIs) for object recognition and bounding-box regression. However, relying only on local visual features loses global contextual dependencies, which help to recognize objects with featureless appearances and to suppress false detections. To tackle this problem, a simple framework, named Global Contextual Dependency Network (GCDN), is presented to enhance the classification ability of two-stage detectors. Our GCDN mainly consists of two components: a Context Representation Module (CRM) and a Context Dependency Module (CDM). Specifically, the CRM constructs multi-scale context representations, so that contextual information can be explored at different scales. The CDM is designed to capture global contextual dependencies, and our GCDN includes multiple CDMs. Each CDM uses local RoI features and a single-scale context representation to generate single-scale contextual RoI features via the attention mechanism. Finally, the contextual RoI features generated independently by the parallel CDMs are combined with the original RoI features to help classification. Experiments on the MS-COCO 2017 benchmark dataset show that our approach brings consistent improvements for two-stage detectors.

Highlights

  • Object detection aims at locating and recognizing object instances from predefined object categories [1] (Internet 2022, 14, 27. https://doi.org/)

  • Our Global Contextual Dependency Network (GCDN) improves Average Precision (AP) by 1.5% and 1.2% on the MS-COCO 2017 benchmark dataset [11] with ResNet-50 for Feature Pyramid Network (FPN) and Mask

  • We present a novel Global Contextual Dependency Network (GCDN), as a plug-and-play component, to boost the classification ability of two-stage detectors. A Context Representation Module (CRM) is proposed to construct multi-scale context representations, and a Context Dependency Module (CDM) is designed to capture global contextual dependencies. Our proposed GCDN significantly improves detection performance and is easy to implement

Summary

Introduction

Object detection aims at locating and recognizing object instances from predefined object categories [1]. We present a simple Global Contextual Dependency Network (GCDN), which captures global contextual dependencies over local visual features to further enhance the local RoI feature representations. To capture these dependencies, we use the attention mechanism and design a Context Dependency Module (CDM). Our GCDN consists of multiple CDMs. Each CDM generates single-scale contextual RoI features from the local RoI features and a single-scale context representation via affinity computation and context aggregation. Our contributions are as follows: we present a novel GCDN, as a plug-and-play component, to boost the classification ability of two-stage detectors; a Context Representation Module (CRM) is proposed to construct multi-scale context representations, and a CDM is designed to capture global contextual dependencies; and our proposed GCDN significantly improves detection performance and is easy to implement.
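The pipeline summarized above (context representation, affinity computation, context aggregation, feature fusion) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. Several choices are assumptions made only for illustration: the CRM is approximated by average pooling the feature map into an s x s grid, the affinity is a scaled dot product followed by a softmax, and fusion is an element-wise sum. The function names `avg_pool_grid`, `context_dependency`, and `gcdn_fuse` are hypothetical.

```python
import math


def avg_pool_grid(feature_map, scale):
    """Assumed CRM: pool an H x W x C map into scale*scale context vectors.

    Requires scale <= H and scale <= W so that every grid cell is non-empty.
    """
    h, w, c = len(feature_map), len(feature_map[0]), len(feature_map[0][0])
    cells = []
    for gy in range(scale):
        y0, y1 = gy * h // scale, (gy + 1) * h // scale
        for gx in range(scale):
            x0, x1 = gx * w // scale, (gx + 1) * w // scale
            n = (y1 - y0) * (x1 - x0)
            cells.append([
                sum(feature_map[y][x][k]
                    for y in range(y0, y1) for x in range(x0, x1)) / n
                for k in range(c)
            ])
    return cells


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_dependency(roi_feature, context):
    """Assumed CDM for one RoI at one scale."""
    d = len(roi_feature)
    # Affinity computation: scaled dot product between the RoI feature
    # and every context vector, normalized with a softmax.
    scores = [sum(q * k for q, k in zip(roi_feature, ctx)) / math.sqrt(d)
              for ctx in context]
    weights = softmax(scores)
    # Context aggregation: affinity-weighted sum of the context vectors.
    return [sum(wt * ctx[k] for wt, ctx in zip(weights, context))
            for k in range(d)]


def gcdn_fuse(roi_feature, feature_map, scales=(1, 2)):
    """Run one CDM per scale and fuse by element-wise sum (assumed fusion)."""
    fused = list(roi_feature)
    for s in scales:
        contextual = context_dependency(roi_feature,
                                        avg_pool_grid(feature_map, s))
        fused = [f + c for f, c in zip(fused, contextual)]
    return fused


# Toy input: a 4 x 4 feature map with 2 channels and one 2-d RoI feature.
feature_map = [[[float(y + x), 1.0] for x in range(4)] for y in range(4)]
roi = [0.5, 0.5]
fused_feature = gcdn_fuse(roi, feature_map)
```

Each scale is processed independently of the others, which mirrors the parallel CDMs described above; a real detector would operate on batched tensors and would typically apply learned projections before the affinity step.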

Object Detection
Context Dependency for Object Detection
Global Contextual Dependency Network
Context Representation Module
Context Dependency Module
Affinity Computation
Context Aggregation
Feature Fusion
Experiments
Implementation Details
Comparisons with Baselines
Context Operations
Pyramid Scales
Lite Version
Findings
Conclusions
