Abstract

We propose a one-step person detector for top-view omnidirectional indoor scenes based on convolutional neural networks (CNNs). While state-of-the-art person detectors reach competitive results on perspective images, the lack of CNN architectures and of training data that follow the distortion of omnidirectional images makes current approaches inapplicable to our data. The method predicts bounding boxes of multiple persons directly in omnidirectional images without perspective transformation, which reduces pre- and post-processing overhead and enables real-time performance. The basic idea is to use transfer learning to fine-tune CNNs trained on perspective images, with data augmentation techniques, for detection in omnidirectional images. We fine-tune two variants of the Single Shot MultiBox Detector (SSD): the first uses MobileNet v1 FPN as feature extractor (moSSD), the second ResNet50 v1 FPN (resSSD). Both models are pre-trained on the Microsoft Common Objects in Context (COCO) dataset. We fine-tune both models on the PASCAL VOC07 and VOC12 datasets, specifically on the class person. Random 90-degree rotation and random vertical flipping are used for data augmentation, in addition to the methods proposed for the original SSD. We reach an average precision (AP) of 67.3% with moSSD and 74.9% with resSSD on the evaluation dataset. To enhance the fine-tuning process, we add a subset of the HDA Person dataset and a subset of the PIROPO database and reduce the perspective images to PASCAL VOC07. The AP rises to 83.2% for moSSD and 86.3% for resSSD, respectively. The average inference speed is 28 ms per image for moSSD and 38 ms per image for resSSD on an Nvidia Quadro P6000. Our method is applicable to other CNN-based object detectors and can potentially generalize to detecting other objects in omnidirectional images.
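The abstract's augmentation scheme (random 90-degree rotation plus random vertical flipping) must transform the bounding boxes together with the image. The following is a minimal NumPy sketch of that idea, not the authors' implementation: it assumes square top-view frames and normalized `[ymin, xmin, ymax, xmax]` boxes, and all function names are illustrative.

```python
import numpy as np

def rot90_boxes(boxes):
    """Rotate normalized [ymin, xmin, ymax, xmax] boxes by 90 degrees
    counter-clockwise, matching np.rot90 applied to a square image.
    A CCW quarter turn maps a normalized point (y, x) to (1 - x, y)."""
    ymin, xmin, ymax, xmax = boxes.T
    return np.stack([1.0 - xmax, ymin, 1.0 - xmin, ymax], axis=1)

def flip_boxes_vertical(boxes):
    """Flip normalized boxes top-to-bottom, matching image[::-1]."""
    ymin, xmin, ymax, xmax = boxes.T
    return np.stack([1.0 - ymax, xmin, 1.0 - ymin, xmax], axis=1)

def augment(image, boxes, rng):
    """Apply a random multiple-of-90-degree rotation and, with
    probability 0.5, a vertical flip to a square image and its boxes."""
    k = int(rng.integers(0, 4))          # 0, 90, 180, or 270 degrees
    image = np.rot90(image, k)
    for _ in range(k):
        boxes = rot90_boxes(boxes)
    if rng.random() < 0.5:
        image = image[::-1]              # flip rows (top <-> bottom)
        boxes = flip_boxes_vertical(boxes)
    return image, boxes
```

Because the frames are square and top-view, these transforms preserve the radial distortion pattern of the omnidirectional image, which is why they are safe augmentations here while arbitrary-angle crops from perspective pipelines would not be.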

Highlights

  • Convolutional neural networks (CNNs) have been successfully investigated for several computer vision tasks in recent years

  • Since we have only one class with around 10,000 images in each training dataset, while the models are designed for larger tasks such as Common Objects in Context (COCO) or PASCAL VOC, their capacity far exceeds what our data can provide

  • Towards the end of training, the model severely overfits the training dataset, resulting in lower generalization ability

Summary

Introduction

Convolutional neural networks (CNNs) have been successfully investigated for several computer vision tasks in recent years, among them the detection of objects in images. A main requirement for object detection with current CNNs is accurate real-world training data. Object detection in indoor scenes with a limited number of image sensors can be achieved with images from omnidirectional cameras: with a field of view of about 180°, these cameras are suited to capturing an entire room with a single sensor. Our goal is to detect objects in omnidirectional image data of indoor scenes with a detector trained on a mixture of perspective and omnidirectional images. Due to the lack of annotated training data for objects in omnidirectional images, we choose publicly available perspective datasets [1, 2] and the sparsely available omnidirectional images with bounding-box ground truth [3, 4, 5].
