Vanishing Point Detection and Rail Segmentation Based on Deep Multi-Task Learning

Xingxin Li, Yanqin Wan, Zujun Yu, Baoqing Guo, Liqiang Zhu

doi:10.1109/access.2020.3019318

Open Access

Abstract

In modern railway systems, video surveillance and machine vision analysis have been widely used to detect perimeter intrusions. For pan-tilt-zoom (PTZ) cameras, the machine vision system needs to detect adjustments of the camera and then automatically determine the new alarm region in real time. In this paper, we propose a deep multi-task learning based algorithm for simultaneous vanishing point (VP) detection and rail segmentation, which can identify camera adjustment from changes in the VP and then automatically determine the alarm region from the segmented rails. The multi-task neural network consists of a feature extraction base network and three sub-task networks. The first sub-task network is a convolutional regression network for VP detection. The second sub-task network uses an encoder-decoder structure for vanishing region (VR, a fixed region centered on the VP) segmentation. The third sub-task network shares the encoder-decoder structure with the VR segmentation task and is used for rail segmentation. The VR segmentation task is activated only at the training stage, serving as an auxiliary task to enhance feature learning and increase VP detection accuracy. To further improve the accuracy of VP detection and rail segmentation, low-level features are modulated by high-level semantic information before being fed to the decoder stage. With the help of shared feature extraction and auxiliary training, the proposed VP prediction method needs only a very small training dataset and outperforms other methods in both efficiency and accuracy.
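Two ideas from the abstract can be illustrated with a minimal sketch: building a vanishing region as a fixed region centered on the VP, and flagging a camera adjustment from a change in the VP. The abstract does not give the region's shape or size, nor the rule used to decide that the VP has changed, so the square region, the Euclidean distance test, and the names `vanishing_region_mask`, `camera_adjusted`, `half_size`, and `threshold_px` are all assumptions made for illustration.

```python
from math import hypot


def vanishing_region_mask(vp, width, height, half_size):
    """Return a binary mask (list of rows) that is 1 inside a square of side
    2 * half_size + 1 centered on the vanishing point, clipped to the image.
    The square shape and half_size are assumptions, not values from the paper."""
    vx, vy = vp
    x0, x1 = max(0, vx - half_size), min(width - 1, vx + half_size)
    y0, y1 = max(0, vy - half_size), min(height - 1, vy + half_size)
    return [
        [1 if (x0 <= x <= x1 and y0 <= y <= y1) else 0 for x in range(width)]
        for y in range(height)
    ]


def camera_adjusted(prev_vp, new_vp, threshold_px):
    """Flag a PTZ adjustment when the VP moves farther than threshold_px.
    The distance test and threshold are hypothetical stand-ins."""
    dx = new_vp[0] - prev_vp[0]
    dy = new_vp[1] - prev_vp[1]
    return hypot(dx, dy) > threshold_px


mask = vanishing_region_mask(vp=(2, 2), width=5, height=5, half_size=1)
small_shift = camera_adjusted((100, 50), (103, 54), threshold_px=10)
large_shift = camera_adjusted((100, 50), (130, 90), threshold_px=10)
```

In a deployed system the threshold would presumably be tuned against the VP detector's own error, so that prediction jitter is not mistaken for a camera move.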

Highlights

The video surveillance system (VSS) is an important subsystem of modern railways, which are susceptible to many types of intruding foreign objects, e.g., trespassing passengers and terrorists, landslides, or cargo falling from overhead bridges
Many cameras installed along railway lines are pan-tilt-zoom (PTZ) cameras whose monitoring scenes may be adjusted from time to time by different staff, so it is desirable for the VSS to detect a change of monitoring scene and determine the alarm region automatically in real time
We propose a deep multi-task learning framework for simultaneous vanishing point (VP) detection and rail segmentation

Summary

INTRODUCTION

The video surveillance system (VSS) is an important subsystem of modern railways, which are susceptible to many types of intruding foreign objects, e.g., trespassing passengers and terrorists, landslides, or cargo falling from overhead bridges. Complications arise from low image quality under unstable illumination, the non-Manhattan lines typical of curved railway sections, and occlusion by running trains. In these situations, directly predicting the VP from handcrafted or CNN-output features is unstable and has poor accuracy, making the training of the underlying algorithm non-trivial. We propose a deep multi-task learning framework for simultaneous VP detection and rail segmentation. This framework can improve feature representation and model generalization through complementary information from related tasks. The contributions of this paper include: 1) A deep multi-task learning framework integrating a regression task and a segmentation task is proposed to detect the VP and rails in a single forward pass.
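The routing implied by this framework (shared features computed once, a VP head and a rail head run on them, and the auxiliary VR head used only in training) can be sketched as follows. This is a structural illustration only, not the authors' network: the heads are toy stand-in functions, the weighted-sum loss and its weights are assumptions, and the sketch does not model the encoder-decoder that the VR and rail tasks share. All names (`forward_pass`, `total_loss`, `toy_base`, and so on) are hypothetical.

```python
def forward_pass(image, base, vp_head, rail_head, vr_head, training=False):
    """Run the shared base once, then the task heads on the same features."""
    features = base(image)
    outputs = {"vp": vp_head(features), "rails": rail_head(features)}
    if training:
        # Auxiliary vanishing-region head: evaluated only while training.
        outputs["vr"] = vr_head(features)
    return outputs


def total_loss(losses, weights):
    """Weighted sum over whichever task losses are present (assumed form)."""
    return sum(weights[name] * value for name, value in losses.items())


# Toy stand-ins so the sketch runs without a deep-learning library.
calls = {"base": 0}


def toy_base(image):
    calls["base"] += 1
    return [sum(row) for row in image]  # one "feature" per image row


def toy_vp_head(features):
    return (features.index(max(features)), 0)


def toy_rail_head(features):
    return [value > 0 for value in features]


def toy_vr_head(features):
    return [value == max(features) for value in features]


image = [[0, 0], [1, 2], [0, 1]]
heads = (toy_base, toy_vp_head, toy_rail_head, toy_vr_head)
inference = forward_pass(image, *heads)
training = forward_pass(image, *heads, training=True)
```

Because the VR head is skipped at inference, it adds no cost at run time; its only effect is on the features learned during training.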

RELATED WORKS

VR SEGMENTATION

EXPERIMENTAL RESULTS AND MODEL ANALYSIS