Abstract

Spaceborne remote sensing videos are becoming indispensable resources, opening up opportunities for new remote sensing applications. Exploiting this new type of data requires sophisticated algorithms for semantic scene interpretation. The main difficulties are: 1) due to the relatively coarse spatial resolution of video acquired from space, moving objects such as cars are very difficult to detect, let alone track; and 2) camera motion complicates scene interpretation. To address these challenges, in this paper we propose a novel framework that fuses multispectral images and spaceborne videos for spatiotemporal analysis. Taking a multispectral image and a spaceborne video as input, a deep neural network is proposed that fuses them to produce a fine-resolution spatial scene labeling map. In addition, an approach is proposed to analyze activities and estimate traffic density from more than 150,000 tracklets produced by a Kanade-Lucas-Tomasi (KLT) keypoint tracker. The proposed framework is validated on data provided for the 2016 IEEE GRSS Data Fusion Contest, comprising a video acquired from the International Space Station and a DEIMOS-2 multispectral image. Both visual and quantitative analyses of the experimental results demonstrate the effectiveness of our approach.
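The abstract mentions that activity analysis is driven by tracklets from a KLT keypoint tracker. As a rough illustration only, the sketch below shows how such tracklets are commonly extracted with OpenCV's goodFeaturesToTrack and calcOpticalFlowPyrLK; the function name extract_tracklets, the parameter values, and the assumption of an already-stabilized grayscale video are illustrative choices, not the authors' implementation or settings.

```python
import cv2
import numpy as np

def extract_tracklets(video_path, max_corners=2000, min_len=5):
    """Build KLT tracklets from a video (illustrative sketch).

    Returns a list of tracklets; each tracklet is a list of (x, y)
    positions, one per frame in which the keypoint was tracked.
    """
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    if not ok:
        return []
    prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect initial keypoints (Shi-Tomasi corners).
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_corners,
                                  qualityLevel=0.01, minDistance=5)
    if pts is None:
        return []
    tracklets = [[tuple(p.ravel())] for p in pts]
    active = list(range(len(tracklets)))  # indices of still-tracked points

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # Forward KLT step: track previous points into the current frame.
        next_pts, status, _ = cv2.calcOpticalFlowPyrLK(
            prev_gray, gray, pts, None, winSize=(15, 15), maxLevel=3)

        new_pts, new_active = [], []
        for p, s, idx in zip(next_pts, status.ravel(), active):
            if s == 1:  # point tracked successfully in this frame
                tracklets[idx].append(tuple(p.ravel()))
                new_pts.append(p)
                new_active.append(idx)
        if not new_pts:
            break
        pts = np.array(new_pts, dtype=np.float32)
        active = new_active
        prev_gray = gray

    cap.release()
    # Keep only tracklets long enough to be informative for activity analysis.
    return [t for t in tracklets if len(t) >= min_len]
```

The resulting tracklets could then be aggregated per road segment to estimate traffic density, in the spirit of the analysis described above.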
