Abstract

Automatic 3-D scene reconstruction is a useful technique in modern intelligent systems. Reconstructing a scene from a video sequence requires selecting a set of representative video frames. Most previous work employs content-based techniques to extract key frames automatically. These methods ignore the geo-information associated with each frame and can be computationally heavy. In this paper, we propose a new key frame selection scheme based on the geographic cues of the video. The popularity of smartphones has driven a growing volume of user-generated video, and it is now convenient to acquire and fuse various sensor data (e.g., geo-spatial metadata) to create geo-tagged mobile videos. Today, large repositories of media content are automatically geo-tagged. Our proposed technique uses this underlying geo-metadata to select the most representative frames. We first eliminate irrelevant frames in which the candidate 3-D object does not appear. Then, a fixed number of key frames is selected such that they maximally cover the candidate 3-D object or scene. Comprehensive experiments demonstrate the high quality of the reconstructed 3-D objects. Moreover, the execution time is reduced by 90%.
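
The two-stage selection described above (discard frames that do not see the object, then pick a fixed number of frames that cover it as fully as possible) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Frame` fields, the 2-D sector-shaped field-of-view test, the discretization of viewing directions into angular bins, and the greedy coverage heuristic.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Hypothetical per-frame geo-metadata in local planar coordinates."""
    index: int
    x: float         # camera position, metres east
    y: float         # camera position, metres north
    heading: float   # compass direction of the camera, degrees clockwise from north
    fov: float       # horizontal viewable angle, degrees
    max_dist: float  # assumed maximum visible distance, metres


def bearing(src, dst):
    """Compass bearing in [0, 360) from point src to point dst."""
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0


def sees(frame, target):
    """True if the target lies inside the frame's sector-shaped field of view."""
    if math.hypot(target[0] - frame.x, target[1] - frame.y) > frame.max_dist:
        return False
    # Smallest absolute angle between the camera heading and the target bearing.
    diff = abs((bearing((frame.x, frame.y), target) - frame.heading + 180.0) % 360.0 - 180.0)
    return diff <= frame.fov / 2.0


def covered_bins(frame, target, n_bins=36, spread=1):
    """Angular bins around the target that this frame is assumed to cover."""
    side = bearing(target, (frame.x, frame.y))  # side of the object facing the camera
    centre = int(side // (360.0 / n_bins)) % n_bins
    return {(centre + d) % n_bins for d in range(-spread, spread + 1)}


def select_key_frames(frames, target, k, n_bins=36, spread=1):
    """Stage 1: drop frames that do not see the target.
    Stage 2: greedily pick up to k frames that add the most uncovered bins."""
    candidates = [f for f in frames if sees(f, target)]
    cover = {f.index: covered_bins(f, target, n_bins, spread) for f in candidates}
    chosen, covered = [], set()
    while candidates and len(chosen) < k:
        best = max(candidates, key=lambda f: len(cover[f.index] - covered))
        if not cover[best.index] - covered:
            break  # no remaining frame adds new coverage
        chosen.append(best)
        covered |= cover[best.index]
        candidates.remove(best)
    return chosen
```

Because both stages in this sketch use only positions and angles, no pixel data is touched during selection. The greedy step is a standard approximation for maximum-coverage problems and stands in for whatever optimization the paper actually uses.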
