Interactive Streaming of 3D Scenes to Mobile Devices using Dual-Layer Image Warping and Loop-based Depth Reconstruction


Similar Papers
  • Research Article
  • Cited by 15
  • 10.1371/journal.pone.0055586
Remote Dynamic Three-Dimensional Scene Reconstruction
  • May 7, 2013
  • PLoS ONE
  • You Yang + 3 more

Remote dynamic three-dimensional (3D) scene reconstruction renders the motion structure of a 3D scene remotely by means of both the color video and the corresponding depth maps. It has shown great potential for telepresence applications such as remote monitoring and remote medical imaging. In this setting, video rate and high resolution are two crucial characteristics of a good depth map, yet they conflict with each other during depth-sensor capture. Recent works therefore prefer to transmit only the high-resolution color video to the terminal side and reconstruct the scene depth there by estimating motion vectors from the video, typically using propagation-based methods to reach video-rate depth reconstruction. However, in most remote transmission systems only the compressed color video stream is available. The color video restored from the stream suffers quality losses, so the extracted motion vectors are too inaccurate for depth reconstruction. In this paper, we propose a precise and robust scheme for dynamic 3D scene reconstruction that uses the compressed color video stream and its inaccurate motion vectors. Our method rectifies the inaccurate motion vectors by analyzing and compensating for their quality losses, the absence of motion vectors in spatial prediction, and dislocation in near-boundary regions. This rectification ensures that depth maps can be recovered at video rate and high resolution on the terminal side, reducing system consumption for both compression and transmission. Our experiments validate that the proposed scheme is robust for depth-map and dynamic scene reconstruction over long propagation distances, even at high compression ratios, outperforming the benchmark approaches by at least 3.3950 dB for remote applications.
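
As a rough illustration of the propagation idea this abstract builds on, the sketch below (a minimal, hypothetical NumPy version, not the authors' implementation) carries a depth map one frame forward by sampling the previous depth map at the positions referenced by per-pixel motion vectors; the paper's actual contribution, rectifying inaccurate vectors recovered from a compressed stream, is not shown.

```python
import numpy as np

def propagate_depth(prev_depth: np.ndarray, mv: np.ndarray) -> np.ndarray:
    """Propagate a depth map one frame forward using per-pixel motion vectors.

    prev_depth : (H, W) depth of frame t-1
    mv         : (H, W, 2) motion vectors (dx, dy) pointing from each pixel
                 of frame t back to its match in frame t-1
    """
    h, w = prev_depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Position in the previous frame referenced by each motion vector.
    src_x = np.clip((xs + mv[..., 0]).round().astype(int), 0, w - 1)
    src_y = np.clip((ys + mv[..., 1]).round().astype(int), 0, h - 1)
    return prev_depth[src_y, src_x]
```

With noisy or missing vectors (the case the paper addresses), errors accumulate over long propagation distances, which is why the rectification step matters.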

  • Research Article
  • 10.6100/ir716683
Acquiring 3D scene information from 2D images
  • May 1, 2004
  • Ping Li

  • Research Article
  • Cited by 2
  • 10.1049/ipr2.13144
Light field imaging technology for virtual reality content creation: A review
  • Jun 14, 2024
  • IET Image Processing
  • Ali Khan + 4 more

The light field (LF) imaging technique can capture 3D scene information in 4D by recording both the 2D intensity and the 2D direction of incoming light rays. Due to this capability, LF has attracted great interest in virtual reality (VR) and augmented reality (AR) for enhanced immersion, improved depth perception, and reconstruction of realistic 3D environments. This paper presents a comprehensive review of LF imaging technology and other approaches used for VR content creation. The applications of LF technology beyond VR and AR are also discussed, and the challenges and limitations of other approaches for VR content creation are examined. State-of-the-art research has focused on how VR experiences benefit from LF technology and has identified the challenges of creating comfortable, immersive, and realistic VR content: (1) image size and resolution, (2) processing speed, (3) precise calibration, and (4) depth reconstruction. Recommendations for creating immersive VR content are provided to enhance user experience. These recommendations aim to contribute to the development of more comfortable and realistic VR content, extending the potential applications of LF imaging technology to diverse fields.

  • Conference Article
  • Cited by 1
  • 10.1117/12.131624
Stochastic structure estimation by motion
  • Nov 1, 1992
  • Arcangelo Distante + 4 more

A field of great interest in computer vision is depth reconstruction from motion. The final goal is the computation of the visible surface structure of a 3D scene by analyzing a sequence of digital images acquired by moving a camera through the environment. This paper describes a method of depth reconstruction based on stochastic modeling of the motion, the image acquisition process, and the 3D-to-2D projection. The stochastic model is based on the well-known extended Kalman filter to derive an optimized depth estimate: it integrates successive views using a pair of optical-flow equations that we have adapted to a general pinhole camera model (a linear transformation from 3D to 2D coordinates). Compared with similar methods, our reconstruction system improves the speed and stability of the estimation process by means of a multi-scale approach, and uses a massively parallel MIMD machine to speed up the estimation globally.
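
The paper's estimator is an extended Kalman filter over optical-flow equations; as a loose, simplified sketch of the Kalman-style fusion it relies on, the snippet below applies an independent scalar measurement update per pixel (hypothetical NumPy code, not the paper's full EKF with motion state).

```python
import numpy as np

def kalman_depth_update(d_est, p_var, d_meas, r_var):
    """One scalar Kalman measurement update per pixel.

    d_est  : current depth estimate (H, W)
    p_var  : estimate variance (H, W)
    d_meas : new noisy depth measurement from one more view (H, W)
    r_var  : measurement noise variance (scalar or (H, W))
    """
    k = p_var / (p_var + r_var)           # Kalman gain
    d_new = d_est + k * (d_meas - d_est)  # fuse the new measurement
    p_new = (1.0 - k) * p_var             # uncertainty shrinks per view
    return d_new, p_new
```

Integrating successive views this way is what lets the estimate improve and stabilize as the camera moves.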

  • Conference Article
  • Cited by 11
  • 10.1109/3dtv.2007.4379475
3D Scene Reconstruction System with Hand-Held Stereo Cameras
  • May 1, 2007
  • Sangun Yun + 2 more

3D scene modeling is a challenging problem and has been one of the most important research topics for many years. In this paper, we describe a 3D scene reconstruction system that creates 3D models from multiple stereo image pairs acquired with a hand-held device. Our algorithm consists of two steps: depth reconstruction and model registration. In the first step, we obtain a depth map with stereo matching and camera geometry for each view; the algorithm is based on adaptive window methods in a hierarchical framework. In the second step, we use SIFT features to estimate the camera motion, with the LMedS algorithm reducing the effect of outliers. Experimental results show that the proposed algorithm provides accurate disparity maps for various types of images, as well as 3D models of real-world scenes.
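
A minimal sketch of the two-step pipeline described here, using off-the-shelf OpenCV components as stand-ins (SGBM instead of the paper's adaptive-window hierarchical matcher; the file names `left.png` and `right.png` are hypothetical):

```python
import cv2
import numpy as np

# Hypothetical input pair; the paper uses hand-held stereo captures.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Step 1: depth reconstruction via stereo matching (SGBM as a stand-in
# for the paper's adaptive-window hierarchical method).
sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=7)
disparity = sgbm.compute(left, right).astype(np.float32) / 16.0

# Step 2: camera motion from SIFT matches, with LMedS suppressing
# outliers, as in the paper.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(left, None)
kp2, des2 = sift.detectAndCompute(right, None)
matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.LMEDS)
```

Model registration then chains these pairwise motions to place each view's depth map in a common coordinate frame.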

  • Conference Article
  • Cited by 7
  • 10.1109/cyber.2003.1253451
Context modeling based depth image compression for distributed virtual environment
  • Dec 3, 2003
  • P Bao + 2 more

Depth images, comprising pixel intensities and a depth map, are viewed as a compact model of 3D scenes and are used in 3D image warping for distributed or collaborative virtual environments, enabling distributed rendering of complex 3D scenes at relatively low cost. The major overhead of the model is the transmission of the depth images of the initial reference view and the subsequent new views, which may be up to a few MB in size depending on the screen resolution. This paper presents efficient compression techniques specifically designed for depth images. Additionally, we experiment with the warped-image quality obtained by reducing the resolution of the depth map. We show that significant compression can be achieved at the cost of a very modest impairment in the perceptual quality of the warped image.
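
For context, the 3D image warping that depth images enable back-projects each pixel with its depth and re-projects it into the new view. A minimal forward-warp sketch in NumPy (assumed intrinsics `K` and relative pose `R`, `t`; no z-buffering or hole filling, unlike a production warper):

```python
import numpy as np

def warp_depth_image(color, depth, K, R, t):
    """Forward-warp a depth image (color + depth) to a new viewpoint.

    color : (H, W, 3) reference view
    depth : (H, W) per-pixel depth in the reference camera
    K     : (3, 3) intrinsics; R, t: rotation/translation to the new view
    """
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T
    # Back-project to 3D, move to the new camera, re-project.
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)
    proj = K @ (R @ pts + t.reshape(3, 1))
    u = (proj[0] / proj[2]).round().astype(int)
    v = (proj[1] / proj[2]).round().astype(int)
    out = np.zeros_like(color)
    ok = (u >= 0) & (u < w) & (v >= 0) & (v < h) & (proj[2] > 0)
    out[v[ok], u[ok]] = color.reshape(-1, 3)[ok]  # sketch: no z-buffering
    return out
```

Because the warp consumes the depth map pixel by pixel, depth-map resolution and compression artifacts translate directly into warped-image quality, which is the trade-off the paper measures.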

  • Research Article
  • Cited by 2
  • 10.1016/j.cag.2024.104139
Novel view synthesis with wide-baseline stereo pairs based on local–global information
  • Nov 29, 2024
  • Computers & Graphics
  • Kai Song + 1 more

  • Research Article
  • 10.1109/access.2020.2996689
A General Framework for Depth Compression and Multi-Sensor Fusion in Asymmetric View-Plus-Depth 3D Representation
  • Jan 1, 2020
  • IEEE Access
  • Mihail Georgiev + 2 more

We present a general framework which can handle different processing stages of the three-dimensional (3D) scene representation referred to as "view-plus-depth" (V+Z). The main component of the framework is the relation between the depth map and the super-pixel segmentation of the color image. We propose a hierarchical super-pixel segmentation which keeps the same boundaries across hierarchical segmentation layers. Such segmentation allows for corresponding depth segmentation, decimation, and reconstruction at varying quality, and is instrumental in tasks such as depth compression and 3D data fusion. For the latter we utilize a cross-modality reconstruction filter which adapts to the size of the refining super-pixel segments. We also propose a novel depth encoding scheme, which includes a specific arithmetic encoder and handles misalignment outliers. We demonstrate that our scheme is especially applicable to low-bit-rate depth encoding and to fusing color data with depth data that is noisy and of lower spatial resolution.
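
As a crude stand-in for the segmentation-driven depth decimation described above (not the paper's hierarchical scheme or its arithmetic coder), one can represent depth by a single value per color super-pixel, for example with scikit-image's SLIC:

```python
import numpy as np
from skimage.segmentation import slic

def segment_depth(color, depth, n_segments=400):
    """Decimate a depth map to one value per color super-pixel.

    color : (H, W, 3) image; depth : (H, W) aligned depth map.
    Returns the segment labels and the reconstructed depth, where each
    segment carries its median depth.
    """
    labels = slic(color, n_segments=n_segments, compactness=10)
    recon = np.zeros_like(depth)
    for seg in np.unique(labels):
        mask = labels == seg
        recon[mask] = np.median(depth[mask])  # per-segment depth value
    return labels, recon
```

Only one value per segment needs to be transmitted; varying `n_segments` trades bit rate against reconstruction quality, the lever the paper's hierarchical layers control.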

  • Conference Article
  • Cited by 4
  • 10.1109/igarss.2016.7730737
3D super resolution scene depth reconstruction based on SkySat video image sequences
  • Jul 1, 2016
  • Xue Wan + 4 more

Traditional DEM (Digital Elevation Model) generation from satellite imagery is based on wide-baseline stereo pairs or image triples. In recent years, a number of low-cost microsatellites, such as SkySat, have been launched. The high-resolution video image sequences they provide yield a large number of image frames and more flexible baseline selection, which allows us to design a 3D super-resolution scene reconstruction approach based on multiple narrow-baseline stereo pairs. The multiple disparity maps, generated using a Phase Correlation (PC) based sub-pixel stereo matching algorithm, are co-registered pixel by pixel, up-sampled, and then stacked to produce a super-resolution scene depth map. A scene depth image constructed in this way has three advantages: i) the "pixel locking" error typical of sub-pixel image matching is minimized; ii) super-resolution is achieved; and iii) the occlusion problem is mitigated by the multiple narrow-baseline stereo pairs. A 3D super-resolution scene reconstruction example is demonstrated using a SkySat video image of Usak, Western Turkey.
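
The fusion step can be sketched in a few lines: up-sample each co-registered disparity map and take a per-pixel median across the stack. This is a simplified illustration (bilinear up-sampling and a median, rather than whatever fusion the authors use):

```python
import numpy as np
from scipy.ndimage import zoom

def superres_depth(disparity_maps, scale=2):
    """Fuse co-registered sub-pixel disparity maps into one up-sampled map.

    disparity_maps : list of (H, W) maps from multiple narrow-baseline
                     pairs, assumed already co-registered pixel by pixel
    """
    up = [zoom(d, scale, order=1) for d in disparity_maps]  # bilinear upsample
    stack = np.stack(up, axis=0)
    # The per-pixel median averages out independent pixel-locking errors
    # and suppresses outliers from occluded pairs.
    return np.median(stack, axis=0)
```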

  • Conference Article
  • Cited by 3
  • 10.1145/3306307.3328193
Depth boost
  • Jul 28, 2019
  • Yamato Miyashita + 3 more

A key challenge for volumetric displays is presenting a 3D scene as if it naturally existed in physical space. However, the displayable scenes are limited because current volumetric displays do not have sufficient depth reconstruction capability to show scenes with significant depth. In this talk, we propose a dynamic depth compression method that modifies the 3D geometry of presented scenes, taking changes in the spectator's viewpoint into account, so that entire scenes fit within a smaller depth range while maintaining perceptual quality. Extensive depth compression induces a feeling of unnaturalness in viewers, but the results of an evaluation experiment using a volumetric display simulator indicated that a depth of just 10 cm was sufficient to show scenes whose original depth was about 50 m without an unacceptable feeling of unnaturalness. We applied our method to a real volumetric display and validated our findings through an additional user study. The results suggest that our method works well as a virtual extender of a volumetric display's depth reconstruction capability, enabling depth reconstruction hundreds of times larger than that of current volumetric displays.
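
The core remapping idea, squeezing a deep scene into a shallow display volume, can be illustrated with a static log-depth curve; the paper's method is dynamic and view-dependent, so this shows only the basic flavor:

```python
import numpy as np

def compress_depth(z, z_near, z_far, display_depth=0.10):
    """Map scene depths in [z_near, z_far] (metres) into a shallow display
    volume, here 10 cm. A log curve preserves nearby depth differences
    better than distant ones, matching how depth perception falls off.
    """
    z = np.clip(z, z_near, z_far)
    t = np.log(z / z_near) / np.log(z_far / z_near)  # normalized 0..1
    return t * display_depth
```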

  • Conference Article
  • Cited by 88
  • 10.1109/cvpr.2015.7299022
Holistic 3D scene understanding from a single geo-tagged image
  • Jun 1, 2015
  • Shenlong Wang + 2 more

In this paper we are interested in exploiting geographic priors to help outdoor scene understanding. Towards this goal we propose a holistic approach that reasons jointly about 3D object detection, pose estimation, semantic segmentation as well as depth reconstruction from a single image. Our approach takes advantage of large-scale crowd-sourced maps to generate dense geographic, geometric and semantic priors by rendering the 3D world. We demonstrate the effectiveness of our holistic model on the challenging KITTI dataset [13], and show significant improvements over the baselines in all metrics and tasks.

  • Conference Article
  • Cited by 129
  • 10.1109/wvrs.1995.476848
Physically-valid view synthesis by image interpolation
  • Jun 21, 1995
  • S. M. Seitz + 1 more

Image warping is a popular tool for smoothly transforming one image to another. "Morphing" techniques based on geometric image interpolation create compelling visual effects, but the validity of such transformations has not been established. In particular, does 2D interpolation of two views of the same scene produce a sequence of physically valid in-between views of that scene? We describe a simple image rectification procedure which guarantees that interpolation does in fact produce valid views, under generic assumptions about visibility and the projection process. Towards this end, it is first shown that two basis views are sufficient to predict the appearance of the scene within a specific range of new viewpoints. Second, it is demonstrated that interpolation of the rectified basis images produces exactly this range of views. Finally, it is shown that generating this range of views is a theoretically well-posed problem, requiring neither knowledge of camera positions nor 3D scene reconstruction. A scanline algorithm for view interpolation is presented that requires only four user-provided feature correspondences to produce valid orthographic views. The quality of the resulting images is demonstrated with interpolations of real imagery.
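
The scanline algorithm rests on the fact that, after rectification, corresponding points share a scanline, so an in-between view is obtained by linearly interpolating their x-coordinates. A toy sketch for one scanline (hypothetical inputs; real view morphing also handles visibility via the monotonicity of rectified correspondences):

```python
import numpy as np

def interpolate_scanline(x0, x1, colors, s, width):
    """Interpolate one rectified scanline between two basis views.

    x0, x1 : x-positions of the same scene points in view 0 and view 1
    colors : per-point colors; s in [0, 1] selects the in-between view
    """
    xs = (1.0 - s) * np.asarray(x0, float) + s * np.asarray(x1, float)
    row = np.zeros((width, 3))
    # Paint points in left-to-right order; monotonicity along rectified
    # scanlines is what keeps this ordering consistent across views.
    for x, c in sorted(zip(xs, colors), key=lambda p: p[0]):
        xi = int(round(x))
        if 0 <= xi < width:
            row[xi] = c
    return row
```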

  • Book Chapter
  • Cited by 2
  • 10.1007/11922162_74
PanoWalk: A Remote Image-Based Rendering System for Mobile Devices
  • Jan 1, 2006
  • Zhongding Jiang + 6 more

Real-time rendering of complex 3D scenes on mobile devices is a challenging task. The main reason is that mobile devices have limited computational capabilities and lack powerful 3D graphics hardware. In this paper, we propose a remote Image-Based Rendering system for mobile devices to interactively visualize real-world and synthetic scenes over a wireless network. Our system uses panoramic video as the building block for representing scene data. The scene data is compressed with an MPEG-like encoding scheme tailored for mobile devices and stored on a remote server. Our system carefully partitions the rendering task between client and server: the server determines the data required for rendering novel views and streams it to the client in a server-push manner. After receiving the data, the mobile client renders locally using image warping and displays the resulting images on its small screen. Experimental results show that our system achieves real-time rendering speed on mainstream mobile devices and allows multiple mobile clients to explore the same or different scenes simultaneously.

  • Conference Article
  • Cited by 40
  • 10.1145/1180639.1180785
Remote rendering and streaming of progressive panoramas for mobile devices
  • Oct 23, 2006
  • Azzedine Boukerche + 1 more

Providing mobile devices with virtual-environment walkthrough and real-time streaming movie playback is expected to have a profound impact on entertainment-based applications such as virtual guides, online gaming, and e-learning, to name a few. However, it is well known that rendering complex 3D scenes at interactive frame rates is extremely difficult on thin mobile devices, which lack the resources needed to process large volumes of 3D virtual-environment data. To provide virtual-environment navigation on thin mobile clients, we propose a hybrid technique that combines remote geometry rendering with streaming of warped images. In our approach, the server renders a partial panoramic view based on the user's viewpoint and last movements, warps the image coordinates into cylindrical coordinates, and streams the images to the client device, which progressively builds the panoramic representation of the scene. Furthermore, to enhance streaming performance and the quality of interaction, we use a rate-control mechanism as well as prediction of the user's movements within the virtual scene. In this paper we discuss our scheme for remote rendering and streaming of progressive panoramas for mobile devices and present the experimental results we obtained to validate the proposed technique. Our results clearly indicate that the proposed solution achieves stable frame rates and throughput over error-prone wireless channels.
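
The cylindrical warp mentioned here has a simple closed form: a cylinder pixel at angle theta and height h samples the planar image at (f·tan(theta), h/cos(theta)) about the image center. A minimal inverse-mapping sketch (assumes the field of view keeps theta inside (-pi/2, pi/2)):

```python
import numpy as np

def plane_to_cylinder(img, f):
    """Warp a planar (pinhole) image onto a cylinder of focal radius f.

    For each cylinder pixel (theta, h) we sample the planar image at
    (f * tan(theta), h / cos(theta)) relative to the image centre.
    """
    H, W = img.shape[:2]
    cx, cy = W / 2.0, H / 2.0
    out = np.zeros_like(img)
    xs, ys = np.meshgrid(np.arange(W), np.arange(H))
    theta = (xs - cx) / f
    px = np.tan(theta) * f + cx                 # planar x for each cylinder x
    py = (ys - cy) / np.cos(theta) + cy         # planar y for each cylinder y
    ok = (px >= 0) & (px < W) & (py >= 0) & (py < H)
    out[ys[ok], xs[ok]] = img[py[ok].astype(int), px[ok].astype(int)]
    return out
```

Warping on the server means adjacent partial views share one cylindrical coordinate frame, so the client can stitch them into a panorama by simple copying.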

  • Book Chapter
  • Cited by 12
  • 10.1007/978-3-540-30207-0_89
Image-Based Walkthrough over Internet on Mobile Devices
  • Jan 1, 2004
  • Yu Lei + 3 more

Real-time rendering of complex 3D scenes on mobile devices is a challenging task. The main reason is that mobile devices have limited computational capabilities and lack powerful 3D graphics hardware. In this paper, we propose an Image-Based Rendering (IBR) system for mobile devices to visualize real-world or synthetic scenes in a network environment. Our system uses a server to compute the required image segments of pre-captured panoramic video and transmit them to the client. After receiving the data, the mobile client renders using simple image warping; the rendering process needs little computational power and is insensitive to scene complexity. A rate-control scheme is designed for efficient use of network bandwidth and for handling network congestion. Pre-fetching and cache management are also employed on the client and server sides for efficient memory use and to reduce transmission requests. With this client-server architecture and local rendering scheme, interactive exploration of 3D scenes on mobile devices becomes possible. Experimental results show that our system achieves acceptable rendering speed on common mobile devices.
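
The client-side caching and pre-fetching can be sketched as a small LRU cache that fetches a few segments ahead of the one being viewed; `fetch` below is a hypothetical network call, and the paper's rate-control scheme is omitted:

```python
from collections import OrderedDict

class SegmentCache:
    """Tiny LRU cache for panorama segments, with look-ahead prefetch."""

    def __init__(self, fetch, capacity=32, lookahead=2):
        self.fetch, self.capacity, self.lookahead = fetch, capacity, lookahead
        self.store = OrderedDict()

    def get(self, seg_id: int):
        # Fetch the requested segment plus a few ahead of it, so that
        # panning the view rarely blocks on the network.
        for sid in range(seg_id, seg_id + self.lookahead + 1):
            if sid not in self.store:
                self.store[sid] = self.fetch(sid)
            self.store.move_to_end(sid)          # mark as recently used
            while len(self.store) > self.capacity:
                self.store.popitem(last=False)   # evict least-recently used
        return self.store[seg_id]
```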
