Abstract

This work introduces a perspective-corrected video see-through mixed-reality head-mounted display with edge-preserving occlusion and low-latency capabilities. To realize the consistent spatial and temporal composition of a captured real world containing virtual objects, we perform three essential tasks: 1) to reconstruct captured images so as to match the user's view; 2) to occlude virtual objects with nearer real objects, to provide users with correct depth cues; and 3) to reproject the virtual and captured scenes to be matched and to keep up with users' head motions. Captured image reconstruction and occlusion-mask generation require dense and accurate depth maps. However, estimating these maps is computationally difficult, which results in longer latencies. To obtain an acceptable balance between spatial consistency and low latency, we rapidly generated depth maps by focusing on edge smoothness and disocclusion (instead of fully accurate maps), to shorten the processing time. Our algorithm refines edges via a hybrid method involving infrared masks and color-guided filters, and it fills disocclusions using temporally cached depth maps. Our system combines these algorithms in a two-phase temporal warping architecture based upon synchronized camera pairs and displays. The first phase of warping is to reduce registration errors between the virtual and captured scenes. The second is to present virtual and captured scenes that correspond with the user's head motion. We implemented these methods on our wearable prototype and performed end-to-end measurements of its accuracy and latency. We achieved an acceptable latency due to head motion (less than 4 ms) and spatial accuracy (less than 0.1° in size and less than 0.3° in position) in our test environment. We anticipate that this work will help improve the realism of mixed reality systems.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call