Abstract

Virtual reality (VR) and augmented reality (AR) applications are becoming increasingly prevalent. However, reconstructing realistic 3D hands from a single RGB image, especially when two hands are interacting, remains a major challenge because of severe mutual occlusion and the enormous diversity of hand poses. In this article, we propose a disturbing graph contrastive learning strategy for two‐hand 3D reconstruction. It comprises a graph disturbance network that generates graph feature pairs to enhance the consistency of the two‐hand pose features, and a contrastive learning module that leverages these high‐quality generated features to obtain a strong feature representation. We further propose a similarity‐distinguishing method that separates positive from negative features to accelerate model convergence. Additionally, a multi‐term loss is designed to balance the hand pose, the visual scale and the viewpoint position. Our model achieves state‐of‐the‐art results on the InterHand2.6M benchmark. Ablation studies show that the model can correct unreasonable hand movements. In subjective assessments, our graph disturbance learning method significantly improves the reconstruction of realistic 3D hands, particularly for interacting hands.
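To make the pipeline above concrete, the following is a minimal sketch, not the authors' implementation. It assumes that graph disturbance can be approximated by adding Gaussian noise to node features followed by neighbour averaging, that the similarity‐distinguishing step is a cosine‐similarity threshold, and that the contrastive objective is an InfoNCE‐style loss. The function names (`perturb`, `split_by_similarity`, `contrastive_loss`) and the parameters `noise`, `threshold` and `temperature` are hypothetical and do not come from the article.

```python
import math
import random


def perturb(features, adjacency, noise=0.05, seed=0):
    """Assumed graph disturbance: jitter node features, then average over neighbours."""
    rng = random.Random(seed)
    noisy = [[x + rng.gauss(0.0, noise) for x in row] for row in features]
    out = []
    for i, row in enumerate(noisy):
        # Each node aggregates its neighbours and itself.
        nbrs = [j for j, a in enumerate(adjacency[i]) if a] + [i]
        out.append([sum(noisy[j][d] for j in nbrs) / len(nbrs)
                    for d in range(len(row))])
    return out


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def split_by_similarity(anchor, candidates, threshold=0.8):
    """Assumed similarity rule: candidates at or above the threshold are positives."""
    positives, negatives = [], []
    for c in candidates:
        (positives if cosine(anchor, c) >= threshold else negatives).append(c)
    return positives, negatives


def contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """InfoNCE-style loss: -log(sum of positive scores / sum of all scores)."""
    if not positives:
        raise ValueError("at least one positive feature is required")
    pos = sum(math.exp(cosine(anchor, p) / temperature) for p in positives)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))
```

In this sketch the loss shrinks as the anchor moves closer to its positives and further from its negatives; the multi‐term loss over pose, scale and viewpoint mentioned above is not modelled here.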
