Abstract

The image-based virtual try-on network transfers a target garment onto the corresponding region of the human body. Because of its commercial value in online garment shopping, it has attracted extensive attention from researchers. However, previous virtual try-on methods suffer heavy interference from the garment worn in the reference image, so they struggle to preserve the details of the upper limbs, the neck, and the given garment. Therefore, a novel High Fidelity Virtual Try-on Network via Semantic Adaptation (VTON-HF) is proposed to generate results with better detail. The main processing steps are as follows: 1) a Thin Plate Spline (TPS) transformation coarsely warps the target garment, 2) a parsing network generates a target semantic map conditioned on the coarsely warped garment, 3) our novel Semantic Map-based Image Adjustment Network (SMIAN) generates components separately to avoid interference between image parts with different semantics, and 4) SMIAN fuses all components to produce the final result. VTON-HF retains more detail from the reference garment than previous methods. Our architecture generates the desired result by fusing the separately generated components (garment, upper limbs, and neck) with the unchanged parts of the reference image. Moreover, SMIAN incorporates a priori multimodal information in its tail layer, which effectively improves the convergence efficiency of the network. Our method achieves state-of-the-art quantitative results on IS, SSIM, PSNR, and FID on the VITON dataset (see Fig. 1).
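The four-stage pipeline above can be summarized as a single forward pass. The following is a minimal PyTorch sketch of that dataflow; `warp_net`, `parse_net`, and `smian` are hypothetical placeholder modules (the abstract does not specify their architectures), so this illustrates the composition of the stages rather than the authors' implementation.

```python
import torch.nn as nn
import torch.nn.functional as F

class TryOnPipeline(nn.Module):
    """Illustrative dataflow for the four stages described in the
    abstract. `warp_net`, `parse_net`, and `smian` are hypothetical
    stand-ins, not the authors' released modules."""

    def __init__(self, warp_net, parse_net, smian):
        super().__init__()
        self.warp_net = warp_net    # regresses a TPS-style sampling grid
        self.parse_net = parse_net  # predicts the target semantic map
        self.smian = smian          # per-component generator (SMIAN)

    def forward(self, person, garment):
        # 1) Coarse TPS warp of the target garment onto the body.
        grid = self.warp_net(person, garment)   # (B, H, W, 2), values in [-1, 1]
        warped = F.grid_sample(garment, grid, align_corners=True)

        # 2) Target semantic map conditioned on the coarse warp.
        masks = F.softmax(self.parse_net(person, warped), dim=1)  # (B, K, H, W)

        # 3) Generate each semantic component separately so regions with
        #    different semantics do not interfere; unchanged regions are
        #    taken directly from the reference image.
        parts = self.smian(person, warped, masks)  # list of K images

        # 4) Fuse all components with the soft semantic masks.
        return sum(m * p for m, p in zip(masks.split(1, dim=1), parts))
```

In this sketch the semantic masks are the only interface between the parsing stage and the generator, which is what lets the garment, upper-limb, and neck components be synthesized and then blended without cross-talk between regions.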
