Abstract

Virtual fitting, in which a person’s image is changed to show an arbitrary clothing item, is expected to find applications in shopping sites and videoconferencing. For real-time virtual fitting, image-based methods using knowledge distillation can generate high-quality fitting images from only an image of arbitrary clothing and an image of a person, without requiring additional data such as pose information. However, few studies have addressed fast, stable virtual fitting from arbitrary clothing images on real person images with temporal consistency, as needed in situations such as videoconferencing. The purpose of this demo is therefore to perform robust, temporally consistent virtual fitting for videoconferencing. First, we built a virtual fitting system and examined how well an existing fast image-based fitting method performs on webcam video. The results showed that the existing method neither adapts its training data to this setting nor considers temporal consistency, and is therefore unstable on videoconference-like input. We therefore propose training a model on a dataset adjusted to resemble videoconference imagery and adding a temporal consistency loss. Qualitative evaluation confirms that the proposed model exhibits less flicker than the baseline. Figure 1 shows an example of our try-on system running on Zoom.
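To make the temporal consistency loss mentioned above concrete, the following is a minimal sketch of one common formulation: penalizing the distance between the current output frame and the previous output frame warped forward by optical flow. This is an illustrative assumption, not the paper's published implementation; the PyTorch framing, the function names, the flow-based warping, and the L1 distance are all our own choices for exposition.

```python
import torch
import torch.nn.functional as F


def warp(frame: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Backward-warp `frame` (B, C, H, W) by optical `flow` (B, 2, H, W),
    where flow[:, 0] is the horizontal and flow[:, 1] the vertical
    displacement in pixels."""
    b, _, h, w = frame.shape
    # Base sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(h, device=frame.device, dtype=frame.dtype),
        torch.arange(w, device=frame.device, dtype=frame.dtype),
        indexing="ij",
    )
    x = xs.unsqueeze(0) + flow[:, 0]  # displaced x coordinates
    y = ys.unsqueeze(0) + flow[:, 1]  # displaced y coordinates
    # Normalize coordinates to [-1, 1] as required by grid_sample.
    grid = torch.stack(
        (2.0 * x / (w - 1) - 1.0, 2.0 * y / (h - 1) - 1.0), dim=-1
    )
    return F.grid_sample(frame, grid, align_corners=True)


def temporal_consistency_loss(
    prev_out: torch.Tensor, curr_out: torch.Tensor, flow: torch.Tensor
) -> torch.Tensor:
    """L1 distance between the current generated frame and the previous
    generated frame warped to the current time step; small values mean
    less frame-to-frame flicker."""
    return F.l1_loss(curr_out, warp(prev_out, flow))
```

In practice such a term is typically added to the generator's training objective with a weighting coefficient, so the model trades off per-frame fidelity against frame-to-frame stability.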
