MILG: Realistic lip-sync video generation with audio-modulated image inpainting

Han Bao, Xuhong Zhang, Qinying Wang, Kangming Liang, Zonghui Wang, Shouling Ji, Wenzhi Chen

doi:10.1016/j.visinf.2024.08.002

Open Access

https://doi.org/10.1016/j.visinf.2024.08.002

Journal: Visual Informatics
Publication Date: Sep 1, 2024
License type: cc-by-nc-nd

Abstract

Existing lip synchronization (lip-sync) methods generate accurately synchronized mouths and faces in a generated video. However, they still suffer from artifacts in regions of non-interest (RONI), e.g., the background and other parts of the face, which degrade overall visual quality. To solve this problem, we introduce diverse image inpainting to lip-sync generation. We propose the Modulated Inpainting Lip-sync GAN (MILG), an audio-constrained inpainting network that predicts synchronized mouths. MILG uses prior knowledge of the RONI and the audio sequence to predict lip shape rather than generating the whole image, which keeps the RONI consistent. Specifically, we integrate modulated spatially probabilistic diversity normalization (MSPD Norm) into our inpainting network, which helps the network generate fine-grained, diverse mouth movements guided by continuous audio features. Furthermore, to lower the training overhead, we modify the contrastive loss used in lip-sync to support training with small batch sizes and few samples. Extensive experiments demonstrate that our approach outperforms existing state-of-the-art methods in image quality and authenticity while maintaining lip-sync.
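
For readers unfamiliar with the contrastive loss mentioned above, the following is a minimal sketch of the classic pairwise margin-based form often used to train audio-visual sync models. It is not the modified loss proposed in the paper, whose details are not given in this abstract. The function names, the plain-list embeddings, and the default `margin` value are illustrative assumptions.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(audio_emb, video_emb, in_sync, margin=1.0):
    """Classic pairwise contrastive loss (generic form, NOT the paper's variant).

    In-sync pairs are pulled together (loss = d^2); out-of-sync pairs are
    pushed apart until they are at least `margin` away (loss = max(0, m - d)^2).
    """
    d = euclidean_distance(audio_emb, video_emb)
    if in_sync:
        return d ** 2
    return max(0.0, margin - d) ** 2


def batch_contrastive_loss(pairs, margin=1.0):
    """Mean loss over a batch of (audio_emb, video_emb, in_sync) triples."""
    losses = [contrastive_loss(a, v, s, margin) for a, v, s in pairs]
    return sum(losses) / len(losses)
```

In this generic form each batch must contain both in-sync and out-of-sync pairs for the margin term to matter, which is one reason such losses are usually trained with many samples; the abstract states that MILG modifies the loss to work with small batches and few samples.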
