TY - GEN
T1 - Towards Realistic Landmark-Guided Facial Video Inpainting Based on GANs
AU - Ghorbani Lohesara, Fatemeh
AU - Eguiazarian, Karen
AU - Knorr, Sebastian
N1 - Publisher Copyright:
© 2024, Society for Imaging Science and Technology.
PY - 2024
Y1 - 2024
N2 - Facial video inpainting plays a crucial role in a wide range of applications, including the removal of obstructions in video conferencing and telemedicine, enhancement of facial expression analysis, privacy protection, integration of graphical overlays, and virtual makeup. The task is challenging due to the intricate nature of facial features and humans’ innate familiarity with faces, which heightens the need for accurate and convincing completions. Addressing occlusion removal in this context, we focus on generating complete images from masked facial data while ensuring both spatial and temporal coherence. We introduce a network for expression-based video inpainting that employs generative adversarial networks (GANs) to handle both static and moving occlusions across all frames. By utilizing facial landmarks and an occlusion-free reference image, our model maintains the user’s identity consistently across frames. We further enhance emotional preservation through a customized facial expression recognition (FER) loss function, yielding detailed inpainted outputs. The proposed framework adaptively removes occlusions from facial videos, whether static or dynamic across frames, while producing realistic and coherent results.
AB - Facial video inpainting plays a crucial role in a wide range of applications, including the removal of obstructions in video conferencing and telemedicine, enhancement of facial expression analysis, privacy protection, integration of graphical overlays, and virtual makeup. The task is challenging due to the intricate nature of facial features and humans’ innate familiarity with faces, which heightens the need for accurate and convincing completions. Addressing occlusion removal in this context, we focus on generating complete images from masked facial data while ensuring both spatial and temporal coherence. We introduce a network for expression-based video inpainting that employs generative adversarial networks (GANs) to handle both static and moving occlusions across all frames. By utilizing facial landmarks and an occlusion-free reference image, our model maintains the user’s identity consistently across frames. We further enhance emotional preservation through a customized facial expression recognition (FER) loss function, yielding detailed inpainted outputs. The proposed framework adaptively removes occlusions from facial videos, whether static or dynamic across frames, while producing realistic and coherent results.
U2 - 10.2352/EI.2024.36.10.IPAS-246
DO - 10.2352/EI.2024.36.10.IPAS-246
M3 - Conference contribution
AN - SCOPUS:85196520606
T3 - IS&T International Symposium on Electronic Imaging Science and Technology
SP - 246-1 - 246-6
BT - IS&T International Symposium on Electronic Imaging 2024
PB - Society for Imaging Science and Technology
T2 - IS&T International Symposium on Electronic Imaging
Y2 - 21 January 2024 through 25 January 2024
ER -