dc.description.abstract | We present a new multi-modal face image generation
method that converts a text prompt and a visual input, such
as a semantic mask or scribble map, into a photo-realistic
face image. To do this, we combine the strengths of Generative Adversarial Networks (GANs) and diffusion models (DMs) by mapping the multi-modal features of the DM into the latent space of pre-trained GANs. We present a simple mapping network and a style modulation network that link the two models and convert meaningful representations in the feature maps and attention maps into latent codes. Through GAN inversion, the estimated latent codes can be used to generate 2D or 3D-aware facial images. We further present a multi-step training strategy that reflects the textual and structural representations in the generated image. Our proposed network produces realistic 2D, multi-view, and stylized face images that align well with the inputs. We validate our method using pre-trained 2D and 3D GANs and show that it outperforms existing methods. Our project page is available at https://github.com/1211sh/Diffusiondriven_GAN-Inversion/. | |
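To make the described pipeline concrete, below is a minimal PyTorch sketch of the core idea: a mapping network that projects diffusion-model feature maps into a pre-trained GAN's latent space, whose output latents would then drive a frozen 2D or 3D-aware generator. The class name, tensor shapes, and the two-layer MLP design are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch: map DM UNet features to GAN latent codes.
# All names/shapes are hypothetical; the paper's actual mapping and
# style modulation networks may differ.
import torch
import torch.nn as nn

class DMToGANMapper(nn.Module):
    def __init__(self, dm_feat_dim: int, num_latents: int = 14, latent_dim: int = 512):
        super().__init__()
        # Pool the spatial DM feature map, then map it to a stack of
        # latent codes (e.g., a W+ code with one vector per GAN layer).
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.mlp = nn.Sequential(
            nn.Linear(dm_feat_dim, 1024),
            nn.LeakyReLU(0.2),
            nn.Linear(1024, num_latents * latent_dim),
        )
        self.num_latents = num_latents
        self.latent_dim = latent_dim

    def forward(self, dm_features: torch.Tensor) -> torch.Tensor:
        # dm_features: (B, C, H, W) feature map taken from the DM's UNet.
        x = self.pool(dm_features).flatten(1)   # (B, C)
        w_plus = self.mlp(x)                    # (B, num_latents * latent_dim)
        return w_plus.view(-1, self.num_latents, self.latent_dim)

# Usage: the resulting latents would be fed to a frozen, pre-trained
# GAN generator (2D or 3D-aware) to synthesize the face image.
mapper = DMToGANMapper(dm_feat_dim=1280)
fake_dm_feats = torch.randn(2, 1280, 16, 16)    # placeholder DM features
w_plus = mapper(fake_dm_feats)                  # (2, 14, 512)
```

In this reading, GAN inversion supplies the supervision signal: latents predicted from the DM features are trained so that the frozen generator reproduces images consistent with the text and structural inputs.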