FreeStyleGAN: Free-view Editable Portrait Rendering with the Camera Manifold

News

March 2021: Our codebase has been upgraded! It now also contains:

COLMAP support for camera calibration and geometry reconstruction,
OpenGL-based training on headless machines,
Support for Windows and Linux,
Minor fixes and improvements.

Abstract

Current Generative Adversarial Networks (GANs) produce photorealistic renderings of portrait images. Embedding real images into the latent space of such models enables high-level image editing. While recent methods provide considerable semantic control over the (re-)generated images, they can only generate a limited set of viewpoints and cannot explicitly control the camera. Such 3D camera control is required for 3D virtual and mixed reality applications.

In our solution, we use a few images of a face to perform 3D reconstruction, and we introduce the notion of the GAN camera manifold, the key element allowing us to precisely define the range of images that the GAN can reproduce in a stable manner. We train a small face-specific neural implicit representation network to map a captured face to this manifold and complement it with a warping scheme to obtain free-viewpoint novel-view synthesis. We show how our approach, thanks to its precise camera control, enables the integration of a pre-trained StyleGAN into standard 3D rendering pipelines, allowing, for example, stereo rendering or consistent insertion of faces in synthetic 3D environments. Our solution provides the first truly free-viewpoint rendering of realistic faces at interactive rates, using only a small number of casual photos as input, while simultaneously supporting semantic edits such as changes to facial expression or lighting.

Video

Method

Overview of our method.

Observing that the StyleGAN portrait model can only synthesize a limited range of views, we define a camera manifold which models the corresponding subspace of camera parameters (right). Images from cameras on the manifold (top left) can be generated using latent manipulations. To move away from the manifold, we render a flow field (bottom left) to warp the manifold view, obtaining free-view rendering (center). The flow field is designed to be parallax-free, as perspective effects are already generated in the manifold view.
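The warping step can be illustrated with a generic backward warp. The sketch below is not the authors' implementation: the function name `warp_with_flow`, the flow convention (per-pixel offsets in pixels, pointing from an output pixel to its source location), and the border clamping are all assumptions chosen for illustration. It only shows how a rendered flow field could resample a manifold view with bilinear interpolation.

```python
import numpy as np

def warp_with_flow(image, flow):
    """Backward-warp `image` (H, W, C) with a per-pixel `flow` (H, W, 2).

    Assumed convention: flow[y, x] = (dx, dy) is the offset, in pixels, from
    output pixel (x, y) to the location in `image` that should be sampled.
    """
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Source coordinates, clamped to the image border.
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners and fractional weights for bilinear interpolation.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    fx = (sx - x0)[..., None]
    fy = (sy - y0)[..., None]

    top = (1 - fx) * image[y0, x0] + fx * image[y0, x1]
    bottom = (1 - fx) * image[y1, x0] + fx * image[y1, x1]
    return (1 - fy) * top + fy * bottom
```

A zero flow returns the manifold view unchanged, so the warp only contributes the part of the camera motion that the manifold view does not already cover.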

Free-view Camera Control

Our StyleGAN portrait renderings are generated using physically meaningful cameras. They can therefore be combined with other rendering techniques – path tracing in this case.
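Because the renderings are tied to an explicit camera, the same camera description can in principle be handed to any other renderer. The following is a minimal sketch of a conventional look-at view matrix and a parallel stereo pair derived from it. The function names, the OpenGL-style convention (camera looking down the negative Z axis), and the default 6.5 cm baseline are assumptions for illustration and are not taken from the FreeStyleGAN codebase.

```python
import numpy as np

def look_at(eye, target, up=(0.0, 1.0, 0.0)):
    """World-to-camera 4x4 view matrix (assumed OpenGL convention: camera looks down -Z)."""
    eye = np.asarray(eye, dtype=float)
    forward = np.asarray(target, dtype=float) - eye
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, np.asarray(up, dtype=float))
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)

    view = np.eye(4)
    view[0, :3] = right
    view[1, :3] = true_up
    view[2, :3] = -forward
    # Translation that moves the eye position to the camera-space origin.
    view[:3, 3] = -view[:3, :3] @ eye
    return view

def stereo_pair(eye, target, baseline=0.065):
    """Left/right view matrices, shifted along the camera's right axis (parallel setup)."""
    eye = np.asarray(eye, dtype=float)
    target = np.asarray(target, dtype=float)
    right = look_at(eye, target)[0, :3]
    half = 0.5 * baseline * right
    # Shifting eye and target together keeps both viewing directions parallel.
    return look_at(eye - half, target - half), look_at(eye + half, target + half)
```

Rendering the portrait and the synthetic scene with the same pair of matrices is what keeps the two consistent in each eye.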

View-consistent Editing

We inherit all semantic editing capabilities from StyleGAN and use the method of Härkönen et al. [2020] to demonstrate view-consistent portrait manipulations.
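The method of Härkönen et al. finds edit directions by applying principal component analysis to sampled intermediate latent vectors and then moving a latent along a chosen component. The sketch below illustrates that idea only. The latents are random stand-ins rather than outputs of a real StyleGAN mapping network, the function names are hypothetical, and how such an edit is combined with the camera control in FreeStyleGAN is not shown.

```python
import numpy as np

def principal_directions(latents, k):
    """Return the top-k PCA directions (k, D) and the mean (D,) of `latents` (N, D)."""
    mean = latents.mean(axis=0)
    # Rows of vt are unit-length principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(latents - mean, full_matrices=False)
    return vt[:k], mean

def edit_latent(w, direction, strength):
    """Move latent `w` along a unit `direction` by `strength`."""
    return w + strength * direction

rng = np.random.default_rng(0)
# Stand-in for latents sampled from a generator's mapping network (assumption).
scales = np.linspace(3.0, 0.1, 16)
latents = rng.normal(size=(2000, 16)) * scales

directions, mean = principal_directions(latents, k=3)
w = latents[0]
w_edited = edit_latent(w, directions[0], strength=2.0)
```

Since the edit acts on the latent code rather than on a rendered image, the same edited code can be reused for every viewpoint.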

BibTeX

@article{FreeStyleGAN2021,
	author    = {Thomas Leimk{\"u}hler and George Drettakis},
	title     = {FreeStyleGAN: Free-view Editable Portrait Rendering with the Camera Manifold},
	journal   = {ACM Transactions on Graphics (SIGGRAPH Asia)},
	publisher = {ACM},
	volume    = {40},
	number    = {6},
	year      = {2021},
	doi       = {10.1145/3478513.3480538}
}

Acknowledgments and Funding

This research was funded by the ERC Advanced grant FUNGRAPH No 788065. The authors are grateful to the OPAL infrastructure from Université Côte d'Azur for providing resources and support. The authors thank Ayush Tewari, Ohad Fried, and Siddhant Prakash for help with comparisons, Adrien Bousseau, Ayush Tewari, Julien Philip, Miika Aittala, and Stavros Diolatzis for proofreading earlier drafts, the anonymous reviewers for their valuable feedback, and all participants who helped capture the face datasets.