An evaluation of SVBRDF Prediction from Generative Image Models for Appearance Modeling of 3D Scenes

Eurographics Symposium on Rendering 2025 (Symposium Track)

Alban Gauthier, Valentin Deschaintre, Alexandre Lanvin, Fredo Durand, Adrien Bousseau, George Drettakis

Inria & Université Côte d'Azur · Adobe Research · MIT

Abstract

Digital content creation is experiencing a profound change with the advent of deep generative models. For texturing, conditional image generators now allow the synthesis of realistic RGB images of a 3D scene that align with the geometry of that scene. For appearance modeling, SVBRDF prediction networks recover material parameters from RGB images. Combining these technologies allows us to quickly generate SVBRDF maps for multiple views of a 3D scene, which can be merged to form an SVBRDF texture atlas of that scene. In this paper, we analyze the challenges and opportunities for SVBRDF prediction in the context of such a fast appearance-modeling pipeline. On the one hand, single-view SVBRDF predictions might suffer from multi-view incoherence and yield inconsistent texture atlases. On the other hand, generated RGB images, and the different modalities on which they are conditioned, can provide additional information for SVBRDF estimation compared to photographs. We compare neural architectures and conditions to identify designs that achieve high accuracy and coherence. We find that, surprisingly, a standard UNet is competitive with more complex designs.
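To make the pipeline above concrete, here is a minimal Python sketch of the three stages it describes: geometry-conditioned image generation, per-view SVBRDF prediction, and merging the per-view maps into a texture atlas. All function names are hypothetical placeholders stubbed with random data for illustration; they are not the paper's actual components.

    import numpy as np

    # Hypothetical stand-ins for the three pipeline stages; a real system
    # would use a conditional image generator, an SVBRDF prediction
    # network, and a UV unprojection/rasterization step.

    def generate_view(geometry_buffers):
        """Conditional generator: geometry-aligned buffers -> RGB image."""
        h, w = geometry_buffers["normals"].shape[:2]
        return np.random.rand(h, w, 3)  # placeholder RGB

    def predict_svbrdf(rgb, conditions):
        """SVBRDF prediction network: RGB (+ conditions) -> material maps."""
        h, w = rgb.shape[:2]
        return {"albedo": np.random.rand(h, w, 3),
                "roughness": np.random.rand(h, w, 1)}

    def splat_to_atlas(maps, atlas, weight):
        """Accumulate per-view maps into the shared UV atlas as a weighted
        average, so regions seen from several views are blended. A real
        implementation would reproject through each view's UV mapping;
        here maps and atlas are assumed to share a resolution."""
        for name, m in maps.items():
            atlas[name] += weight * m
        return atlas

    H = W = 256
    views = [{"normals": np.random.rand(H, W, 3)} for _ in range(4)]
    atlas = {"albedo": np.zeros((H, W, 3)), "roughness": np.zeros((H, W, 1))}

    for view in views:
        rgb = generate_view(view)            # 1. generate geometry-aligned RGB
        maps = predict_svbrdf(rgb, view)     # 2. predict per-view SVBRDF maps
        atlas = splat_to_atlas(maps, atlas,  # 3. blend into the texture atlas
                               weight=1.0 / len(views))

Uniform blend weights keep the sketch short; in practice, view-dependent weights (e.g. favoring views that see a surface head-on) are a common way to mitigate the multi-view incoherence discussed above.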

Video

BibTeX


      @inproceedings{gauthier2025evaluation,
        booktitle = {Eurographics Symposium on Rendering},
        title     = {{An evaluation of SVBRDF Prediction from Generative Image Models for Appearance Modeling of 3D Scenes}},
        author    = {Gauthier, Alban and Deschaintre, Valentin and Lanvin, Alexandre and Durand, Fredo and Bousseau, Adrien and Drettakis, George},
        year      = {2025},
        publisher = {The Eurographics Association},
      }

Acknowledgments and Funding

This work was funded by the European Research Council (ERC) Advanced Grant NERPHYS, number 101141721 (https://project.inria.fr/nerphys). The authors are grateful to the OPAL infrastructure of the Université Côte d'Azur for providing resources and support, as well as to Adobe and NVIDIA for software and hardware donations. This work was granted access to the HPC resources of IDRIS under the allocation AD011015561 made by GENCI. F. Durand acknowledges funding from Google, Amazon, and MIT-GIST.