NVS-Adapter:
Plug-and-Play Novel View Synthesis from a Single Image

ECCV 2024


Yoonwoo Jeong*,1, Jinwoo Lee*,2, Chiheon Kim3, Minsu Cho1, Doyup Lee3

*Equal Contribution
1POSTECH    2Cinamon     3Runway    

Announcement

Our paper has been updated to the ECCV format. Please check out the latest version!

TL;DR

Our NVS-Adapter generates novel views from a single reference image while preserving the generation capacity of the pretrained T2I model.




Abstract

Transfer learning of large-scale Text-to-Image (T2I) models has recently shown impressive potential for Novel View Synthesis (NVS) of diverse objects from a single image. While previous methods typically train large models on multi-view datasets for NVS, fine-tuning all the parameters of a T2I model not only incurs a high cost but also reduces the model's capacity to generalize to diverse images in a new domain. In this study, we propose an effective method, dubbed NVS-Adapter, which is a plug-and-play module for a T2I model that synthesizes novel multi-views of visual objects while fully exploiting the generalization capacity of T2I models. NVS-Adapter consists of two main components: view-consistency cross-attention learns the visual correspondences to align the local details of view features, and global semantic conditioning aligns the semantic structure of generated views with the reference view. Experimental results demonstrate that the NVS-Adapter can effectively synthesize geometrically consistent multi-views and also achieve high performance on benchmarks without full fine-tuning of T2I models.
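To make the idea of cross-attention between views concrete, here is a minimal sketch in NumPy, not the authors' implementation. It shows tokens of a target view attending to tokens of the reference view, with the result added back as a residual so the frozen backbone features pass through. Everything here is an assumption for illustration: the function name `view_cross_attention`, the single attention head, the weight names, and in particular the zero-initialized output projection, which is a common adapter trick that we have not confirmed the paper uses.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def view_cross_attention(target_tokens, reference_tokens, w_q, w_k, w_v, w_out):
    """Hypothetical single-head cross-attention from target-view tokens
    (queries) to reference-view tokens (keys and values).

    target_tokens:    (N_t, d) features of the view being generated
    reference_tokens: (N_r, d) features of the reference view
    Returns the residual-updated target tokens and the attention map.
    """
    q = target_tokens @ w_q
    k = reference_tokens @ w_k
    v = reference_tokens @ w_v
    # Each target token scores every reference token (soft correspondence).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)              # (N_t, N_r), rows sum to 1
    update = (attn @ v) @ w_out
    # Residual connection: the backbone features are kept, the adapter adds to them.
    return target_tokens + update, attn


rng = np.random.default_rng(0)
dim = 8
target = rng.normal(size=(4, dim))               # 4 target-view tokens
reference = rng.normal(size=(6, dim))            # 6 reference-view tokens
w_q = rng.normal(size=(dim, dim)) * 0.1
w_k = rng.normal(size=(dim, dim)) * 0.1
w_v = rng.normal(size=(dim, dim)) * 0.1
w_out = np.zeros((dim, dim))                     # assumption: zero-init output projection

out, attn = view_cross_attention(target, reference, w_q, w_k, w_v, w_out)
```

With the output projection at zero, `out` equals `target`, so in this sketch an untrained adapter leaves the backbone features unchanged. Each row of `attn` is a distribution over reference tokens, which is the sense in which cross-attention can express correspondences between views.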



Compatibility with ControlNet

Our NVS-Adapter is fully compatible with ControlNets without any additional training.




Compatibility with LoRA

Our NVS-Adapter is also fully compatible with LoRA modules without any additional training.




Novel View Synthesis Examples




Results on the images generated by SD




Image to 3D Model with Score Distillation Sampling



Text to Image to 3D Model with Score Distillation Sampling



Citation


@inproceedings{jeong2024nvsadapter,
    title={NVS-Adapter: Plug-and-Play Novel View Synthesis from a Single Image},
    booktitle={Computer Vision -- ECCV 2024},
    publisher={Springer},
    author={Yoonwoo Jeong and Jinwoo Lee and Chiheon Kim and Minsu Cho and Doyup Lee},
    year={2024}
}