3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors (NeurIPS 2024 Spotlight)

Abstract

Novel-view synthesis aims to generate novel views of a scene from multiple input images or videos, and recent advances such as 3D Gaussian splatting (3DGS) have achieved notable success in producing photorealistic renderings with efficient pipelines. However, generating high-quality novel views under challenging settings, such as sparse input views, remains difficult because under-sampled areas carry insufficient information, which often results in noticeable artifacts. This paper presents 3DGS-Enhancer, a novel pipeline for enhancing the quality of 3DGS representations. We leverage 2D video diffusion priors to address the challenging 3D view-consistency problem, reformulating it as achieving temporal consistency within a video generation process. 3DGS-Enhancer restores view-consistent latent features of rendered novel views and integrates them with the input views through a spatial-temporal decoder. The enhanced views are then used to fine-tune the initial 3DGS model, significantly improving its rendering performance. We conduct extensive experiments on large-scale datasets of unbounded scenes.
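The three stages described in the abstract (render novel views, enhance them as a video sequence, fine-tune the 3DGS model) can be laid out as a control-flow sketch. This is only an illustration under stated assumptions, not the authors' implementation: every function name, the dictionary used as a "model", and the blending used in place of the video diffusion model and spatial-temporal decoder are hypothetical placeholders.

```python
import numpy as np

def render_novel_views(model, num_views, height, width):
    """Hypothetical stand-in for rendering novel views from the initial 3DGS model.
    Returns an array of shape (num_views, height, width, 3) with values in [0, 1]."""
    rng = np.random.default_rng(model["seed"])
    return rng.random((num_views, height, width, 3))

def enhance_as_video(rendered, input_views, strength=0.5):
    """Hypothetical stand-in for the enhancement stage.
    The rendered views are treated as an ordered frame sequence and every frame is
    pulled toward a reference built from the input views. A real system would run
    a video diffusion model here; this blend only mimics the data flow."""
    reference = input_views.mean(axis=0, keepdims=True)
    return (1.0 - strength) * rendered + strength * reference

def fine_tune(model, input_views, enhanced_views):
    """Hypothetical stand-in for fine-tuning: records how many views would
    supervise the optimisation, without performing any optimisation."""
    updated = dict(model)
    updated["num_training_views"] = len(input_views) + len(enhanced_views)
    return updated

def enhance_3dgs(model, input_views, num_novel_views=16):
    """Run the three stages in order and return the updated model and enhanced views."""
    height, width = input_views.shape[1:3]
    rendered = render_novel_views(model, num_novel_views, height, width)
    enhanced = enhance_as_video(rendered, input_views)
    return fine_tune(model, input_views, enhanced), enhanced

if __name__ == "__main__":
    sparse_inputs = np.zeros((9, 8, 8, 3))  # e.g. 9 input views, tiny resolution
    new_model, views = enhance_3dgs({"seed": 0}, sparse_inputs)
    print(views.shape, new_model["num_training_views"])
```

The point of the sketch is the ordering: the enhanced views join the original sparse inputs as additional supervision for the fine-tuning step, rather than replacing them.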




3DGS-Enhancer can enhance low-quality images rendered by 3D Gaussian splatting



3DGS-Enhancer can enhance scenes represented by 3D Gaussians


3D Gaussian Splatting vs 3DGS-Enhancer



3D Gaussian Splatting (left) vs 3DGS-Enhancer (right). Each scene is trained on 9 views; the images come from the LLFF and DL3DV datasets.

Scenes: LLFF/fern, LLFF/fortress, LLFF/room, DL3DV/21a6, DL3DV/51a8, DL3DV/efdf

Method Comparison on DL3DV

