RGBD-Net: Predicting Color and Depth Images for Novel Views Synthesis

Phong Nguyen, Animesh Karnewar, Lam Huynh, Esa Rahtu, Jiri Matas, Janne Heikkilä

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review



We propose a new cascaded architecture for novel view synthesis, called RGBD-Net, which consists of two core components: a hierarchical depth regression network and a depth-aware generator network. The former predicts depth maps of the target views using adaptive depth scaling, while the latter leverages the predicted depths to render spatially and temporally consistent target images. In the experimental evaluation on standard datasets, RGBD-Net not only outperforms the state of the art by a clear margin but also generalizes well to new scenes without per-scene optimization. Moreover, we show that RGBD-Net can optionally be trained without depth supervision while still retaining high-quality rendering. Thanks to the depth regression network, RGBD-Net can also be used to create dense 3D point clouds that are more accurate than those produced by some state-of-the-art multi-view stereo methods.
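As a rough illustration of how a predicted depth map can be turned into a 3D point cloud, the sketch below back-projects each pixel through a standard pinhole camera model. It is not taken from the paper: the function name `backproject_depth`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the toy depth values are all assumptions made for the example, and the actual RGBD-Net pipeline may differ.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def backproject_depth(
    depth: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Lift every pixel (u, v) with a valid depth z to a camera-space point.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy); these names are hypothetical, not from the paper.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                # Non-positive depth is treated as "no measurement" and skipped.
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map with one invalid pixel (illustrative values only).
depth_map = [[2.0, 2.0], [0.0, 4.0]]
cloud = backproject_depth(depth_map, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Points from several views would additionally need to be transformed by each camera's pose into a common world frame before being merged into one dense cloud.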
Original language: English
Title of host publication: 2021 International Conference on 3D Vision (3DV)
Number of pages: 11
ISBN (Electronic): 978-1-6654-2688-6
Publication status: Published - 2021
Publication type: A4 Article in conference proceedings
Event: International Conference on 3D Vision - London, United Kingdom
Duration: 1 Dec 2021 to 3 Dec 2021

Publication series

ISSN (Electronic): 2475-7888


Conference: International Conference on 3D Vision
Country/Territory: United Kingdom


  • Point cloud compression
  • Three-dimensional displays
  • Adaptive systems
  • Image color analysis
  • Color
  • Rendering (computer graphics)
  • Cameras
  • novel view synthesis
  • rgbd
  • 3d reconstruction
  • neural rendering

Publication forum classification

  • Publication forum level 1


