Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction
A method to repurpose video diffusion models for monocular 3D reconstruction of dynamic scenes.
MSCA Postdoctoral Fellowship (UKRI Horizon Europe)
The grant was awarded on 13/02/2024 (£206,085.62); it started on 01/06/2024 and will end on 01/06/2026.
The aim of SYN3D is to enable the effortless creation of novel-view synthesis and 3D reconstruction from limited views (from 0 to 5) in a feed-forward manner. In general, so few input views do not contain sufficient information for complete 3D synthesis. However, our scientific hypothesis is that 3D synthesis from limited views is still possible by learning, from billions of images, the capability to hallucinate unobserved parts of 3D objects and scenes, thus obtaining plausible, if not faithful, reconstructions.
To achieve this aim, the project focuses on the following key objectives:
MVSplat, MVSplat-360: Feed-forward 3D scene synthesis from limited views (even for 360-degree scenes)
Free3D: Feed-forward 3D object synthesis from a single image for open-set categories
A feed-forward approach that increases the likelihood that a 3D generator directly outputs stable 3D objects.
A model for complete 3D reconstruction from partially visible inputs.
An interactive video generative model that can serve as a motion prior for part-level dynamics.
A fast, highly efficient model, trainable on a single GPU in one day, for 3D scene reconstruction from a single image.
A feed-forward model that reconstructs 3D structure and appearance from uncalibrated images.
A feed-forward approach for 360-degree scene-level novel view synthesis using only sparse observations.
A feed-forward approach for efficiently predicting 3D Gaussians from sparse multi-view images in a single forward pass.
A vision model of physical interaction with objects, enabling part-level dragging.
A self-organizing 3D segmentation/decomposition model based on a neural implicit surface representation.
A method to synthesize consistent novel views from a single image on open-set categories without the need for explicit 3D representations.