Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction
A method to repurpose video diffusion models for monocular 3D reconstruction of dynamic scenes.
MSCA Postdoctoral Fellowship (UKRI Horizon Europe)
The grant was awarded on 13/02/2024 (£206,085.62); it started on 01/06/2024 and will end on 01/06/2026.
The aim of SYN3D is to enable the effortless creation of novel-view synthesis and 3D reconstruction from limited views (from 0 to 5) in a feed-forward manner. In general, so few input views do not contain sufficient information for complete 3D synthesis. However, our scientific hypothesis is that 3D synthesis from limited views is still possible by learning, from billions of images, the capability to hallucinate unobserved parts of 3D objects and scenes, thus obtaining plausible, if not faithful, reconstructions.
To achieve this aim, the project focuses on the following key objectives:
MVSplat, MVSplat-360: Feed-forward 3D scene synthesis from limited views (even for 360-degree scenes)
Free3D: Feed-forward 3D object synthesis from a single image for open-set categories
A feed-forward approach that increases the likelihood that a 3D generator directly outputs stable 3D objects.
A model for complete 3D reconstruction from partially visible inputs.
An interactive video generative model that can serve as a motion prior for part-level dynamics.
A fast, highly efficient model, trainable on a single GPU in one day, for 3D scene reconstruction from a single image.
A feed-forward model that reconstructs 3D structure and appearance from uncalibrated images.
A feed-forward approach for 360-degree scene-level novel view synthesis using only sparse observations.
A feed-forward approach for efficiently predicting 3D Gaussians from sparse multi-view images in a single forward pass.
A vision model of physical interaction with objects, enabling part-level dragging.
A self-organizing 3D segmentation/decomposition model based on a neural implicit surface representation.
A method to synthesize consistent novel views from a single image on open-set categories without the need for explicit 3D representations.