iFusion: Inverting Diffusion for Pose-Free Reconstruction from Sparse Views

3DV 2025

  • 1 National Tsing Hua University
  • 2 Microsoft
  • 3 Amazon
TL;DR: We enable pose-free reconstruction from just two views by harnessing Zero123 as a camera pose estimator.

Abstract

We present iFusion, a novel 3D object reconstruction framework that requires only two views with unknown camera poses. While single-view reconstruction yields visually appealing results, it can deviate significantly from the actual object, especially on unseen sides. Additional views improve reconstruction fidelity but require known camera poses. In practice, poses are often unavailable, and existing pose estimators fail in sparse-view scenarios. To address this, we harness a pre-trained novel view synthesis diffusion model, which embeds implicit knowledge about the geometry and appearance of diverse objects. Our strategy unfolds in three steps: (1) We "invert" the diffusion model to estimate camera pose instead of synthesizing novel views. (2) We fine-tune the diffusion model on the provided views and estimated poses, turning it into a novel view synthesizer tailored to the target object. (3) Leveraging the registered views and the fine-tuned diffusion model, we reconstruct the 3D object. Experiments demonstrate strong performance in both pose estimation and novel view synthesis. Moreover, iFusion integrates with various reconstruction methods and enhances them.


Method

Given as few as two images without poses, we estimate the camera pose by making the frozen diffusion model optimally reconstruct the input views. Based on the estimate, we efficiently fine-tune the model to customize it for synthesizing novel views of the given object with enhanced fidelity.
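The pose-estimation idea above can be illustrated with a toy sketch. This is not the iFusion implementation: the "frozen model" below is a hypothetical stand-in (`predict_noise`) that rotates sine/cosine features instead of running Zero123, the noise-schedule values in `ALPHAS` are made up, and gradients come from finite differences rather than backpropagation. It only shows the general pattern of keeping the model fixed and searching for the relative pose that minimizes the denoising error on the second view.

```python
import numpy as np

ALPHAS = np.array([0.9, 0.5, 0.1])  # assumed toy noise-schedule values


def render(abs_pose):
    """Toy 'image' of the object: interleaved (sin, cos) features per angle."""
    out = np.empty(2 * len(abs_pose))
    out[0::2] = np.sin(abs_pose)
    out[1::2] = np.cos(abs_pose)
    return out


def rotate_pairs(view, rel_pose):
    """Toy novel view synthesis: rotate each (sin, cos) pair by a relative angle."""
    s, c = view[0::2], view[1::2]
    out = np.empty_like(view)
    out[0::2] = s * np.cos(rel_pose) + c * np.sin(rel_pose)
    out[1::2] = c * np.cos(rel_pose) - s * np.sin(rel_pose)
    return out


def predict_noise(x_t, alpha, cond_view, rel_pose):
    """Hypothetical frozen denoiser conditioned on a view and a relative pose."""
    x0_hat = rotate_pairs(cond_view, rel_pose)
    return (x_t - np.sqrt(alpha) * x0_hat) / np.sqrt(1.0 - alpha)


def denoising_loss(rel_pose, cond_view, target_view, noises):
    """Mean squared noise-prediction error on the noised target view."""
    total = 0.0
    for alpha, eps in zip(ALPHAS, noises):
        x_t = np.sqrt(alpha) * target_view + np.sqrt(1.0 - alpha) * eps
        eps_hat = predict_noise(x_t, alpha, cond_view, rel_pose)
        total += np.sum((eps - eps_hat) ** 2)
    return total / len(ALPHAS)


def estimate_relative_pose(cond_view, target_view, steps=200, lr=0.1, h=1e-4, seed=0):
    """Gradient descent on the pose only; the denoiser itself is never updated."""
    rng = np.random.default_rng(seed)
    pose = np.zeros(len(cond_view) // 2)  # start from the identity pose
    for _ in range(steps):
        noises = rng.standard_normal((len(ALPHAS), len(cond_view)))
        grad = np.zeros_like(pose)
        for i in range(len(pose)):
            step = np.zeros_like(pose)
            step[i] = h
            # Central finite difference, reusing the same noise on both sides.
            grad[i] = (
                denoising_loss(pose + step, cond_view, target_view, noises)
                - denoising_loss(pose - step, cond_view, target_view, noises)
            ) / (2 * h)
        pose -= lr * grad
    return pose


# Usage: two unposed "views" of the toy object, related by an unknown offset.
pose1 = np.array([0.2, 1.0])
true_rel = np.array([0.3, -1.2])
view1, view2 = render(pose1), render(pose1 + true_rel)
estimated = estimate_relative_pose(view1, view2)
print("estimated relative pose:", estimated)
```

In this toy setting the loss has a single basin around the true offset, so plain gradient descent from the identity pose is enough. A real diffusion model's loss landscape is unlikely to be this well behaved, so initialization and the choice of pose parameterization would matter far more there.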


Novel View Synthesis

With the estimated camera pose, we perform multi-view fine-tuning and conditioning to generate more accurate and faithful novel views. Note that Zero123 takes only view1 as input, whereas iFusion conditions on both views.

[Figure: novel view synthesis comparison. Columns: Input, GT, iFusion, Zero123.]

Citation

@inproceedings{wu2025ifusion,
    title={iFusion: Inverting Diffusion for Pose-Free Reconstruction from Sparse Views},
    author={Wu, Chin-Hsuan and Chen, Yen-Chun and Solarte, Bolivar and Yuan, Lu and Sun, Min},
    booktitle={3DV},
    year={2025},
}