Given only a single RGB image, POP3D can reconstruct a plausible 360°-view appearance and shape of the object.
We introduce POP3D, a novel framework that creates a full 360°-view 3D model from a single image. POP3D resolves two prominent issues that limit single-view reconstruction. First, POP3D generalizes to arbitrary object categories, which previous methods struggle to achieve. Second, POP3D improves reconstruction fidelity and naturalness, where concurrent works fall short. POP3D combines the strengths of four primary components: (1) a monocular depth and normal predictor that provides crucial geometric cues, (2) a space carving method that demarcates the potentially unseen portions of the target object, (3) a generative model pre-trained on a large-scale image dataset that completes unseen regions of the target, and (4) a neural implicit surface reconstruction method tailored to reconstructing objects from RGB images along with monocular geometric cues. The combination of these components enables POP3D to readily generalize across various in-the-wild images and generate state-of-the-art reconstructions, outperforming similar works by a significant margin.
POP3D operates in five interconnected steps. Initially, we process the single RGB input to create a preliminary pseudo-ground-truth dataset and use this data to initialize a 3D model. We then progress through a loop of steps aiming to cover the complete 360° view of the target object. This loop includes: updating the camera position according to a predetermined schedule; acquiring an outpainting mask by extracting the visual hull from the pseudo-ground-truth dataset and subtracting the seen area; generating a pseudo-ground-truth novel view using the initial novel view from the trained 3D model, outpainting mask, and a suitable text prompt; and training the 3D model using the updated pseudo-ground-truth dataset. This process continues until we encompass the 360° view of the object.
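The loop described above can be sketched in a few lines of Python. This is only an illustrative outline under stated assumptions, not the authors' implementation: the evenly spaced `camera_schedule`, the function names, and the callables passed to `reconstruct_360` (`init_dataset`, `train`, `render`, `hull_silhouette`, `seen_area`, `outpaint`) are all hypothetical placeholders for the components named in the text. The only concrete logic is the mask computation, which takes the visual hull's silhouette in the new view and subtracts the already-seen area.

```python
import numpy as np


def outpainting_mask(hull_silhouette, seen_mask):
    """Pixels inside the visual hull's silhouette that no earlier view has observed."""
    return np.logical_and(hull_silhouette, np.logical_not(seen_mask))


def camera_schedule(num_views, start_deg=0.0):
    """Assumed schedule: evenly spaced azimuths, skipping the input view at start_deg."""
    step = 360.0 / num_views
    return [(start_deg + i * step) % 360.0 for i in range(1, num_views)]


def reconstruct_360(image, init_dataset, train, render,
                    hull_silhouette, seen_area, outpaint, azimuths, prompt):
    """Hypothetical driver; every callable is a stand-in for a POP3D component."""
    # Step 1: build the initial pseudo-ground-truth dataset and initialize the 3D model.
    dataset = init_dataset(image)
    model = train(dataset)
    for azimuth in azimuths:
        # Step 2: move the camera (here, simply take the next azimuth in the schedule).
        # Step 3: outpainting mask = visual hull silhouette minus the seen area.
        mask = outpainting_mask(hull_silhouette(dataset, azimuth),
                                seen_area(dataset, azimuth))
        # Step 4: outpaint the model's initial novel view to get a pseudo-ground-truth view.
        initial_view = render(model, azimuth)
        pseudo_gt = outpaint(initial_view, mask, prompt)
        # Step 5: retrain the 3D model on the updated dataset.
        dataset.append((azimuth, pseudo_gt))
        model = train(dataset)
    return model
```

In this sketch the dataset is a plain list of `(azimuth, image)` pairs and the model is retrained after every new view; how the actual system stores views or schedules training is not specified here.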
Here we show POP3D's 360° reconstruction results for the single RGB image given on the left. Both the appearance and shape reconstructions are shown on the right.
@inproceedings{Ryu2023POP3D,
title = {$360^\circ$ Reconstruction From a Single Image Using Space Carved Outpainting},
author = {Nuri Ryu and Minsu Gong and Geonung Kim and Joo-Haeng Lee and Sunghyun Cho},
booktitle = {Proc. of ACM SIGGRAPH Asia},
year = {2023}}