Researchers have developed a method that generates a 360-degree view of a person with a consistent, high-resolution appearance from a single input image. The method first synthesizes multiple views of the person in the input image by inpainting missing regions with shape-guided diffusion, conditioned on the silhouette and surface normals.
These views are then fused through inverse rendering to obtain a fully textured, high-resolution 3D mesh of the person. The result is a method capable of photorealistic 360-degree synthesis of a wide range of clothed humans with complex textures from a single image.
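The two-stage pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: every function name here is a hypothetical placeholder, and the diffusion inpainting step is stubbed out.

```python
import numpy as np

# Hypothetical sketch of the two-stage pipeline: (1) synthesize views
# around the person with shape-guided diffusion inpainting, then
# (2) fuse them via inverse rendering (not shown here).
# All names are illustrative placeholders, not the authors' API.

def shape_guided_inpaint(input_image, silhouette, normals, view_angle):
    """Placeholder: a diffusion model would inpaint the regions unseen
    from this view, conditioned on the silhouette and surface normals."""
    # A copy stands in for the actual diffusion sampling step.
    return input_image.copy()

def synthesize_views(input_image, silhouette, normals, n_views=8):
    """Step 1: generate candidate views at evenly spaced angles."""
    angles = np.linspace(0.0, 360.0, n_views, endpoint=False)
    return [shape_guided_inpaint(input_image, silhouette, normals, a)
            for a in angles]

image = np.zeros((64, 64, 3))
views = synthesize_views(image, silhouette=None, normals=None)
print(len(views))  # 8
```

The synthesized views would then be the input to the inverse-rendering fusion stage described below.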
“In each iteration, we differentiably render the UV texture map in every synthesized view from our set of views. We minimize the reconstruction loss between the rendered view and our synthesized view using both LPIPS loss and L1 loss. The fusion results in a textured mesh that can be rendered from any view,” said the researchers.
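The per-view objective quoted above, a weighted combination of a perceptual (LPIPS) term and an L1 term, can be sketched as below. This is an assumption-laden illustration: `perceptual_loss` is a simple MSE stand-in for an actual LPIPS network, and the weights `w_lpips` and `w_l1` are placeholders, not values from the paper.

```python
import numpy as np

# Sketch of the fusion objective: compare each synthesized view against
# a rendering of the current UV texture and minimize LPIPS + L1.

def l1_loss(rendered, target):
    """Mean absolute pixel difference (the L1 term)."""
    return np.mean(np.abs(rendered - target))

def perceptual_loss(rendered, target):
    """Stand-in for LPIPS; the real loss compares deep network features,
    here we use plain MSE purely for illustration."""
    return np.mean((rendered - target) ** 2)

def reconstruction_loss(rendered, target, w_lpips=1.0, w_l1=1.0):
    """Weighted sum of the perceptual and L1 terms (weights assumed)."""
    return (w_lpips * perceptual_loss(rendered, target)
            + w_l1 * l1_loss(rendered, target))

rendered = np.full((4, 4, 3), 0.5)
target = np.full((4, 4, 3), 0.75)
print(reconstruction_loss(rendered, target))  # 0.3125
```

In the actual method this loss would be backpropagated through a differentiable renderer to update the UV texture map, which plain NumPy cannot do; a framework with automatic differentiation would be used instead.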