Stunning View Synthesis Algorithm Could Have Huge Implications for VR Capture


As far as live-action VR video is concerned, volumetric video is the gold standard for immersion. And for static scene capture, the same holds true for photogrammetry. But both methods have limitations that detract from realism, especially when it comes to ‘view-dependent’ effects like specular highlights and lensing through translucent objects. Research from Thailand’s Vidyasirimedhi Institute of Science and Technology shows a stunning view synthesis algorithm that significantly boosts realism by handling such lighting effects accurately.

Researchers from the Vidyasirimedhi Institute of Science and Technology in Rayong, Thailand published work earlier this year on a real-time view synthesis algorithm called NeX. Its goal is to use just a handful of input images from a scene to synthesize new frames that realistically portray the scene from arbitrary points between the real images.

Researchers Suttisak Wizadwongsa, Pakkapon Phongthawee, Jiraphon Yenphraphai, and Supasorn Suwajanakorn write that the work builds on top of a technique called multiplane image (MPI). Compared to prior methods, they say their approach better models view-dependent effects (like specular highlights) and creates sharper synthesized imagery.
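For the technically curious, the core idea can be sketched roughly: each MPI plane stores a base color plus coefficients for a small set of basis functions of the viewing direction, and the planes are alpha-composited as usual, so the color of a point shifts as the camera moves. The snippet below is a minimal, hypothetical NumPy illustration of that structure, not the authors' code; the function name and array shapes are assumptions made for clarity.

```python
# Hypothetical sketch of view-dependent MPI rendering (not the NeX implementation).
import numpy as np

def render_mpi_view_dependent(alpha, k0, k_coeffs, basis_vals):
    """Composite MPI planes back-to-front with view-dependent color.

    alpha      : (D, H, W)       per-plane opacity
    k0         : (D, H, W, 3)    base (view-independent) color per plane
    k_coeffs   : (D, H, W, N, 3) coefficients for N view-dependent basis functions
    basis_vals : (N,)            basis functions evaluated at the current viewing
                                 direction (assumed precomputed elsewhere)
    """
    # View-dependent color per plane: C(v) = k0 + sum_n k_n * H_n(v)
    color = k0 + np.einsum('dhwnc,n->dhwc', k_coeffs, basis_vals)

    # Standard back-to-front alpha compositing over the D planes (far -> near)
    out = np.zeros(color.shape[1:])  # (H, W, 3)
    for d in range(alpha.shape[0]):
        a = alpha[d][..., None]
        out = color[d] * a + out * (1.0 - a)
    return out
```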

On top of those improvements, the team has highly optimized the system, allowing it to run easily at 60Hz—a claimed 1000x improvement over the previous state of the art. And I have to say, the results are stunning.

Though the system isn't yet highly optimized for the use-case, the researchers have already tested it using a VR headset, rendering with stereoscopic depth and full 6DOF movement.

The researchers conclude:

Our representation is effective in capturing and reproducing complex view-dependent effects and efficient to compute on standard graphics hardware, thus allowing real-time rendering. Extensive studies on public datasets and our more challenging dataset demonstrate state-of-art quality of our approach. We believe neural basis expansion can be applied to the general problem of light-field factorization and enable efficient rendering for other scene representations not limited to MPI. Our insight that some reflectance parameters and high-frequency texture can be optimized explicitly can also help recovering fine detail, a challenge faced by existing implicit neural representations.

You can find the full paper at the NeX project website, which includes demos you can try for yourself right in the browser. There are also WebVR-based demos that work with PC VR headsets if you're using Firefox, but they unfortunately don't work with Quest's browser.

Notice the reflections in the wood and the complex highlights in the pitcher’s handle! View-dependent details like these are very difficult for existing volumetric and photogrammetric capture methods.

Volumetric video capture that I've seen in VR usually gets very confused about this sort of view-dependent effect, often having trouble determining the appropriate stereo depth for specular highlights.


Photogrammetry, or ‘scene scanning’ approaches, typically ‘bake’ the scene’s lighting into textures, which often makes translucent objects look like cardboard (since the lighting highlights don’t move correctly as you view the object from different angles).
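To illustrate the contrast in the most generic terms (this is not how NeX works, just a textbook comparison): a baked texture returns the same color no matter where the camera is, while even a simple view-dependent term like a Blinn-Phong specular highlight shifts as the viewing direction changes. The sketch below is purely illustrative; the function names and inputs are assumptions.

```python
# Generic illustration only: baked texture lookup vs. a simple view-dependent
# specular term (Blinn-Phong, chosen as a familiar stand-in, not NeX's model).
import numpy as np

def baked_color(texture, uv):
    """Photogrammetry-style lookup: whatever lighting existed at capture time
    is frozen into the texture, so the result ignores the viewer entirely.
    (uv assumed to be integer pixel indices.)"""
    return texture[uv]

def view_dependent_color(albedo, normal, light_dir, view_dir, shininess=64.0):
    """The specular term depends on view_dir, so the highlight moves and
    brightens as the camera moves, which a baked texture cannot reproduce."""
    half = light_dir + view_dir
    half = half / np.linalg.norm(half)
    spec = max(float(np.dot(normal, half)), 0.0) ** shininess
    return albedo + spec
```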

The NeX view synthesis research could significantly improve the realism of volumetric capture and playback in VR going forward.

