Virtual reality is coming, and it's going to need high frame rates and low-latency visuals. In this guest post from NVIDIA graphics programmer Nathan Reed, we take a deep dive into the company's forthcoming GameWorks VR initiative. What is it, and how exactly will it help drive our forthcoming virtual reality experiences?

Nathan Reed is a graphics programmer, amateur physicist, and sci-fi nerd. He teaches computers how to make pretty pictures and is excited by beautiful, immersive, story-driven games and interactive fiction. Nathan works on VR at NVIDIA, and previously worked at Sucker Punch Productions on the Infamous series of games for PS3 and PS4.

We know you're excited about VR, and we are too! It's no exaggeration to say it's the most interesting thing to happen in gaming and entertainment in years. VR is the next frontier, the possibilities are endless, and we're still just at the beginning.

NVIDIA is getting into VR in a big way. VR games are going to require lots of GPU horsepower: low latency, stereo rendering, high framerates, high resolutions, and high-fidelity visuals are all needed to bring truly immersive experiences to life. As a GPU company, of course we're going to rise to meet the new challenges that VR presents, and we're going to do all we can to help VR game and headset developers use our GPUs to create the best VR experiences possible.

To that end, we've built, and are continuing to build, GameWorks VR. GameWorks VR is the name for a suite of technologies we're developing to tackle the challenges of high-performance VR rendering. It has several different components, some aimed at game engine developers and some aimed at headset developers. GameWorks VR technologies are available under a limited alpha program now, and we're working closely with Oculus, Valve, Epic, and others to get these technologies road-tested and, hopefully soon, deployed to our customers around the world.

For Engine Developers

VR SLI

People have two eyes, so a VR game has to perform stereo rendering. That increases both the CPU and GPU cost of rendering a frame quite a bit; in the worst case, it almost doubles. Some operations, such as physics simulations and shadow map rendering, can be shared across both stereo views. However, the actual rendering of the views themselves has to be done separately for each eye, to ensure correct parallax and depth cues that your brain can fuse into a perception of a 3D virtual world.

It's intuitively obvious that with two independent views, you can parallelize the rendering work across two GPUs for a massive improvement in performance. In other words, you render one eye on each GPU and combine both images together into a single frame to send out to the headset. This reduces the amount of work each GPU is doing, and thus improves your framerate. Alternatively, it allows you to use higher graphics settings while still hitting the headset's 90 FPS target, without hurting latency at all.

[Image: The Oculus Rift VR headset]

That's the main way we expect people to use VR SLI. But VR SLI is even more than that: it's really a DirectX extension API that gives engine developers explicit control over how work is distributed across any number of GPUs. So if you're a developer and you want to support 4 GPUs or even 8 GPUs in a machine, you can do it. The power is in your hands to split up the work however you want, over however many GPUs you choose to support.

As mentioned, VR SLI operates as a DirectX extension, and there are two main ways it can be used. First, it enables GPU affinity masking: the ability to mask off which GPUs a set of draw calls will go to. With this feature, if an engine already supports stereo rendering, it's very easy to enable dual-GPU support. All you have to do is add a few lines of code to send all the left eye's draw calls to the first GPU and all the right eye's draw calls to the second GPU. For things like shadow maps that are needed by both eyes, you can send those draw calls to both GPUs. It really is that simple, and it's incredibly easy to integrate into an engine.
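To give a feel for what that integration looks like, here's a minimal sketch in C++. The nvext::SetGPUMask() call is a hypothetical stand-in for the real extension entry point, which ships in the GameWorks VR alpha SDK and isn't public; the engine helpers (RenderShadowMaps, RenderScene, and friends) are placeholders too, so treat the names and signatures as illustrative only.

```cpp
// A minimal sketch of GPU affinity masking, assuming a hypothetical
// nvext::SetGPUMask() extension; the real API in NVIDIA's alpha SDK
// may look different.
#include <d3d11.h>

namespace nvext {
    const unsigned GPU_0   = 0x1;            // bit mask: first GPU
    const unsigned GPU_1   = 0x2;            // bit mask: second GPU
    const unsigned GPU_ALL = GPU_0 | GPU_1;  // both GPUs

    // Hypothetical: subsequent draw calls go only to the masked GPUs.
    void SetGPUMask(ID3D11DeviceContext* ctx, unsigned mask);
}

struct EyeView;  // placeholder for the engine's per-eye camera data

void RenderShadowMaps(ID3D11DeviceContext* ctx);
void SetViewConstants(ID3D11DeviceContext* ctx, const EyeView& view);
void RenderScene(ID3D11DeviceContext* ctx);
const EyeView& LeftEyeView();
const EyeView& RightEyeView();

void RenderStereoFrame(ID3D11DeviceContext* ctx)
{
    // Shared work (e.g. shadow maps) is submitted once, to both GPUs.
    nvext::SetGPUMask(ctx, nvext::GPU_ALL);
    RenderShadowMaps(ctx);

    // The left eye's draw calls go to GPU 0 only...
    nvext::SetGPUMask(ctx, nvext::GPU_0);
    SetViewConstants(ctx, LeftEyeView());
    RenderScene(ctx);

    // ...and the right eye's go to GPU 1 only.
    nvext::SetGPUMask(ctx, nvext::GPU_1);
    SetViewConstants(ctx, RightEyeView());
    RenderScene(ctx);

    // The SDK then transfers GPU 1's image to GPU 0 so the two eyes
    // can be combined into one frame and sent to the headset.
}
```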
The other way to use VR SLI is GPU broadcasting. This requires a deeper integration and more work on the part of developers, but it benefits not only GPU performance but CPU performance as well. CPU performance matters because once you've split your game's rendering work across two GPUs, the CPU becomes the next most likely bottleneck for a game's performance.

GPU broadcasting allows you to render both eye views using a single set of draw calls, rather than submitting entirely separate draw calls for each eye. It thus cuts the number of draw calls per frame, and their associated CPU overhead, roughly in half. This works because the draw calls for each eye are almost completely identical to begin with: both eyes see the same objects and render the same geometry with the same shaders, textures, and so on. From the driver's point of view, it's just doing the same work twice over. The only difference between the eyes is their view position, just a few numbers in a constant buffer. VR SLI lets you submit your draw calls once, broadcasting them to both GPUs, while also sending different constant buffers to the two GPUs so that each one gets its correct eye position. This lets you render both eyes with hardly any more CPU overhead than it would cost to render a single eye.
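Here's a sketch of what broadcast mode might look like in practice, again using hypothetical stand-ins (nvext::SetGPUMask and nvext::SetConstantBufferPerGPU) for the alpha SDK's real extension entry points; the names and signatures are assumptions for illustration, not the shipping API.

```cpp
// A minimal sketch of VR SLI broadcast rendering under the same
// hypothetical nvext extension as above.
#include <d3d11.h>
#include <vector>

namespace nvext {
    const unsigned GPU_ALL = 0x3;  // bits 0 and 1: both GPUs

    void SetGPUMask(ID3D11DeviceContext* ctx, unsigned mask);
    // Hypothetical: bind a different buffer on each GPU at one slot.
    void SetConstantBufferPerGPU(ID3D11DeviceContext* ctx, unsigned slot,
                                 ID3D11Buffer* bufferGPU0,
                                 ID3D11Buffer* bufferGPU1);
}

struct Mesh;  // placeholder for the engine's renderable geometry
void DrawMesh(ID3D11DeviceContext* ctx, const Mesh& mesh);
const std::vector<Mesh*>& VisibleMeshes();

void RenderStereoFrameBroadcast(ID3D11DeviceContext* ctx,
                                ID3D11Buffer* leftEyeConstants,
                                ID3D11Buffer* rightEyeConstants)
{
    // Every draw call below executes on both GPUs.
    nvext::SetGPUMask(ctx, nvext::GPU_ALL);

    // Same slot, different contents per GPU: GPU 0 sees the left
    // eye's view constants, GPU 1 the right eye's. This is the only
    // per-eye state in the whole pass.
    nvext::SetConstantBufferPerGPU(ctx, 0, leftEyeConstants,
                                   rightEyeConstants);

    // One stream of draw calls renders both eyes, roughly halving
    // the CPU cost of submission versus two separate per-eye passes.
    for (const Mesh* mesh : VisibleMeshes())
        DrawMesh(ctx, *mesh);
}
```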
I've discussed two different ways to use VR SLI, affinity masking and broadcasting, but note that you don't have to choose between them. Both styles can easily be mixed within the same application, so engine developers can start out using affinity masking, then gradually convert portions of their engine to broadcasting as and when they need to. However, any level of VR SLI support requires active integration into an engine; it's not possible to automatically enable VR SLI in games that haven't integrated it.

Since VR SLI is a software (driver) feature, it works across a wide variety of NVIDIA GPUs, going all the way back to the GeForce GTX 600 series. Currently it exists as an extension to DX11; DX12 already includes very similar multi-GPU support built in. We hope to eventually expose VR SLI extensions for OpenGL and Vulkan as well, and to bring VR SLI to Linux further down the road.

VR SLI is a great feature for getting maximum performance out of a multi-GPU system. But how can we make VR rendering more efficient even on a single GPU? That's where my next topic comes in.

Multi-Resolution Shading

One feature common to all the emerging VR headsets is that they have lenses between you and the screen. The lenses give you a wider field of view and let you focus comfortably, but they also cause significant distortions in the image. To compensate, the VR runtime software performs a counter-distortion: it takes the frames rendered by a game and warps them into a nonlinear, oddly curved format before presenting them on the display. When you view this distorted image through the lenses, it appears normal.

The problem is that this warping pass significantly compresses the edges of the image, while leaving the center alone or even slightly expanding it. In order to get full detail at the center, you have to render at a higher resolution than the display, only to throw away most of those pixels during the warping pass.

Ideally, we would render directly to the final warped image. But GPUs can't handle this kind of nonlinear distortion natively; their rasterization hardware is designed around the assumption of linear perspective projections. However, NVIDIA Maxwell GPUs, including the GeForce GTX 900 series and Titan X, have a hardware feature called "multi-projection" that enables us to very efficiently rasterize geometry into multiple viewports within a single render target at once. These viewports can have different sizes. So we can split our rendered image up into a few viewports, keeping the center one its usual size but shrinking the outer ones. This effectively forms a multi-resolution render target that we can draw into with a single pass, as efficiently as an ordinary render target.

That, in a nutshell, is what we call multi-resolution shading. It allows us to render at full resolution in the center of the image and at reduced resolution in the periphery, all in one pass. This better approximates the shading rate of the warped image that will eventually be displayed. In other words, it avoids rendering a ton of extra pixels that were never going to make it to the display anyway, and gives you a substantial performance boost for no perceptible reduction in image quality.

Like VR SLI, multi-res shading requires developers to integrate the technique into their games and engines; it's not something that can be turned on automatically by a driver. Beyond simply enabling multi-res rendering in the main pass, engine developers will also have to modify any postprocessing passes that operate on the multi-res render target, such as SSAO, deferred shading, or depth-of-field effects. We are working with a few key partners to develop multi-res shading, and are seeing some very promising initial results.
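To make the layout concrete, here's a minimal sketch of how such a multi-resolution render target could be carved into a 3x3 viewport grid. The grid shape, the center fraction, and the border scale factor are illustrative assumptions, not NVIDIA-recommended settings, and the single-pass broadcast of geometry into all nine viewports relies on Maxwell's multi-projection hardware via the (not yet public) SDK.

```cpp
// A minimal sketch of a multi-res viewport layout: a 3x3 grid with a
// full-resolution center and reduced-resolution borders. Values are
// illustrative; the real SDK's configuration may differ.
#include <d3d11.h>

// Fill out[9] with viewports for an eye buffer of width x height.
// centerFrac: fraction of the image covered by the full-res center.
// borderScale: resolution multiplier applied to the periphery.
void BuildMultiResViewports(float width, float height,
                            float centerFrac, float borderScale,
                            D3D11_VIEWPORT out[9])
{
    // Widths of the three columns (and rows) in the ideal,
    // undistorted image.
    float border = (1.0f - centerFrac) * 0.5f;
    float colSrc[3] = { border * width,  centerFrac * width,  border * width };
    float rowSrc[3] = { border * height, centerFrac * height, border * height };

    float y = 0.0f;
    for (int r = 0; r < 3; ++r) {
        // Border rows and columns get rendered at reduced resolution.
        float rowScale = (r == 1) ? 1.0f : borderScale;
        float x = 0.0f;
        for (int c = 0; c < 3; ++c) {
            float colScale = (c == 1) ? 1.0f : borderScale;
            D3D11_VIEWPORT& vp = out[r * 3 + c];
            vp.TopLeftX = x;
            vp.TopLeftY = y;
            vp.Width    = colSrc[c] * colScale;
            vp.Height   = rowSrc[r] * rowScale;
            vp.MinDepth = 0.0f;
            vp.MaxDepth = 1.0f;
            x += vp.Width;
        }
        y += rowSrc[r] * rowScale;
    }
}
```

As a rough sanity check on the savings: with centerFrac = 0.6 and borderScale = 0.5, each axis shrinks to 0.6 + 2 x 0.2 x 0.5 = 0.8 of its original size, so the grid covers about 64% of the pixels of the full-size target.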
For Headset Developers

VR SLI and multi-res shading are features targeted mainly at developers building engines for VR games. But GameWorks VR also includes a few more low-level features, designed for VR headset developers to take advantage of in their own platform software.

Direct Mode and Front Buffer Rendering

Anyone who's worked with a VR headset has probably had the experience of their system naively treating it like just another monitor: extending the desktop onto it, randomly moving application windows to it, causing screen flashes when you plug it in, and so on. But a VR headset isn't just another monitor, and treating it as one just creates unnecessary friction for users.

With Direct Mode, the display driver recognizes a headset when it's plugged in and hides it from the operating system, preventing all those user-experience problems while still allowing VR apps to render to the headset. This also has some handy side effects: we can ensure we're always in fullscreen-exclusive mode on the headset without any interference from the OS, and we can offer VR headset developers precise control over frame queuing and vsync behavior. That not only improves the user experience, it helps keep the latency between rendering and display scan-out as low as possible.

Another benefit of Direct Mode is that we can expose the ability to render directly to the front buffer, i.e. the buffer currently being scanned out to the display. Although it takes advanced low-level know-how to make use of this feature, it can reduce latency still further, using tricks like rendering during vblank or racing the beam.

Context Priorities

To provide the most comfortable VR experience, games would ideally run at a perfectly consistent 90 FPS, never missing a single frame. Unfortunately, with multitasking operating systems, memory paging, background streaming, and so forth, there's a lot going on that can cause even a perfectly optimized game to occasionally stutter or hitch. Missed frames are annoying even when playing on a regular screen, but in VR they're incredibly distracting and can even be sickening.

To help fight this problem, our driver enables the option to create a special high-priority D3D11 device. When work is submitted through this device, the GPU preempts whatever else it's running, switches to the high-priority context, then goes back to what it was originally doing afterward. This supports features like async timewarp, which can re-warp the previous frame if a game ever stutters or hitches and doesn't deliver a new frame on schedule.
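To illustrate how a headset runtime might use such a device, here's a hedged sketch of an async timewarp loop. CreateHighPriorityDevice() is a hypothetical stand-in for the driver extension described above (it is not a public D3D11 API), and the compositor helpers are placeholders for whatever the headset's runtime actually provides.

```cpp
// A minimal sketch of async timewarp on a high-priority context,
// assuming a hypothetical CreateHighPriorityDevice() extension.
#include <d3d11.h>

struct HeadsetState;                         // placeholder: swap chain, frame status
struct HeadPose { float orientation[4]; };   // placeholder: tracked head pose

// Hypothetical: like D3D11CreateDevice, but work submitted through the
// returned context preempts normal-priority work already on the GPU.
HRESULT CreateHighPriorityDevice(ID3D11Device** device,
                                 ID3D11DeviceContext** context);

void WaitUntilJustBeforeVSync(HeadsetState& hmd);
bool GameDeliveredNewFrame(HeadsetState& hmd);
HeadPose LatestHeadPose(HeadsetState& hmd);
void TimewarpPreviousFrame(ID3D11DeviceContext* ctx, const HeadPose& pose);
void PresentFrame(HeadsetState& hmd);

void CompositorLoop(HeadsetState& hmd)
{
    ID3D11Device* device = nullptr;
    ID3D11DeviceContext* context = nullptr;
    if (FAILED(CreateHighPriorityDevice(&device, &context)))
        return;

    for (;;) {
        WaitUntilJustBeforeVSync(hmd);
        if (GameDeliveredNewFrame(hmd)) {
            PresentFrame(hmd);  // normal case: show the game's new frame
        } else {
            // The game missed its deadline: re-warp the previous frame
            // with the latest head pose. Because this runs on the
            // high-priority context, it preempts the game's in-flight
            // GPU work and can still finish before scan-out begins.
            TimewarpPreviousFrame(context, LatestHeadPose(hmd));
            PresentFrame(hmd);
        }
    }
}
```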
That's GameWorks VR! NVIDIA is really excited about the future of virtual reality, and we're working hard to bring developers the fastest GPUs and the best tools to build amazing VR experiences. Developers interested in receiving our SDK can learn more and register at developer.nvidia.com/vr.

Community Q & A

Q: When can we expect a public release of VR SLI-capable drivers?
A: Within the next few months. (However, drivers by themselves won't automatically enable VR SLI; it requires integration into each game engine, so support will be up to individual developers.)

Q: Can you confirm which GPUs will receive which GameWorks VR features, i.e. is it restricted to the 9xx series?
A: Almost all of the features work all the way back to the 6xx series! The only one that requires the 9xx series is multi-resolution shading, because it depends on the multi-projection hardware that only exists in Maxwell, our latest GPU architecture. For the professional users out there, GameWorks VR is also fully supported on our corresponding Quadro cards. Of course, VR rendering is pretty demanding, so we recommend a GeForce GTX 970 or better for folks looking to get their PCs ready for VR.

Q: Can you comment on the recently uncovered patent applications regarding an NVIDIA VR headset? (link)
A: NVIDIA regularly files patents across a wide range of graphics- and display-related fields.

Q: Can you explain the differences between GameWorks VR and VR Direct? Will one replace the other?
A: GameWorks VR is the new name for what we previously called VR Direct. As we've grown the feature set of our VR SDK, it made sense to roll the capabilities into our overall GameWorks initiative.

Q: A lot of people are banking on SLI in VR as a solution to the extremely high performance requirements. What are the chances of getting one-GPU-per-eye latency down to the same levels as single-card setups?
A: We already have! In our VR SLI demo app, one-GPU-per-eye rendering works without increasing latency at all relative to a single-GPU system.

Q: Any plans for GameWorks VR features on Linux?
A: Eventually yes, but that's further down the road.

Q: Regarding foveated rendering and VR SLI: do you have plans to enable SLI with two or more graphics cards or GPUs from different performance sectors? Like an enthusiast GM200/GP100 for the center area and a lower-specced GM206/GPxxx for the peripheral vision? Any plans for a special VR dual-GPU card that works like this?
A: SLI requires all the GPUs to be the exact same model, so no, this isn't something we can do currently. Heterogeneous multi-GPU (i.e. different classes of GPU working together) is potentially interesting as a research project, but not something we're actively looking at right now.

Q: What are your long-term plans to increase GPU utilization without increasing latency?
A: I'm not sure I understand this question exactly, but we're constantly making driver improvements so that we can keep the GPU fed with work without waiting on the CPU unnecessarily, and we're improving multitasking and preemption support with each new GPU architecture as well.

Q: Any plans for LOD-based rendering that's stereoscopic up close and monoscopic beyond a certain distance?
A: This is up to individual game developers to implement if it makes sense for them.

Q: I see you've added monitor detection for headsets. Do you intend to extend this to plug-and-play VR setup, so that when you plug a headset in, its resolution and capabilities can be passed down automatically?
A: As mentioned, we're implementing Direct Mode so that headsets are recognized and treated as such instead of as desktop monitors, but beyond that it's up to the runtime and drivers provided by the headset maker to communicate the headset's capabilities to applications.

Q: How big is the VR R&D team at NVIDIA?
A: It's a bit difficult to say, because there isn't really one "VR team"; there are people focusing on VR across nearly every organization in the company. It's a big initiative for us.

Q: Can NVIDIA make a driver better than Morgan Freeman in Driving Miss Daisy?
A: It's a huge challenge to make a driver as smooth as Morgan Freeman, but we're sure as hell going to try!

Our thanks to Nathan Reed for taking the time to work on this guest piece for us, and to the users of the /r/oculus subreddit for providing the questions.