Cooperative Vectors in DirectX to use Blackwell Neural Shaders

Ability to integrate neural networks into shaders coming to DirectX

Nvidia recently talked about new features for GeForce graphics cards, primarily the RTX Remix modding platform leaving beta and the first games using Nvidia ACE. The company has another announcement: Neural Shaders, one of the architectural innovations in Blackwell GPUs, will be coming to DirectX. Microsoft is adding a feature called Cooperative Vectors to this API, and the GeForce RTX 5000 series will support it through its Neural Shaders.

Neural Shaders are one of the new features introduced by the Blackwell GPU architecture. They consist of closer integration between tensor cores (Nvidia’s AI accelerators in GPUs) and general-purpose compute units (“shaders”). On older Nvidia GPUs, code that used the general shader units and AI acceleration on tensor cores at the same time was difficult to write or performed worse. Blackwell makes this combination practical.

Nvidia refers to this capability as “Neural Shaders.” Their application allows integration of a simple neural network into a shader running on a GPU, which can replace certain conventional algorithms through AI inference. Nvidia proposes using such small integrated AI models for tasks like simulating complex materials (such as skin with subsurface scattering effects), calling this Neural Materials, or for illumination simulation (Neural Radiance Cache).

Cooperative Vectors

Similar techniques can be implemented in DirectX through a technology Microsoft has named Cooperative Vectors. However, it is not the same thing as Nvidia’s Neural Shaders. Cooperative Vectors is a software feature exposed by DirectX (and presumably there will be some equivalent in the Vulkan API as well), while Neural Shaders is the hardware feature that enables Blackwell GPUs to support it.

Neural Materials in the Nvidia Blackwell GPU architecture presentation

Cooperative Vectors are designed to allow the integration of AI components into traditional graphics pipelines (enabling “neural rendering techniques”), but with cross-platform support. This means these techniques should work across GPUs from different manufacturers, and Microsoft likely intends this to include gaming consoles (or more precisely Xbox, for which DirectX is relevant).

Cooperative Vectors enable the use of matrix compute operations with vectors of arbitrary length within shaders. Matrix operations are precisely what AI acceleration relies on. Through Cooperative Vectors, AI inference can be integrated directly into, for example, a pixel shader. There’s no need to switch the GPU into any special AI acceleration mode. The shader runs on the GPU as usual and can execute alongside other standard graphics operations. Access to AI functions in this form should be significantly simpler than the options available before it (though this approach will typically be used for smaller, simpler AI models rather than LLMs).
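To make the idea of matrix operations on vectors inside a shader more concrete, here is a minimal sketch in plain Python. It is not HLSL and does not use the DirectX API; the function names (`mat_vec_mul_add`, `shade_pixel`, `shade_image`), the layer sizes and the weights are all made up for illustration. It only shows the kind of per-pixel work a small neural network performs: a chain of matrix-vector multiplies with a bias and a simple activation.

```python
# Hypothetical illustration only: plain Python, not HLSL and not the DirectX API.
# All names, layer sizes and weights below are invented for this sketch.

def mat_vec_mul_add(weights, bias, vec):
    """Return weights * vec + bias, with the matrix given as a list of rows."""
    return [
        sum(w * x for w, x in zip(row, vec)) + b
        for row, b in zip(weights, bias)
    ]


def relu(vec):
    """Clamp negative values to zero (a common, simple activation)."""
    return [max(0.0, x) for x in vec]


def shade_pixel(inputs, layers):
    """Evaluate a tiny fully connected network for one pixel.

    `layers` is a list of (weights, bias) pairs. ReLU is applied after
    every layer except the last one.
    """
    activations = inputs
    for index, (weights, bias) in enumerate(layers):
        activations = mat_vec_mul_add(weights, bias, activations)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations


def shade_image(pixels, layers):
    """Run the same small network independently for every pixel."""
    return [shade_pixel(p, layers) for p in pixels]


# Made-up network: 3 inputs -> 2 hidden values -> 1 output.
LAYERS = [
    ([[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]], [0.0, -0.25]),
    ([[2.0, 1.0]], [0.1]),
]

if __name__ == "__main__":
    pixels = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    for value in shade_image(pixels, LAYERS):
        print(value)
```

In a real shader, each pixel's invocation would do this kind of work for its own input vector, and the driver and hardware would be free to map the matrix-vector steps onto the GPU's matrix units. How that mapping happens is not described in the sources and is not modeled here.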

Support expected from all GPU manufacturers

Cooperative Vectors are reportedly getting support not just on Nvidia GPUs, but also on AMD, Intel, and Qualcomm GPUs (the last of which covers Windows on ARM). Current reports suggest Nvidia’s support begins with the GeForce RTX 5000 series featuring the Blackwell architecture, implying older GeForce GPUs won’t support this feature, or at least not fully. According to Nvidia’s explanation, the company’s older GPUs require using CUDA or Compute Shader modes to utilize tensor cores, while Blackwell allows tensor cores to be accessed directly from pixel shaders.

It’s possible this isn’t strictly impossible on older GPUs. Rather, the integration between shader programs and AI inference on tensor cores might be inefficient or come with performance penalties on them (think of how asynchronous shaders technically worked on older GPUs like Nvidia Maxwell but brought no performance benefit).

For competing GPUs, we don’t yet know which generations or architectures will offer Cooperative Vectors support. Intel’s GPUs feature specialized XMX units, though it hasn’t been clarified whether these allow integration of AI acceleration in shader code. However, as mentioned, some Arc GPUs will support it.

For AMD graphics (RDNA 3 and RDNA 4), AI acceleration is closely tied to shader units since both are handled by the same hardware (AI acceleration runs via WMMA instructions). These GPUs might therefore already have the prerequisites for supporting Cooperative Vectors, though official confirmation is still pending.

Support for Cooperative Vectors is scheduled to appear in the DirectX SDK this April as a “preview” release. It will be part of HLSL, the language for writing shaders in DirectX. For now, this support is mainly relevant for game developers rather than end users: it will take some time before neural rendering techniques utilizing these GPU capabilities ship in generally available games.

Sources: Microsoft, Nvidia

English translation and edit by Jozef Dudáš


