Ashes dev dishes on DX12, AMD vs. Nvidia, and asynchronous compute
When Ashes of the Singularity launched two weeks ago, it gave us our first view of DirectX 12’s performance in a real game. What was meant to be a straightforward performance preview was disrupted by a PR salvo from Nvidia attempting to discredit the game and its performance results. Oxide Games refuted Nvidia’s statements about the state of Ashes, but the events raised questions about the state of Nvidia’s DX12 drivers and whether its GPUs are as strong in DirectX 12 as they have been in DirectX 11. (Oxide itself attributed these differences to driver maturity, not any fundamental quality of either GPU family.) Now, an Oxide employee posting under the handle “Kollock” has released some additional information on both the state of Ashes and the reason why AMD’s performance is so strong.
According to Kollock, the idea that there’s some break between Oxide Games and Nvidia is fundamentally incorrect. Kollock describes the situation as follows: “I believe the initial confusion was because Nvidia PR was putting pressure on us to disable certain settings in the benchmark, when we refused, I think they took it a little too personally.” Kollock goes on to state that Oxide has been working quite closely with Nvidia, particularly over this past summer. By that account, Nvidia was “actually a far more active collaborator over the summer then AMD was, if you judged from email traffic and code-checkins, you’d draw the conclusion we were working closer with Nvidia rather than AMD ;)”
According to Kollock, the only vendor-specific code in Ashes was implemented for Nvidia, because attempting to use asynchronous compute under DX12 with an Nvidia card currently causes tremendous performance problems:
“Personally, I think one could just as easily make the claim that we were biased toward Nvidia as the only ‘vendor’ specific code is for Nvidia where we had to shutdown async compute. By vendor specific, I mean a case where we look at the Vendor ID and make changes to our rendering path. Curiously, their driver reported this feature was functional but attempting to use it was an unmitigated disaster in terms of performance and conformance so we shut it down on their hardware. As far as I know, Maxwell doesn’t really have Async Compute* so I don’t know why their driver was trying to expose that.”
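To make the vendor-specific point concrete, here is a minimal, hypothetical sketch of the kind of check Kollock describes. The PCI vendor IDs are real; the function name and fallback behavior are our illustration, not Oxide’s actual code.

```cpp
// Hypothetical sketch of a vendor-ID check like the one Kollock describes.
// The PCI vendor IDs are real; everything else is illustrative.
#include <dxgi.h>

constexpr UINT kVendorIdNvidia = 0x10DE; // Nvidia's PCI vendor ID
constexpr UINT kVendorIdAMD    = 0x1002; // AMD's PCI vendor ID

// Returns true if the renderer should fall back to a single queue and skip
// asynchronous compute on this adapter.
bool ShouldDisableAsyncCompute(IDXGIAdapter1* adapter)
{
    DXGI_ADAPTER_DESC1 desc = {};
    adapter->GetDesc1(&desc);
    return desc.VendorId == kVendorIdNvidia;
}
```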
This type of problem is exactly why DX12, the AMD and Nvidia drivers, and Ashes itself are all heavily qualified as being in early days: all of the companies involved are still working things out. It’s odd, however, that Nvidia chose to emphasize a non-existent MSAA bug in Ashes when it could have raised questions about asynchronous compute instead. It’s also worth noting, as Kollock does, that since asynchronous compute isn’t part of the DX12 specification, its presence or absence on any GPU has no bearing on DX12 compatibility.
Note: Nvidia has represented to ExtremeTech and other hardware sites that Maxwell 2 (the GTX 900 family) is capable of asynchronous compute, with one graphics queue and 31 compute queues. We are investigating this situation. It is not clear how these compute queues are accessed or what the performance penalty is for using them; GCN, according to AMD, has eight ACEs (Asynchronous Compute Engines) with eight queues each, for a total of 64 compute queues plus a graphics queue.
Asynchronous compute, DX12, and GCN
Kollock writes that Ashes does take some advantage of asynchronous compute and sees a corresponding performance increase from using it, but that the work the team has done to date is a fraction of what console developers may be building. Asynchronous compute is useful for two types of work: it allows jobs to be completed on the GPU when the graphics card would otherwise be idle (while waiting on the CPU, for example), and it allows tasks to be handled completely separately from the regular render workload. In theory, gameplay calculations can be sent to the ACEs while the GPU is busy with other tasks.
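For readers unfamiliar with how DX12 exposes this, here’s a minimal D3D12 sketch (our example, not Oxide’s code) of creating a dedicated compute queue alongside the usual graphics queue. Work submitted to the compute queue can overlap graphics work on hardware that supports asynchronous compute; elsewhere, the driver may simply serialize the two.

```cpp
// Minimal D3D12 sketch: a graphics (direct) queue plus a separate compute
// queue. This is our illustration of the API, not Oxide's engine code.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void CreateQueues(ID3D12Device* device,
                  ComPtr<ID3D12CommandQueue>& graphicsQueue,
                  ComPtr<ID3D12CommandQueue>& computeQueue)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;  // graphics + compute + copy
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&graphicsQueue));

    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE; // compute + copy only
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&computeQueue));
}
```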
The author speculates that ACEs used in this manner may have some similarities to Sony’s Cell processor, which was capable of enormous number-crunching performance if code was optimized correctly for it, and expects asynchronous compute to be increasingly important to future games:
“I think you’re also being a bit short-sighted on the possible use of compute for general graphics. It is not limited to post process. Right now, I estimate about 20% of our graphics pipeline occurs in compute shaders, and we are projecting this to be more then 50% on the next iteration of our engine. In fact, it is even conceivable to build a rendering pipeline entirely in compute shaders. For example, there are alternative rendering primitives to triangles which are actually quite feasible in compute… It’s quite possible that in 5 years time Nitrous’s rendering pipeline is 100% implemented via compute shaders.”
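As a rough illustration of what “graphics in compute shaders” means in practice, here is a hedged D3D12 sketch of recording one compute pass, say a post-process step. The root-signature layout and the 8x8 thread-group size are assumptions for the example, not details from Nitrous.

```cpp
// Illustrative D3D12 compute pass; the pipeline state, root signature, and
// thread-group size are placeholders, not Oxide's actual pipeline.
#include <d3d12.h>

void RecordComputePass(ID3D12GraphicsCommandList* cmdList,
                       ID3D12RootSignature* rootSig,
                       ID3D12PipelineState* computePSO,
                       D3D12_GPU_DESCRIPTOR_HANDLE uavTable,
                       UINT width, UINT height)
{
    // Assumes the caller has already bound the relevant descriptor heaps.
    cmdList->SetComputeRootSignature(rootSig);
    cmdList->SetPipelineState(computePSO);
    cmdList->SetComputeRootDescriptorTable(0, uavTable);
    // Shader is assumed to use 8x8 thread groups; round up to cover the image.
    cmdList->Dispatch((width + 7) / 8, (height + 7) / 8, 1);
}
```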
AMD has previously argued that its GCN architecture was well-suited to DX12 thanks to features like asynchronous compute, and Kollock’s comments appear to confirm it. Pinning down exactly how much performance the feature delivers will require a great many more titles and finalized code, but it’s possible that the performance split between AMD and Nvidia will be quite different under DirectX 12 than it was under DirectX 11.
Ever since the Xbox One and PS4 launched, we’ve looked for signs that the game optimizations developers must be doing for GCN on consoles were making their way to the PC space. So far, there’s been little proof that owning the console market has helped PC gamers with AMD hardware, but that could be because PC games depended on DX11, an entirely different API with very different characteristics from DX12. DX11 likewise offered no good way to exploit AMD’s asynchronous compute engines, so they saw little use.
If console developers are doing advanced offloading to bolster overall performance (since the Xbox One and PS4 aren’t exactly loaded for bear in the CPU department), then it’s possible that some of those advantages will finally come to the PC space, particularly on games optimized for Xbox One. The PS4’s API is said to be similar to Mantle or DX12 in some particulars, but the Xbox One will use DX12 itself.
We’re not going to draw any early conclusions from such narrow data, but the next 12-18 months should provide evidence one way or the other. As DX12 rolls out to Xbox One, we’ll either see an uptick in the number of games with better GCN optimizations in DX12, or we won’t. Either way, DirectX 12 gives developers far more control over performance tuning and optimizations than DX11 did, and that should help level the playing field between AMD and Nvidia, at least temporarily.