News: Vulkan makes its debut with DOOM



We also anticipate some older GPUs will now be able to play the game at good framerates. We hope the range of GPU support widens with additional game and driver updates. That said, this is the first time a triple-A game is releasing on a brand-new API and brand-new drivers so there may be a few bumps, but our testing is showing really great performance and stability.

Read more.

People with AMD cards are quoting 20 to 40% improvements, with even HD 7950 cards performing well into 60 fps territory at 1080p. I haven't seen much from Nvidia users. This requires the latest game update and the latest drivers, along with switching the rendering mode from OpenGL to Vulkan.

Those performance increases are astounding; that's basically two tiers higher in graphics performance, for free!
 
  • Like
Reactions: Soul_Est

Kmpkt

Innovation through Miniaturization
KMPKT
Feb 1, 2016
3,382
5,936
How many games actually support Vulkan at present? That is pretty damned impressive though.
 

Kmpkt

Innovation through Miniaturization
KMPKT
Feb 1, 2016
3,382
5,936
Nice to see AMD gaining a market advantage with Vulkan. I can only dream that this will force Microsoft and Nvidia to stop being douchebags.
 
  • Like
Reactions: Phuncz and Soul_Est

Phuncz

Lord of the Boards
Original poster
SFFn Staff
May 9, 2015
5,943
4,952
Less douchebaggery is indeed something we need in the game production world. Too many crappy console ports, UWP vendor lock-in, GameWorks' unfair competition and one-sided optimizations. To think Nvidia was the one promoting DX12 years ago as the revolution, yet they still haven't figured out Async Compute with Pascal, while AMD cards from three generations earlier are reaping the benefits of Vulkan right now, which I've read uses Async Compute quite heavily.
 

EdZ

Virtual Realist
May 11, 2015
1,578
2,107
Async Compute has been latched onto by the internet pundits, but all it really is is a method to parallelise workloads (basically moving scheduling from software to hardware). If you already have a method to assign work that works without Async Compute, then adding it isn't going to speed anything up. Maxwell already had per-job variable partitioning, so unless you have a workload where the distribution of work varies within a job, Pascal's dynamic partitioning (or Async Compute) gains nothing. In terms of dynamic partitioning 'vs.' Async Compute, the only difference is the order of completion of operations: at the end of the job you have the same operations completed in the same time, just in a different order.
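For context, here's roughly what this looks like from the API side: "async compute" just means the application submits work on a separate compute-capable queue and leaves it to the GPU/driver to decide whether anything actually overlaps. A minimal Vulkan sketch of picking such a queue family; the struct and function names are illustrative, not from any particular engine:

Code:
#include <vulkan/vulkan.h>
#include <vector>
#include <cstdint>

// Sketch: find a graphics queue family and, if the hardware exposes one, a
// separate compute-only family for "async" work. Whether submissions on that
// queue actually overlap with graphics is up to the GPU's scheduler.
struct QueueFamilies {
    uint32_t graphics = UINT32_MAX;
    uint32_t compute  = UINT32_MAX; // may end up equal to `graphics`
};

QueueFamilies pickQueueFamilies(VkPhysicalDevice gpu) {
    uint32_t count = 0;
    vkGetPhysicalDeviceQueueFamilyProperties(gpu, &count, nullptr);
    std::vector<VkQueueFamilyProperties> props(count);
    vkGetPhysicalDeviceQueueFamilyProperties(gpu, &count, props.data());

    QueueFamilies out;
    for (uint32_t i = 0; i < count; ++i) {
        if ((props[i].queueFlags & VK_QUEUE_GRAPHICS_BIT) && out.graphics == UINT32_MAX)
            out.graphics = i;
        // Prefer a family with compute but without graphics for the async queue.
        if ((props[i].queueFlags & VK_QUEUE_COMPUTE_BIT) &&
            !(props[i].queueFlags & VK_QUEUE_GRAPHICS_BIT))
            out.compute = i;
    }
    if (out.compute == UINT32_MAX)
        out.compute = out.graphics; // no dedicated family: fall back to the universal queue
    return out;
}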

GameWorks is mostly just sour grapes that Nvidia are providing pre-optimised libraries, and AMD are not. If you're a developer and you have the option to
a) write everything yourself and optimise for multiple architectures from multiple vendors
b) write everything yourself and do no optimisation
c) use pre-written code optimised for half or more of your install base
Then option C is very attractive.
Developers aren't stupid, and Nvidia can't force anyone to implement GameWorks. That so many games implement it is a good indication that there is an advantage in doing so.
 

Phuncz

Lord of the Boards
Original poster
SFFn Staff
May 9, 2015
5,943
4,952
Wasn't that the whole idea of Async Compute versus Nvidia's Hyper-Q: getting rid of the vendor-specific aspect to avoid more dissonance between the market leader and the competition?

While your point about GameWorks being an easy choice for a game developer* is valid, in practice this is just bad for consumers. While they (game developers, Nvidia) promise better visuals and realism, in practice they are very obviously snuffing out competition and even their own old GPUs. There aren't many games that integrate more than a few GameWorks libraries and still perform well on anything but the latest GPUs. Doom is a good example of the alternative, being able to hit good framerates even on older hardware.

I don't have a problem with Nvidia GTX cards, but I do have a problem with Nvidia's corporate strategy of doing everything in their power to abuse their market position and push towards a monopoly. It's open, cross-platform technologies like Vulkan that should be embraced, not a focus on one particular hardware or software vendor.

* No, developers aren't stupid, but they are following the orders handed down from managers, chiefs and shareholders who only have costs, income, profit and loss in mind. They don't care about monopoly positions because that's outside their project's time window, and thus don't care about their customers in any way other than getting their money. Case in point: Assassin's Creed Unity, Batman Arkham Knight, Watch Dogs. I remember a time when a company couldn't get away with obvious money grabs like these.
 
Last edited:
  • Like
Reactions: alamilla

EdZ

Virtual Realist
May 11, 2015
1,578
2,107
Wasn't that the whole idea of Async Compute versus Nvidia's Hyper-Q: getting rid of the vendor-specific aspect to avoid more dissonance between the market leader and the competition?
Hyper-Q is confined to CUDA code, and deals with parallelism within CUDA programs written for certain legacy standards. It's not got much to do with executing compute and graphics shaders simultaneously (e.g. if you are using compute shaders written in OpenCL, Hyper-Q is irrelevant).

While they (game developers, Nvidia) promise better visuals and realism, in practice they are very obviously snuffing out competition and even their own old GPUs.
When various exposés on older Nvidia GPUs 'being crippled' by updates turn up, they generally boil down to newer GPUs getting faster with updates and older GPUs keeping the same performance. That is down to lifetime optimisation having topped out (i.e. she's givin' it all she's got, cap'n!) rather than the performance of older cards being reduced, and to newer cards using features that were absent on older cards.

Case in point: Assassin's Creed Unity, Batman Arkham Knight, Watch Dogs. I remember a time when a company couldn't get away with obvious money grabs like these.
That's more down to garden-variety corporate incompetence than deliberate performance sabotage, e.g. Gears of War or Arkham Knight being down to poor coding on the developer's end (due to lack of sufficient time and funding).
 

Phuncz

Lord of the Boards
Original poster
SFFn Staff
May 9, 2015
5,943
4,952
Hyper-Q is confined to CUDA code, and deals with parallelism within CUDA programs written for certain legacy standards. It's not got much to do with executing compute and graphics shaders simultaneously (e.g. if you are using compute shaders written in OpenCL, Hyper-Q is irrelevant).
I was under the assumption Hyper-Q was the name of the scheduler used by Nvidia. My bad.

When various exposés on older Nvidia GPUs 'being crippled' by updates turn up, they generally boil down to newer GPUs getting faster with updates and older GPUs keeping the same performance. That is down to lifetime optimisation having topped out (i.e. she's givin' it all she's got, cap'n!) rather than the performance of older cards being reduced, and to newer cards using features that were absent on older cards.
Techniques like HairWorks also crippled Kepler GPUs because of the insane increase in tessellation usage and the lack of tessellation performance in anything but Maxwell and now Pascal. You're missing my point: GameWorks adds new effects and techniques that don't increase performance but cost it, in very different amounts across architectures, almost crippling performance for relatively small increases in graphical fidelity.

That's more down to garden-variety corporate incompetence than deliberate performance sabotage, e.g. Gears of War or Arkham Knight being down to poor coding on the developer's end (due to lack of sufficient time and funding).
While most likely true, with Batman Arkham Knight they did have time (or priority) to implement the GameWorks features for the PC versions that weren't possible on consoles, while not even optimizing for the most basic of fluent gameplay on most people's PCs. So they didn't want a game that would run properly, but they did need to spend time implementing those specific features? My point is still valid that GameWorks doesn't benefit the gamers in general because it's not just more eye-candy, it comes at a serious cost in many ways.
 

EdZ

Virtual Realist
May 11, 2015
1,578
2,107
While most likely true, with Batman Arkham Knight they did have time (or priority) to implement the GameWorks features for the PC versions that weren't possible on consoles, while not even optimizing for the most basic of fluent gameplay on most people's PCs.
That's kind of the point of GameWorks: drop-in features that take essentially no effort for the developers to implement. If the rest of the game has had no effort put into porting and optimising, there's not much GameWorks can do to help that.

My point is still valid that GameWorks doesn't benefit the gamers in general because it's not just more eye-candy, it comes at a serious cost in many ways.
That cost is often down to the specific implementation. For example, HairWorks as implemented in The Witcher 3 was stuck at x64 tessellation, an entirely unnecessarily extreme amount. Much of the hit could have been avoided if the tessellation factor had been exposed as a user setting, or set more sanely to start with by CDPR. And indeed, in patch 1.07 CDPR added a user setting for the tessellation factor.
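As a sketch of how small that fix really is (hypothetical names, not CDPR's or Nvidia's actual code): the value read from the user's config only needs to be clamped before it is written into the constants the tessellation stage consumes.

Code:
#include <algorithm>

// Hypothetical sketch: clamp a user-configured tessellation cap before it is
// written into the constant buffer read by the hull/tessellation shader.
// Community testing reportedly found 8x-16x visually very close to 64x.
struct HairConstants {
    float tessellationFactor;
    // ... other per-draw constants ...
};

void applyUserTessellationCap(HairConstants& cb, float requestedFactor, float userCap) {
    cb.tessellationFactor = std::clamp(requestedFactor, 1.0f, userCap);
}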
 

EdZ

Virtual Realist
May 11, 2015
1,578
2,107
Came across an excellent description of async shaders vs. async compute on Maxwell/Pascal and GCN; I'll repost it here:
They don't have dedicated async shaders, period. Async compute (the ability to decouple compute from rendering tasks) is supported, even in Maxwell. Maxwell: each GPC can work on different tasks, independent of other GPCs. Pascal: each SM (SMs are contained in a GPC) can work on different tasks, independent of other SMs. These tasks can be a mix of render and compute, and the scheduler dynamically assigns tasks.

AMD has dedicated async shaders where they can offload compute, completely independent of the render pipeline.

It's two different approaches to accomplishing the same thing (decouple compute from render, to better utilize the GPU). Async compute itself is NOT a magic bullet, and must be done properly to avoid stalls; it can have disastrous effects if done poorly. It's a tool to offload compute tasks to reduce frame times, and like any other tool it can be a detriment if done wrong.

AMD couldn't use the ACEs in DX11, and their compute pipeline suffered because of it (a good chunk of hardware sitting unused). In Vulkan / DX12 they can, and they get good performance gains from better GPU utilization. It's basically as simple as saying "throw it into the compute queue" and done.
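In Vulkan terms, "throw it into the compute queue" is roughly the sketch below (hedged: the queues, command buffers, semaphore and fence are assumed to have been created elsewhere). The compute work goes onto its own queue, and only a semaphore ties it back to the graphics submission, so the driver/hardware is free to overlap the two.

Code:
#include <vulkan/vulkan.h>

// Sketch: `computeQueue`, `graphicsQueue`, the command buffers, `computeDone`
// (a VkSemaphore) and `frameFence` are assumed to have been created elsewhere.
void submitFrame(VkQueue computeQueue, VkQueue graphicsQueue,
                 VkCommandBuffer computeCmd, VkCommandBuffer graphicsCmd,
                 VkSemaphore computeDone, VkFence frameFence) {
    // 1) Kick the compute work off on its own queue; no waiting on the CPU side.
    VkSubmitInfo compute{VK_STRUCTURE_TYPE_SUBMIT_INFO};
    compute.commandBufferCount   = 1;
    compute.pCommandBuffers      = &computeCmd;
    compute.signalSemaphoreCount = 1;
    compute.pSignalSemaphores    = &computeDone;
    vkQueueSubmit(computeQueue, 1, &compute, VK_NULL_HANDLE);

    // 2) Submit graphics immediately; it only stalls at the pipeline stage
    //    that actually consumes the compute results.
    VkPipelineStageFlags waitStage = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT;
    VkSubmitInfo gfx{VK_STRUCTURE_TYPE_SUBMIT_INFO};
    gfx.waitSemaphoreCount = 1;
    gfx.pWaitSemaphores    = &computeDone;
    gfx.pWaitDstStageMask  = &waitStage;
    gfx.commandBufferCount = 1;
    gfx.pCommandBuffers    = &graphicsCmd;
    vkQueueSubmit(graphicsQueue, 1, &gfx, frameFence);
}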

Maxwell and Pascal are already near fully utilized. Forcing it on in Maxwell causes inefficiencies (performance loss) since the software scheduler is already dynamically assigning tasks and now you're overriding what it does. Pascal is better, but mostly just because it can be forced to reassign tasks in a more granular manner (by SM, not by entire GPC).

So yes, Nvidia can "do" async compute, no they don't have async shaders, no there is no performance gain on Maxwell (sometimes a detriment), yes Pascal can see minor gains.

Async shaders are a FEATURE of GCN, but they are not necessary to do async compute.
>The best thing i see coming from vulkan and dx12 are the increased drawcalls and low cpu overhead

Async compute is nearly entirely irrelevant for this. The important part of DX12/Vulkan for increasing draw calls and lowering CPU overhead is the multi-threaded command lists, which Nvidia has already supported since DX11 (hence their lower CPU overhead in DX11 compared to AMD). Leveraging DCLs, you can offload tasks to async compute too, but the BIG benefit is from the multi-threaded rendering itself. Basically, DX12/Vulkan brings AMD to Nvidia's level in the overhead department.
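The "multi-threaded command lists" part maps onto command buffers in Vulkan. A hedged sketch of the pattern (illustrative names; the secondary command buffers are assumed to come from one command pool per thread, since pools are externally synchronised):

Code:
#include <vulkan/vulkan.h>
#include <thread>
#include <vector>

// Sketch: record secondary command buffers in parallel (one per worker thread,
// each allocated from that thread's own VkCommandPool), then stitch them into
// the primary. This is where the draw-call / CPU-overhead win comes from.
void recordFrame(VkCommandBuffer primary,
                 const std::vector<VkCommandBuffer>& secondaries,
                 const VkCommandBufferInheritanceInfo& inherit) {
    std::vector<std::thread> workers;
    for (size_t t = 0; t < secondaries.size(); ++t) {
        workers.emplace_back([&, t] {
            VkCommandBufferBeginInfo begin{VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO};
            begin.flags = VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT;
            begin.pInheritanceInfo = &inherit;
            vkBeginCommandBuffer(secondaries[t], &begin);
            // ... record this thread's share of the frame's draw calls ...
            vkEndCommandBuffer(secondaries[t]);
        });
    }
    for (auto& w : workers) w.join();

    // Called inside a render pass begun with VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS.
    vkCmdExecuteCommands(primary, static_cast<uint32_t>(secondaries.size()),
                         secondaries.data());
}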

>But nvidia's method seems like it would be inefficient on it

Like everything, it depends. Nvidia designed Maxwell's (and Pascal's) software scheduler around lowering frametimes as much as possible by dynamically distributing workload across the GPCs/SMs. Nvidia's software contingent is leagues larger than AMD's, and they have spent a LOT of money to make their underlying algorithms and drivers speedy. Their choice was to dynamically distribute load using the software scheduler, and it's very fast at it. AMD's choice was to dynamically distribute load using a hardware scheduler, and when all the parts of the GPU can be used (which AMD couldn't do in DX11) it is very fast at it.

People that don't understand will say things like "hardware is ALWAYS faster" or "software is EMULATING" (implying it's bad), but great software can overcome bad hardware (I'm NOT saying AMD's hardware scheduler is bad!), and vice-versa.

As an example, look at the raw specs of the R9 390 vs the GTX 970.

Code:
Card     GFLOPS  Render (Gp/s)
R9 390   5120    64
GTX 970  3494    54.6
So you have the GTX 970 with roughly a third less shading capability (the 390 has almost 50% more) and ~10 Gp/s less render capability, yet in DX11 the 970 is trading blows with the 390. Why? Because the 390's massive compute capability cannot be fully utilized in DX11.
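(Aside: those GFLOPS figures are just shader count x 2 FLOPs per clock (FMA) x clock speed; a quick sanity check, using the commonly quoted reference/boost clocks as an assumption:)

Code:
#include <cstdio>

// Theoretical FP32 throughput: shaders * 2 FLOPs per FMA * clock in GHz = GFLOPS.
// Clocks below are the commonly quoted reference/boost clocks; actual boards vary.
double gflops(int shaders, double clockGHz) { return shaders * 2.0 * clockGHz; }

int main() {
    std::printf("R9 390 : %.0f GFLOPS\n", gflops(2560, 1.000)); // ~5120
    std::printf("GTX 970: %.0f GFLOPS\n", gflops(1664, 1.050)); // ~3494
    return 0;
}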

But in DX12 the tables turn, and the 390 is now beating the 980. Why? Because DX12 can leverage more of the compute resources on the 390, and the 390's raw compute capability is higher than the 980's. It's as simple as that, really.

Async compute is not a magic bullet fix-all "go faster!" performance boost unless your card was underutilized already. And that's exactly what is happening for AMD - the performance that has been locked away behind the shackles of a mediocre DX11 driver, and the inability to use ACEs, is now flying free! Imagine only being able to use 70% of your GPU's raw capability, and now using closer to 95%. That's what Vulkan / DX12 give AMD.
 
Last edited:

Phuncz

Lord of the Boards
Original poster
SFFn Staff
May 9, 2015
5,943
4,952
Very informative piece. So basically Nvidia has had proper context switching and task scheduling in order since Maxwell (I presume), and now AMD has caught up and is finally able, when Async Compute is implemented correctly, to also make use of that performance.
 

GuilleAcoustic

Chief Procrastination Officer
SFFn Staff
LOSIAS
Jun 29, 2015
2,984
4,421
guilleacoustic.wordpress.com
As a developer with a game project, I'm really tempted to go with AMD. The summer sales are running and I'm watching a FirePro W8100 and a FirePro W9100. If nobody buys them before they reach my threshold price then I'll get one ... else it'll be an RX 480.

EDIT: Also worth noting: the GTX 1080 is stuck with OpenCL 1.2 while the RX 480 has OpenCL 2.2 support (and OpenCL 2.0 for the W8100/W9100).
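If you want to check what a given driver actually exposes (which can lag the marketing number), a small OpenCL host-side sketch like this prints the device version string for every device in the system:

Code:
#include <CL/cl.h>
#include <cstdio>
#include <vector>

// Print CL_DEVICE_NAME and CL_DEVICE_VERSION for every device on every platform.
int main() {
    cl_uint numPlatforms = 0;
    clGetPlatformIDs(0, nullptr, &numPlatforms);
    std::vector<cl_platform_id> platforms(numPlatforms);
    clGetPlatformIDs(numPlatforms, platforms.data(), nullptr);

    for (cl_platform_id p : platforms) {
        cl_uint numDevices = 0;
        clGetDeviceIDs(p, CL_DEVICE_TYPE_ALL, 0, nullptr, &numDevices);
        std::vector<cl_device_id> devices(numDevices);
        clGetDeviceIDs(p, CL_DEVICE_TYPE_ALL, numDevices, devices.data(), nullptr);

        for (cl_device_id d : devices) {
            char name[256] = {}, version[256] = {};
            clGetDeviceInfo(d, CL_DEVICE_NAME, sizeof(name), name, nullptr);
            clGetDeviceInfo(d, CL_DEVICE_VERSION, sizeof(version), version, nullptr);
            std::printf("%s : %s\n", name, version);
        }
    }
    return 0;
}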
 
Last edited: