Page 64 - Computer Power User - February 2017
buffers, which is another major drawback
of CrossFire and SLI. As anyone who has
used SLI or CrossFire knows all too well, it
doesn’t always work, and sometimes it even
hinders performance compared to running
the game with a single graphics card.
Explicit Multi-Adapter in DX12
changes all that and brings along a
handful of new techniques for dividing
up the workload between all GPUs. Split-
frame rendering essentially divides each
frame into multiple tiles and distributes
the tile-rendering tasks to the different
adapters. Asymmetric multi-GPU is
another technique that lets a game divide
rendering tasks unequally between a
discrete graphics card and an on-CPU
graphics adapter. For instance, the
heavy lifting of the game rendering
would likely fall to the discrete GPU,
while something like lighting, physics,
or post-processing would be handled
by the integrated adapter. DX12 also
lets the game workload see the two or
more distinct pools of graphics memory
as a single combined pool, so two cards
with 4GB each let the DX12 API access
all 8GB.

[Image caption: Our SAPPHIRE RX 480 NITRO+ has four ACEs, with each one supporting up to eight queues.]
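
To make the tile-splitting idea concrete, here is a minimal sketch in Python. It is not DirectX 12 code and does not reflect how any particular game or driver schedules work. The function name `split_frame`, the tile size, and the adapter weights are all hypothetical and chosen purely for illustration. Equal weights model plain split-frame rendering; unequal weights model an asymmetric pairing such as a discrete card plus an integrated adapter.

```python
from collections import Counter


def split_frame(width, height, tile, weights):
    """Assign every tile of a frame to an adapter index.

    weights holds a relative share per adapter (hypothetical values):
    [1, 1] is an even split, [3, 1] gives the first adapter three
    tiles for every one handed to the second.
    """
    # Top-left corner of every tile, row by row.
    tiles = [(x, y)
             for y in range(0, height, tile)
             for x in range(0, width, tile)]

    total = sum(weights)
    assigned = [0] * len(weights)
    plan = {}
    for count, corner in enumerate(tiles, start=1):
        # Give the tile to whichever adapter is furthest behind its
        # target share of the tiles handed out so far.
        deficits = [w / total * count - done
                    for w, done in zip(weights, assigned)]
        gpu = deficits.index(max(deficits))
        plan[corner] = gpu
        assigned[gpu] += 1
    return plan


# A 1920x1080 frame in 120-pixel tiles, split 3:1 between two adapters.
plan = split_frame(1920, 1080, 120, [3, 1])
share = Counter(plan.values())
print(len(plan), share[0], share[1])
```

In this toy setup the frame breaks into 144 tiles, with 108 going to the first adapter and 36 to the second. A real engine would weight the split by measured GPU throughput rather than a fixed ratio.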
Asynchronous Compute
There’s a lot of information on the
internet about Asynchronous Compute,
but depending on your sources, you’re
likely to come away with three different
impressions of how it works based on the
terminology used by AMD, NVIDIA,
and Microsoft. The important takeaway is
that Asynchronous Compute is designed
to eliminate GPU workload inefficiencies
that crop up primarily when gaming.
When AMD and NVIDIA come
up with new GPU architectures, most
of the changes that occur from one
generation to another are designed to
ensure that when the GPU is working, it
is working hard, with 100% utilization,
to give you the fastest frame rates, at
the highest resolution, with as much eye
candy as your visual cortex can handle.
This is an incredibly tricky thing to do
using a GPU that is largely composed
of hundreds or thousands of
general-purpose units, with limited
amounts of cache, and a handful of
specialized function units that handle
post-processing duties. Giving developers
a way to get more performance from a
given bit of hardware, either in a console
or PC, is always a good thing.
According to Microsoft, DX12’s take
on Asynchronous Compute is most often
referred to as Synchronization and
Multi-Engine. In essence, this aspect of
the API lets game developers employ
queues and command lists to execute
dozens, hundreds, or even thousands of
threads concurrently without forcing
those items to pause and wait their turn.
When synchronizing the thread output
becomes necessary, developers also have
more control over when and how that
happens. These functions are good at
filling in the gaps compared to the more
traditional render path of a modern GPU,
but they’re not a complete overhaul of the
process by any means. The performance
bump DX12 delivers for a game that uses
this technique vs. the same game running
under DX11 tops out at about 20%, but
we suspect this performance improvement
is restricted to the times when the GPU is
at its most inefficient. In short, Microsoft’s
Synchronization and Multi-Engine
techniques are all about filling in the gaps
to keep GPU utilization maxed out.

[Image caption: The GIGABYTE GTX 1060 G1 Gaming 6GD features Pascal’s revamped scheduler, which is designed to better adapt to DX12 workloads.]
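
The "filling in the gaps" idea can be illustrated with a toy model in Python. This is a back-of-the-envelope sketch, not DirectX 12 code and not a description of how any real GPU scheduler behaves. The capacity of 10 units per time slice, the per-slice graphics load, and the amount of compute work are all made-up numbers, and the function names are hypothetical.

```python
import math

CAPACITY = 10  # hypothetical execution units available per time slice


def serial_slices(graphics, compute_units):
    """Graphics slices run first; compute work then needs slices of its own."""
    return len(graphics) + math.ceil(compute_units / CAPACITY)


def async_slices(graphics, compute_units):
    """Compute work is slotted into units the graphics work leaves idle."""
    remaining = compute_units
    for used in graphics:
        remaining -= min(CAPACITY - used, remaining)
    # Anything that did not fit into a gap still needs extra slices.
    return len(graphics) + math.ceil(remaining / CAPACITY)


def utilization(graphics, compute_units, slices):
    """Fraction of all available units that did useful work."""
    return (sum(graphics) + compute_units) / (slices * CAPACITY)


# Units kept busy by graphics work in each of six slices (made-up data).
graphics = [10, 6, 9, 5, 10, 8]
compute = 12  # independent compute work, in the same units

for name, fn in (("serial", serial_slices), ("async", async_slices)):
    n = fn(graphics, compute)
    print(name, n, utilization(graphics, compute, n))
```

With these invented numbers, the serial path needs eight slices at 75% utilization, while the gap-filling path finishes in six slices at 100%. The benefit shrinks toward zero as the graphics work leaves fewer idle units, which is consistent with the suspicion that the gains show up mainly when the GPU is at its least efficient.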