Page 65 - Computer Power User - February 2017
When talking about thousands of threads
running in parallel, AMD’s architecture is
the one that, at least on paper, appears to
be better able to take advantage of DX12’s
Multi-Engine approach. The AMD Radeon
RX 480 has 2,304 stream processors,
almost twice as many general-purpose
processors as the NVIDIA GeForce
GTX 1060, which features 1,280 CUDA
cores. Traditionally, multithreaded graphics
workloads are synchronous, which means
that there’s just one queue, and everything
that passes through is scheduled and
synchronized in advance.
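To put rough numbers on the difference, here is a minimal Python sketch of our own that contrasts a single serialized queue with DX12's separate graphics, compute, and copy queues. The task names and millisecond costs are made up for illustration, and the multi-queue case assumes the queues overlap perfectly with no dependencies between them, which real hardware and real games rarely manage.

```python
from collections import defaultdict

# Hypothetical per-frame tasks: (name, engine, cost in milliseconds).
# Names and costs are illustrative assumptions, not measurements.
FRAME_TASKS = [
    ("draw geometry", "graphics", 6.0),
    ("lighting pass", "compute", 3.0),
    ("texture upload", "copy", 2.0),
    ("shadow pass", "graphics", 2.0),
    ("AI update", "compute", 1.0),
]


def single_queue_time(tasks):
    """One queue: every task waits for the one before it."""
    return sum(cost for _, _, cost in tasks)


def multi_queue_time(tasks):
    """One queue per engine, idealized: the queues overlap perfectly and
    have no cross-queue dependencies, so the frame takes as long as the
    busiest queue."""
    per_queue = defaultdict(float)
    for _, engine, cost in tasks:
        per_queue[engine] += cost
    return max(per_queue.values(), default=0.0)


if __name__ == "__main__":
    print("single queue:", single_queue_time(FRAME_TASKS), "ms")
    print("three queues:", multi_queue_time(FRAME_TASKS), "ms")
```

With these made-up costs, the single queue needs 14 ms per frame, while the three-queue version is bounded by the 8 ms of work sitting in the graphics queue. Treat that as a best case: synchronization between queues and contention for the same shader hardware will eat into the gain.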
The Polaris architecture of AMD’s
RX 400 Series GPUs (and most previous
GCN-based GPUs) features what AMD
calls ACEs, or Asynchronous Compute
Engines, which, when running in a DX12
environment, give developers access to
command lists that they can submit to
three queues: graphics, compute, and
copy. This opens up AMD’s ACEs to
significantly more of the heavy lifting of
games, including rendering 3D objects,
calculating AI, and performing lighting,
shadowing, and camera effects workloads.
The SAPPHIRE Radeon RX 480
NITRO+ we used for our testing features
four ACEs, and each ACE supports up to
eight queues.

To further illustrate AMD’s favorable
position with regard to DX12, it’s also
important to point out that many consider
AMD’s Mantle graphics API to have been
the kick in the pants Microsoft needed
to roll out DX12. Back when AMD was
working on Mantle, the firm’s goal was to
create a 3D rendering and gaming-centric
API that lets developers get “closer to the
metal,” which in developer-speak means
unfettered access to the GPU, all of the
cores of the CPU, and the system memory.
Sound familiar? In 2015, AMD donated
Mantle to the Khronos Group, and the
organization renamed it Vulkan. 2016’s
DOOM is an example of a game that runs
on either OpenGL or the Vulkan API.

The role consoles play in the success of
a gaming API should not be downplayed.
AMD’s hardware is under the hood of the
Xbox One, PS4, as well as the souped-up
PS4 Pro and Project Scorpio, which is
giving game developers who work on
multiplatform titles plenty of access to
AMD’s async shader architecture.

You may have heard that NVIDIA
hardware can’t perform Asynchronous
Compute, but that’s just not true. The
firm’s Maxwell and Pascal architectures
(GeForce GTX 900/1000 Series) are
capable of letting command lists break
up tasks to run in multiple queues, but
NVIDIA has a different bag of tools for
getting the job done. In Pascal, each SM
(streaming multiprocessor) features a
geometry engine, with rasterizers shared
by all SMs in a GPC (graphics processing
cluster). The GIGABYTE GeForce GTX
1060 G1 Gaming 6GD we’re using in
this article features three GPCs and 10
SMs. Not a whole lot of the underlying
structure of the GPU’s architecture
changed between Maxwell and Pascal,
but the latter now features a revamped
scheduler designed to leverage Microsoft’s
Synchronization and Multi-Engine
strategy to improve SM utilization. Again,
filling in the gaps for those times when
GPU utilization drops slightly.

In head-to-head scenarios, using largely
synthetic tests, it appears as though
NVIDIA’s architecture yields the better
results under lighter workloads; however,
AMD’s architecture pulls ahead when
those command lists start churning out
more and more tasks to fill up those
parallel queues. Although AMD got a
bit of a head start on NVIDIA, both
vendors are tailoring their respective
GPU architectures to enable DX12’s most
promising features. Console and PC game
developers are diving in with both feet,
and the DX12 games we have access to
today are nothing short of stellar.

[Image caption: The API Overhead test, while interesting, doesn’t give us any idea how DirectX 12 games will perform in real-world scenarios.]

Putting DX12 To The Test
A year and a half ago, we had no real
option for testing DX12 with real-world
workloads. As a result, we were forced
to rely on one of the few tests available
at the time, Futuremark API Overhead
feature test, which is designed to hammer
the system with draw calls per frame until
the frame rate drops below a playable
30fps. On a test system that relied on a
decidedly midrange SAPPHIRE Radeon
R9 285 graphics card with 2GB of
GDDR5 memory, our multi-threaded
scores jumped from 655,000 draw calls
per second to more than 16 million draw