Page 65 - Computer Power User - February 2017

When talking about thousands of threads
         running in parallel, AMD’s architecture is
         the one that, at least on paper, appears to
         be better able to take advantage of DX12’s
         Multi-Engine approach. The AMD Radeon
         RX 480 has 2,304 stream processors,
         almost twice as many general-purpose
         processors as the NVIDIA GeForce
         GTX 1060, which features 1,280 CUDA
         cores. Traditionally, multithreaded graphics
         workloads are synchronous, which means
         that there’s just one queue, and everything
         that passes through is scheduled and
         synchronized in advance.
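
To put rough numbers on the single-queue idea, here is a toy sketch in Python. It is not a benchmark and says nothing about how any real GPU or driver schedules work. The task names, queue labels, and millisecond costs are all made up for illustration, and the multi-queue figure assumes the queues overlap perfectly, which real hardware rarely manages.

```python
from collections import defaultdict

# Hypothetical per-frame tasks: (name, queue label, cost in milliseconds).
# Every value here is an illustrative assumption, not a measurement.
TASKS = [
    ("render_geometry", "graphics", 6.0),
    ("lighting", "graphics", 3.0),
    ("ai_update", "compute", 4.0),
    ("texture_upload", "copy", 2.0),
]


def single_queue_time(tasks):
    """One queue: each task waits for the one before it, so costs add up."""
    return sum(cost for _, _, cost in tasks)


def multi_queue_time(tasks):
    """Idealized independent queues: each queue runs its own tasks in order
    and the queues overlap perfectly, so the busiest queue sets the time."""
    per_queue = defaultdict(float)
    for _, queue, cost in tasks:
        per_queue[queue] += cost
    return max(per_queue.values())


print(single_queue_time(TASKS))  # 15.0
print(multi_queue_time(TASKS))   # 9.0
```

In this made-up frame, the serialized queue takes 15ms while the overlapped version finishes in 9ms, the length of its busiest queue. The gap shrinks as soon as one queue dominates the work or the tasks depend on each other.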
           The Polaris architecture of AMD’s
         RX 400 Series GPUs (and most previous
         GCN-based GPUs) features what AMD
         calls ACEs, or Asynchronous Compute
         Engines, which, when running in a DX12
         environment, give developers access to
         command lists that they can submit to
         three queues: graphics, compute, and
         copy. This opens up AMD’s ACEs to
         significantly more of the heavy lifting of
         games, including rendering 3D objects,
         calculating AI, and performing lighting,
         shadowing, and camera effects workloads.
         The SAPPHIRE Radeon RX 480
         NITRO+ we used for our testing features
         four ACEs, and each ACE supports up to
         eight queues.
           To further illustrate AMD’s favorable
         position with regard to DX12, it’s also
         important to point out that many consider
         AMD’s Mantle graphics API to have been
         the kick in the pants Microsoft needed
         to roll out DX12. Back when AMD was
         working on Mantle, the firm’s goal was to
         create a 3D rendering and gaming-centric
         API that lets developers get “closer to the
         metal,” which in developer-speak means
         unfettered access to the GPU, all of the
         cores of the CPU, and the system memory.
         Sound familiar? In 2015, AMD donated
         Mantle to the Khronos Group, and the
         organization used it as the foundation
         for Vulkan. 2016’s DOOM is an example
         of a game that runs on either OpenGL or
         the Vulkan API.
           The role consoles play in the success of
         a gaming API should not be downplayed.
         AMD’s hardware is under the hood of the
         Xbox One and PS4, as well as the souped-up
         PS4 Pro and Project Scorpio, which is
         giving game developers who work on
         multiplatform titles plenty of access to
         AMD’s async shader architecture.
           You may have heard that NVIDIA
         hardware can’t perform Asynchronous
         Compute, but that’s just not true. The
         firm’s Maxwell and Pascal architectures
         (GeForce GTX 900/1000 Series) are
         capable of letting command lists break
         up tasks to run in multiple queues, but
         NVIDIA has a different bag of tools for
         getting the job done. In Pascal, each SM
         (streaming multiprocessor) features a
         geometry engine, with rasterizers shared
         by all SMs in a GPC (graphics processing
         cluster). The GIGABYTE GeForce GTX
         1060 G1 Gaming 6GD we’re using in
         this article features three GPCs and 10
         SMs. Not a whole lot of the underlying
         structure of the GPU’s architecture
         changed between Maxwell and Pascal,
         but the latter now features a revamped
         scheduler designed to leverage Microsoft’s
         Synchronization and Multi-Engine
         strategy to improve SM utilization, again
         by filling in the gaps for those times
         when GPU utilization drops slightly.
           In head-to-head scenarios, using largely
         synthetic tests, it appears as though
         NVIDIA’s architecture yields the better
         results under lighter workloads; however,
         AMD’s architecture pulls ahead when
         those command lists start churning out
         more and more tasks to fill up those
         parallel queues. Although AMD got a
         bit of a head start on NVIDIA, both
         vendors are tailoring their respective
         GPU architectures to enable DX12’s most
         promising features. Console and PC game
         developers are diving in with both feet,
         and the DX12 games we have access to
         today are nothing short of stellar.

         Image caption: The API Overhead test,
         while interesting, doesn’t give us any idea
         how DirectX 12 games will perform in
         real-world scenarios.

         Putting DX12 To The Test
           A year and a half ago, we had no real
         option for testing DX12 with real-world
         workloads. As a result, we were forced
         to rely on one of the few tests available
         at the time, the Futuremark API Overhead
         feature test, which is designed to hammer
         the system with draw calls per frame until
         the frame rate drops below a playable
         30fps. On a test system that relied on a
         decidedly midrange SAPPHIRE Radeon
         R9 285 graphics card with 2GB of
         GDDR5 memory, our multithreaded
         scores jumped from 655,000 draw calls
         per second to more than 16 million draw

