Page 473 - Maxwell House

APPROACH TO NUMERICAL SOLUTION OF EM PROBLEMS                           453



            $\sqrt{(\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2}$ as $\Delta t \le$ […].¹¹ This limit is necessary but not sufficient to guarantee
            stability. Roughly speaking, it means that the time step must be kept small enough that a
            wave cannot cross more than one cell of the space discretization in a single step. There are
            several variants of FDTD that are free from this restriction, but at the cost of more complex
            code that is more prone to bugs.
            • Can handle problems of up to a hundred million cells, giving a solution in minutes on
            high-performance computers that allow the simultaneous use of multiple computing resources
            (for example, distributed multi-processor computing with effective resource management).
            However, merely increasing the number of processors beyond some limit is not justified: more
            processors mean more time spent on data exchange between processors, which can slow down
            the simulation. Meanwhile, simulations can be completed over 35 times faster if the computer
            is equipped with a Graphics Processing Unit (GPU), which is specifically designed to handle
            large amounts of graphical data in parallel. A modern GPU has several hundred small
            processors that can work in parallel. Experiments demonstrate that a GPU combined with CPU
            cache memory usage may accelerate simulations to speeds of 1500 to 21000 Mcells/s, compared
            with non-accelerated standard multi-core workstation speeds ranging from 20 to 200 Mcells/s.
            One such ultra-fast and powerful commercial FDTD tool is EMPIRE XPU.¹²
            • More than a dozen specialized and general-purpose commercial simulators are available on
            the market.
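
            The stability limit mentioned above can be illustrated numerically. The sketch below
            assumes the commonly quoted CFL bound for the standard Yee scheme on a uniform
            rectangular grid, $c\,\Delta t \le 1/\sqrt{1/(\Delta x)^2 + 1/(\Delta y)^2 + 1/(\Delta z)^2}$;
            this form, the function name cfl_time_step, the safety parameter, and the 1 mm cell
            size are illustrative assumptions, not values taken from this text.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def cfl_time_step(dx, dy, dz, c=C0, safety=1.0):
    """Largest time step (s) allowed by the assumed CFL bound.

    dx, dy, dz : cell sizes in metres (uniform rectangular grid assumed)
    c          : wave speed in the medium, m/s
    safety     : factor <= 1 to stay below the limit (hypothetical knob)
    """
    if min(dx, dy, dz) <= 0:
        raise ValueError("cell sizes must be positive")
    return safety / (c * math.sqrt(1.0 / dx**2 + 1.0 / dy**2 + 1.0 / dz**2))


# Illustrative case: 1 mm cubic cells in vacuum.
dt_max = cfl_time_step(1e-3, 1e-3, 1e-3)
print(f"dt_max = {dt_max:.3e} s")
```

            For cubic cells the bound reduces to $\Delta t \le \Delta x / (c\sqrt{3})$, so halving the
            cell size also halves the admissible time step.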

            The primary drawbacks of the FDTD technique are:
            • FDTD requires that the entire computational domain be gridded, and the spatial
            discretization must be sufficiently fine to resolve both the shortest EM wavelength and the
            smallest geometrical feature in the model. The consequence is large computational domains
            and relatively long solution times. In general, FDTD requires about 30 bytes of memory per
            Yee cell [3]. To estimate the total memory required, in bytes, multiply the total number of
            FDTD cells by 30. There is some overhead in the calculation, but it is generally quite small.
            • Run times on the order of hours, days, or even longer are common when solving
            electromagnetic wave problems of practical size. The simplest way to estimate this time is to
            multiply the total number of cells by the expected number of time steps and by the factor 80,
            which describes the number of operations required per cell per time step. If the duration of
            each floating-point operation is known, the total execution time on a single processor can be
            projected. In general, though, a better estimating method is to measure the execution time of
            a simple problem on a given computer and then scale that time by the ratio of the number of
            operations between the desired calculation and the simple one.
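
            The two rules of thumb above (30 bytes per Yee cell, 80 operations per cell per time
            step) can be sketched as a small estimator. The function names, the 200 x 200 x 200
            grid, the 5000 time steps, and the 1 ns per operation are hypothetical values chosen
            only for illustration; overhead is ignored, as the text suggests it is small.

```python
BYTES_PER_CELL = 30      # memory per Yee cell, as quoted in the text
OPS_PER_CELL_STEP = 80   # operations per cell per time step, as quoted in the text


def estimate_memory_bytes(n_cells):
    """Total memory estimate in bytes (overhead ignored)."""
    return n_cells * BYTES_PER_CELL


def estimate_runtime_s(n_cells, n_steps, seconds_per_op):
    """Single-processor run time from the operation count."""
    return n_cells * n_steps * OPS_PER_CELL_STEP * seconds_per_op


def scale_runtime_s(t_ref_s, ref_cells, ref_steps, n_cells, n_steps):
    """Scale a measured reference run by the ratio of operation counts.

    The factor 80 cancels in the ratio, so only cells x steps matters.
    """
    return t_ref_s * (n_cells * n_steps) / (ref_cells * ref_steps)


# Hypothetical problem: 200 x 200 x 200 cells, 5000 steps, 1 ns per operation.
n_cells = 200**3
mem_bytes = estimate_memory_bytes(n_cells)
runtime_s = estimate_runtime_s(n_cells, 5000, 1e-9)
# Hypothetical reference run: 1e6 cells, 1000 steps, measured at 10 s.
scaled_s = scale_runtime_s(10.0, 1_000_000, 1000, n_cells, 5000)

print(f"memory  ~ {mem_bytes / 1e6:.0f} MB")
print(f"runtime ~ {runtime_s:.0f} s (from operation count)")
print(f"runtime ~ {scaled_s:.0f} s (scaled from reference run)")
```

            The scaled estimate is usually the more trustworthy of the two, since the measured
            reference run already includes memory-access and other machine-specific costs.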
            • Since FDTD simulations calculate the E- and H-fields at all points within the computational
            domain, the domain must be finite so that it fits in computer memory. Accordingly, a
            Perfectly Matched Layer (PML) box is needed to properly truncate the spatial domain in the
            case of exterior EM problems (see Section 9.1.3 below).
            • Since the computational grids are typically rectangular, they do not conform well to curved
            surfaces, which causes additional meshing inaccuracies. Irregular, non-orthogonal grids are




            11  Courant-Friedrichs-Lewy (CFL) criterion.
            12  Check http://www.empire.de/