Page 480 - Maxwell House
P. 480
460 Chapter 9
The more efficient approach is
Control Data Multiple Instruction stream and
unit 1 processor 1 Multiple Data streams (MIMD)
architectures shown
Control Data schematically in Figure
Main unit 2 processor 2 9.2.2b . Such execution
17
memory corresponds to program
parallelization by code
decomposition among parallel
Control Data processors that allow achieving
unit n processor n approximately equal execution
time of the decomposed
program code parts. A
modern HPC is almost always
a cluster of MIMD machines,
Figure 9.2.2b Block-diagram of MIMD system
each of which implements
SIMD instructions.
While the concept of parallel computation is simple, it requires in practice specialized and
highly sophisticated software and algorithms engineered for this purpose. Fortunately, many
electrodynamics algorithms including FDTD
can be carried out as a parallel computation.
Figure 9.2.3 gives some guidance illustrating
18
the FDTD performance vs. memory
requirement on a different platform [11, 13].
9.2.4 GPU and Cache Acceleration
What may be your approach if you do not have
access to HPC? Do not panic. Modern
workstations and even laptops include
Figure 9.2.3 FDTD performance vs. multiple processing cores and Graphic
memory requirement Processing Units (GPUs). If there are no high-
performance GPUs in your computer, replace
the old ones with more modern ones or only add extra using the free computer slots. GPU
numerous cores can handle thousands of threads, i.e. the smallest sequence of programmed
instructions that can be managed independently. Threads are a way for a program to divide
itself into two or more simultaneously (or pseudo-simultaneously) running tasks. Typical
architecture of GPU consists of thousands of small cores, designed for handling multiple tasks
in parallel. That is exactly we need to realize the fast parallel processing. Therefore, EM solver
that take advantage of GPU should be your first choice. Currently, all the leading 3D EM
modeling and simulation packages such as CST STUDIO SUITE®, CST MICROWAVE
TM
STUDIO®, IMST - Empire XCcel, Altair - FECO, ANSYS – HFSS, Acceleware - AxFDTD ,
18 Public Domain Image, source:
www.lumerical.com/support/whitepaper/parallel_processing_fdtd_solutions.html