36 Edwin Cortes, Luis Rabelo and Gene Lee
Wall Clock Time (elapsed wall time in seconds) is a measure of the real time that
elapses from start to end, including time that passes due to programmed (artificial) delays
or waiting for resources to become available. In other words, it is the difference between
the time at which a simulation finishes and the time at which it started.
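As a minimal sketch of this definition, the snippet below measures wall clock time as the difference between a finish and a start timestamp. The names `timed_run` and `toy_simulation` are hypothetical and stand in for a real simulation run; the sleep call plays the role of a programmed (artificial) delay, which counts toward wall clock time.

```python
import time


def timed_run(simulation):
    """Run `simulation` once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()          # timestamp at simulation start
    result = simulation()
    elapsed = time.perf_counter() - start  # finish time minus start time
    return result, elapsed


def toy_simulation():
    # Hypothetical stand-in for a simulation: an artificial delay plus some work.
    time.sleep(0.05)  # the delay is included in wall clock time
    return sum(range(1000))


result, wall_seconds = timed_run(toy_simulation)
print(f"wall clock time: {wall_seconds:.3f} s")
```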
Speedup Rel (Speedup Relative) is the ratio
$$\text{Speedup Rel} = \frac{T(\text{Wall Clock Time for 1 Node for that time synchronization scheme})}{T(\text{Wall Clock Time for Nodes used for that time synchronization scheme})}.$$
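The relative speedup can be computed directly from two wall clock measurements taken with the same time synchronization scheme. The following is an illustrative sketch; the function name `relative_speedup` and the timing values are hypothetical and are not results from the experiments.

```python
def relative_speedup(wall_time_one_node, wall_time_n_nodes):
    """Wall clock time on 1 node divided by wall clock time on the nodes used.

    Both timings are assumed to come from the same synchronization scheme.
    """
    if wall_time_one_node <= 0 or wall_time_n_nodes <= 0:
        raise ValueError("wall clock times must be positive")
    return wall_time_one_node / wall_time_n_nodes


# Hypothetical timings in seconds (not measured data).
speedup = relative_speedup(120.0, 48.0)
print(speedup)  # 2.5
```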
Speedup Theoretical is based on the Simulation Object with the longest processing
time. It is the approximate maximum speedup expected from a well-parallelized
scheme (one that takes advantage of the programming features, the computer
configuration of the system, and the partitioning of the problem).
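The text does not give a formula for the theoretical speedup. One common way to bound it, assumed here for illustration only, is to divide the total processing time of all simulation objects by the processing time of the longest one, since no parallel run can finish before its slowest object does. The function name and the sample values below are hypothetical.

```python
def theoretical_speedup(object_processing_times):
    """Approximate upper bound on speedup (assumed formula, not from the source).

    Total processing time over all simulation objects divided by the
    processing time of the slowest simulation object.
    """
    if not object_processing_times:
        raise ValueError("at least one simulation object is required")
    longest = max(object_processing_times)
    if longest <= 0:
        raise ValueError("processing times must be positive")
    return sum(object_processing_times) / longest


# Hypothetical per-object processing times in seconds.
bound = theoretical_speedup([4.0, 2.5, 1.5])
print(bound)  # 2.0
```

Under this assumption, a measured relative speedup close to the bound suggests that the synchronization scheme adds little overhead for that problem.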
PT (processing time) is the total CPU time required to process committed events, in
seconds. The processing time does not include the time required to process events that are
rolled back, nor does it include additional overheads such as event queue management
and messages.
Min Committed PT per Node is the minimum committed processing time per
node of the computing system configuration utilized.
Max Committed PT per Node is the maximum committed processing time per
node of the computing system configuration utilized.
Mean Committed PT per Node is the mean committed processing time per node
of the computing system configuration utilized.
Sigma is the standard deviation of the processing times of the different nodes utilized
in the experiment.
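The per-node statistics above can be sketched with Python's standard `statistics` module. The input values are hypothetical, and the use of the population standard deviation (`pstdev`) for Sigma is an assumption; the text does not say whether the population or the sample form is used.

```python
import statistics


def committed_pt_summary(pt_per_node):
    """Summarize committed processing time (seconds) across nodes."""
    return {
        "min": min(pt_per_node),
        "max": max(pt_per_node),
        "mean": statistics.mean(pt_per_node),
        # Assumption: population standard deviation over the nodes used.
        "sigma": statistics.pstdev(pt_per_node),
    }


# Hypothetical committed processing times for four nodes, in seconds.
summary = committed_pt_summary([10.0, 12.0, 14.0, 16.0])
print(summary)
```

A small Sigma relative to the mean indicates that the committed work is evenly balanced across nodes.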
The benchmark for the different time management and synchronization schemes
(TW, BTB, and BTW) is depicted in Figure 7. TW has the best result of 2.9 (close to the
theoretical speedup of 3.0). BTW and TW are very comparable. BTW does not perform
well with this type of task on distributed systems. However, BTW performs better on
multicore (i.e., tightly coupled) configurations for this specific problem.