Summary of Jingyi Zhu benchmarks

Quick overview

This document summarizes the benchmarks from Jingyi Zhu's research group in computational fluid dynamics at the University of Utah Mathematics Department.

The benchmark results are available in two series of tables.

Each table contains rows of results in order of decreasing performance. Each row contains the

Important disclaimer

Please remember that there is no single answer to the commonly asked question "What is the fastest machine?"

Within the collection of benchmarks of which these are members, it is often possible to pick a single benchmark which rates a particular machine the fastest, and yet, on other benchmarks, the same machine may perform poorly with respect to competing models.

Particularly on modern RISC architectures, performance can be extremely sensitive to the quality of compiler optimizations; in at least one case, a speedup of a factor of fifty was seen over a range of compiler options on the same system.

The benchmarking of these programs has investigated a substantial number of compilation options and optimization levels, but new releases of compilers, or alternative compilers, might improve the results significantly. We make reasonable efforts to keep our compilers and operating systems up to date with vendor software releases, but it is frequently impossible to rerun the benchmarks after such new releases, particularly on older machine models or on machines obtained on short-term loan for evaluation purposes.

It is imperative with computer benchmarking to examine a range of benchmark programs, where those programs are chosen to represent the kinds of numerical computation that are important to you, before coming to a conclusion about which machine is best for your jobs.

Many other factors besides benchmark performance should affect computer purchasing decisions, including at least these:

Brief benchmark descriptions

All of the benchmark programs described in this document are written in highly portable Fortran 77, and all are real research programs using real data; they are not loop kernels or toy implementations. Program code sizes are given below.

mgzhu

[2048 lines of Fortran code]

We solve the incompressible Navier-Stokes equations for two fluids of different densities, separated by a sharp interface. A second-order approximate projection method is used, with the elliptic part solved by a multigrid method. The evolving front is handled by a level-set formulation, which is solved numerically with second-order upwind schemes.

Profiling with gprof on a Sun UltraSPARC 170 produced the following flat profile:

granularity: each sample hit covers 2 byte(s) for 0.02% of 49.57 seconds

   %  cumulative    self              self    total          
 time   seconds   seconds    calls  ms/call  ms/call name    
 29.9      14.84    14.84     1024    14.49    14.86  gsneu2_ [6]
 18.4      23.95     9.11      640    14.23    14.23  gsneu1_ [9]
  8.0      27.94     3.99        3  1330.02  1330.02  reinit_ [11]
  6.4      31.13     3.19       16   199.38   320.63  conjug_ [10]
  6.1      34.13     3.00      576     5.21     5.30  resid2_ [12]
  5.1      36.66     2.53      360     7.03     7.03  resid1_ [14]
  4.3      38.80     2.14        8   267.50   361.25  edgevl_ [13]
  3.9      40.74     1.94      143    13.57    13.57  atimex_ [15]
  1.5      41.48     0.74        8    92.50  1614.57  multi1_ [8]
  1.4      42.18     0.70        8    87.50  2687.82  projec_ [4]
  1.4      42.85     0.67        6   111.67   111.67  convec_ [20]
  1.2      43.46     0.61        8    76.25  2508.33  multi2_ [5]
  1.2      44.06     0.60       16    37.50    37.50  evalim_ [21]
  1.1      44.62     0.56        8    70.00  1690.82  macpro_ [7]
  1.1      45.15     0.53      512     1.04     1.04  intpl2_ [22]
  1.0      45.65     0.50        8    62.50    62.50  timrhs_ [23]
...

gsneu1 and gsneu2, which together account for 48.3% of the run time, perform red-black Gauss-Seidel iterations; each contains four sets of doubly nested loops that compute elements of a single vector.
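
A red-black Gauss-Seidel sweep of the kind named above can be sketched as follows. This is a minimal Python illustration, not the Fortran code of gsneu1 or gsneu2: the model problem (a five-point discretization of -Laplace(u) = f on a uniform grid), the fixed boundary values, and the names red_black_gauss_seidel, u, f, and h are all assumptions made for the example.

```python
def red_black_gauss_seidel(u, f, h, sweeps=1):
    """Apply red-black Gauss-Seidel sweeps for -Laplace(u) = f (illustrative).

    u and f are lists of lists with the same shape and h is the grid spacing.
    Boundary entries of u hold fixed values (an assumed Dirichlet condition)
    and are never modified. u is updated in place and also returned.
    """
    n, m = len(u), len(u[0])
    for _ in range(sweeps):
        # Color 0 ("red") is the set of points with i + j even, color 1
        # ("black") the rest. Every neighbor of a red point is black, so
        # all points of one color can be updated independently.
        for color in (0, 1):
            for i in range(1, n - 1):
                for j in range(1, m - 1):
                    if (i + j) % 2 == color:
                        u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                          + u[i][j - 1] + u[i][j + 1]
                                          + h * h * f[i][j])
    return u
```

Because each color depends only on the other color, the inner loops carry no dependence between iterations of the same color, which is the usual reason for preferring this ordering on vector and pipelined hardware.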

reinit contains seven sets of doubly nested loops.

conjug solves a linear system by the unpreconditioned conjugate gradient method.
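
For readers unfamiliar with the method, a conjugate gradient iteration without a preconditioner can be sketched as below. This is a generic textbook form in Python, not a transcription of conjug: the function name conjugate_gradient, the matvec callback, the zero starting guess, and the stopping tolerance are assumptions made for the example, and the matrix is assumed to be symmetric positive definite.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (illustrative).

    matvec(p) must return the product A p as a list. No preconditioner
    is applied. The iteration starts from x = 0.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for x = 0
    p = list(r)                      # first search direction
    rr = sum(ri * ri for ri in r)    # squared residual norm
    for _ in range(max_iter):
        if rr ** 0.5 <= tol:
            break
        ap = matvec(p)
        alpha = rr / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = sum(ri * ri for ri in r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x
```

A small usage example, with a hypothetical 2-by-2 system:

```python
A = [[4.0, 1.0], [1.0, 3.0]]
x = conjugate_gradient(lambda p: [sum(a * v for a, v in zip(row, p)) for row in A],
                       [1.0, 2.0])
```

Each iteration needs one matrix-vector product plus a few dot products and vector updates, so the cost of the method is dominated by the routine that applies the matrix.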

rdzhu

[5688 lines of Fortran code]

We solve a convection-reaction-diffusion system to study the effective propagation of chemical fronts under the influence of a flow field. The nonlinear system obtained by discretizing the original equation is solved with the nksol package.

Profiling with gprof on a Sun UltraSPARC 170 produced the following flat profile:

granularity: each sample hit covers 2 byte(s) for 0.01% of 156.78 seconds

   %  cumulative    self              self    total          
 time   seconds   seconds    calls  ms/call  ms/call name    
 18.4      28.92    28.92     1348    21.45    21.45  fnonlm_ [9]
 17.5      56.30    27.38      148   185.00   514.32  lnsrch_ [5]
 10.6      72.94    16.64     2193     7.59     7.59  daxpy_ [11]
  8.0      85.48    12.54     1299     9.65     9.65  dswap_ [13]
  6.2      95.20     9.72     1011     9.61     9.61  dnrm2_ [14]
  5.9     104.45     9.25     1496     6.18     6.18  dcopy_ [15]
  5.8     113.61     9.16     1379     6.64     6.64  ddot_ [16]
  5.6     122.35     8.74      407    21.47    34.42  atv_ [12]
  5.4     130.84     8.49       49   173.27   173.27  convem_ [17]
  3.5     136.29     5.45      148    36.82    36.82  nkstop_ [18]
  3.4     141.56     5.27      407    12.95    12.95  jacobm_ [19]
  2.4     145.32     3.76      148    25.41   367.23  spigmr_ [8]
  2.4     149.03     3.71        1  3710.04 156718.72  MAIN_ [3]
  1.7     151.72     2.69       49    54.90  2917.89  nksol_ [4]
  1.0     153.22     1.50      407     3.69     3.69  dscal_ [21]
...

fnonlm contains nested loops that generate elements of a vector. lnsrch is a driver for the line search algorithm. The next five functions (daxpy, dswap, dnrm2, dcopy, and ddot) are BLAS routines; together they account for about 36.5% of the run time.
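
What the five BLAS routines in the profile compute can be summarized with the Python sketch below. It is an illustration only: the real Fortran routines also take a length n and stride arguments (incx, incy), which are omitted here by assuming whole vectors with unit stride, and the reference dnrm2 scales its operands to avoid overflow, which this version does not.

```python
import math


def daxpy(a, x, y):
    """y := a*x + y, updating y in place."""
    for i in range(len(x)):
        y[i] += a * x[i]


def dswap(x, y):
    """Exchange the contents of x and y in place."""
    for i in range(len(x)):
        x[i], y[i] = y[i], x[i]


def dnrm2(x):
    """Return the Euclidean norm of x (no overflow protection)."""
    return math.sqrt(sum(v * v for v in x))


def dcopy(x, y):
    """Copy x into y in place."""
    for i in range(len(x)):
        y[i] = x[i]


def ddot(x, y):
    """Return the dot product of x and y."""
    return sum(a * b for a, b in zip(x, y))
```

Each of these makes a single pass over its vectors and does very little arithmetic per element loaded, so their speed tends to reflect memory bandwidth and loop overhead rather than peak floating-point rate.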