% /u/sy/beebe/src/benchmarks/x11perf/results/README, Sat Apr 2 12:18:19 1994
% Edit by Nelson H. F. Beebe

                Remarks on X Window System Benchmarks
                            [02-Apr-1994]

Although one can often find numerical benchmarks that scale with clock
rate in a particular CPU family, or which give reasonably consistent
performance ratios between two different CPUs, this is definitely NOT
the case with window system benchmarks, such as the 146 benchmarks and
222 results produced by the x11perf program.

It is NOT possible to reduce the x11perf results to a single number
that can be used to reliably predict relative performance of two
systems.

Architectural and instruction set differences, and on some systems,
hardware assists for graphical operations, result in some operations
being substantially more optimized than others.  Between almost ANY
pair of machines, it is possible to find an x11perf number which makes
one machine beat the other.

What benchmark measurements CAN do is detect cases where performance
is unusually good, or unusually poor.

If any single performance number is useful, it may be the harmonic
mean of a large collection of relative performance values.  The
harmonic mean is the ratio of the number of results to the sum of the
reciprocals of the results; it reduces the influence of large (good)
results on the mean, and increases the influence of small (bad)
results.  Arithmetic and geometric means tend to overemphasize good
results.

Digital Review magazine chose five of the x11perf benchmarks as
representative of typical workstation usage:

        -dot                single dots
        -line10             short vectors
        -rect10             small filled rectangles
        -tr10itext          text with a variable-width font
        -create-50-kids     subwindow mapping

Hewlett-Packard adopted that practice in their benchmark reports, and
I have followed it too in the documents
ftp.math.utah.edu:/pub/benchmarks/x11perf/x11perf-*.  Those reports
show bar charts of the above five tests for most of the major
workstations.
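
To illustrate how the three means discussed above respond to the same
data, here is a minimal Python sketch.  The five sample values are the
Digital Review 5 ratios from the first report later in this file
(unix-socket versus shared-memory); the function names are chosen here
purely for illustration and are not part of x11perf or of the scripts
used to produce the reports.

```python
import math

def arithmetic_mean(xs):
    """Sum of the values divided by their count."""
    return sum(xs) / len(xs)

def geometric_mean(xs):
    """n-th root of the product, computed via logarithms (values must be > 0)."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def harmonic_mean(xs):
    """Count divided by the sum of reciprocals (values must be > 0)."""
    return len(xs) / sum(1.0 / x for x in xs)

# Speedup ratios for -dot, -line10, -rect10, -tr10itext, -create-50-kids,
# copied from the first report in this file.
ratios = [5.352, 4.675, 3.361, 2.245, 1.755]

a = arithmetic_mean(ratios)
g = geometric_mean(ratios)
h = harmonic_mean(ratios)
print(f"arithmetic {a:.3f}  geometric {g:.3f}  harmonic {h:.3f}")
```

For any set of positive values the harmonic mean never exceeds the
geometric mean, which in turn never exceeds the arithmetic mean, so the
harmonic mean is the most conservative of the three.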

Unfortunately, selection of just 5 results discards information from
217 others, and fails to give an overall picture of what is happening.

What may be useful is to collect complete x11perf results for a
variety of machines, then to compare PAIRS of them in reports ordered
by decreasing performance ratios, with means and standard deviations.

We present here a small selection comparing early 1994 entry-level
workstations from DEC, HP, IBM, and Sun.  Since the IBM RS/6000 Power
PC seems to be the fastest of these, it is the one against which the
others are compared.

There are several reasons for the excellent performance of the Power
PC, compared to the dismal performance of IBM RS/6000 Power systems
(even the most expensive ones with attached 3-D graphics
accelerators):

        * a change in the video-buffer bus architecture;

        * the addition of the Pixel Accelerator for X (PAX) chip to
          implement important graphics operations in hardware;

        * the addition of write-under-mask, avoiding slow
          read-modify-write video-buffer updates;

        * direct frame buffer access, allowing video RAM to be
          addressed as normal memory;

        * support for shared-memory X client-server communication
          (DISPLAY=:0), instead of socket communication
          (DISPLAY=name:0); this produces speedups of as much as 6 to
          8, with an average speedup of 1.65 (harmonic mean).

The performance tables below are extracted from typescripts of the
x11perf runs with a command like

        nawk -f two-test.awk typescript.machine-1 typescript.machine-2 >results

To reduce the size of this report, only the top and bottom 10 of the
222 performance ratios are shown below; the nawk output contains the
full story.

The Digital Review 5 can be extracted by

        egrep -e '-dot |-line10 |-rect10 |-tr10itext |-create-50-kids ' results

They are given following each report.

Look for the lines beginning 'Machine 1' and 'Machine 2' to identify
the machine pair in each report.
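
The pairwise comparison described above can be sketched in Python as
follows.  This is NOT the two-test.awk script, which is not reproduced
in this file; it is only an illustration of the idea, and the function
name compare() and the dictionary layout are assumptions made here.
The n-1 divisor in the variance is also an assumption, though it
appears consistent with the variance and root-mean-square values
printed in the reports.  The three sample rows are taken from the
first report.

```python
import math

def compare(machine1, machine2):
    """Return (rows, stats) for benchmarks with positive rates on both machines.

    machine1, machine2: dicts mapping benchmark name -> operations/second.
    rows: (name, rate1, rate2, ratio 2:1) tuples, sorted by decreasing ratio.
    """
    rows = []
    for name, rate1 in machine1.items():
        rate2 = machine2.get(name)
        if rate2 is not None and rate1 > 0 and rate2 > 0:
            rows.append((name, rate1, rate2, rate2 / rate1))
    rows.sort(key=lambda row: row[3], reverse=True)

    ratios = [row[3] for row in rows]
    n = len(ratios)
    mean = sum(ratios) / n
    # Sample variance (n - 1 divisor): an assumption, see the text above.
    variance = sum((r - mean) ** 2 for r in ratios) / (n - 1) if n > 1 else 0.0
    stats = {
        "count": n,
        "sum": sum(ratios),
        "max": max(ratios),
        "min": min(ratios),
        "arithmetic mean": mean,
        "harmonic mean": n / sum(1.0 / r for r in ratios),
        "geometric mean": math.exp(sum(math.log(r) for r in ratios) / n),
        "standard deviation": math.sqrt(variance),
        "variance": variance,
        "root-mean-square": math.sqrt(sum(r * r for r in ratios) / n),
    }
    return rows, stats

# Three rows from the unix-socket (machine 1) versus shared-memory
# (machine 2) report, deliberately given out of order.
socket_rates = {"-rect10": 205000.0, "-dot": 426000.0, "-line10": 323000.0}
shared_rates = {"-rect10": 689000.0, "-dot": 2280000.0, "-line10": 1510000.0}

rows, stats = compare(socket_rates, shared_rates)
for name, rate1, rate2, ratio in rows:
    print(f"{name:<22}{rate1:12.1f}{rate2:12.1f}{ratio:10.3f}")
for label, value in stats.items():
    print(f"{label:<20}= {value:g}")
```

Sorting by decreasing ratio puts the largest speedups at the top and
any slowdowns (ratios below 1) at the bottom, which is the ordering
used in the reports that follow.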

=========================================================================
Comparison of x11perf benchmark results for two machines.

Columns 1 and 2 are measured in operations/second and cannot be
compared between benchmarks.

The ratios in column 3 show how much faster machine 2 is than machine 1.

For a single number representative of the average speedup, use the
harmonic mean, PROVIDED that machine 2 is the faster.

Machine 1 = typescript.ibm-rs6000-250-aix-3-2-X11R5-unix-socket
Machine 2 = typescript.ibm-rs6000-250-aix-3-2-X11R5-shared-memory

Benchmark count     = 222
Sum                 = 454.553
Max                 = 8.63636
Min                 = 0.702778
Arithmetic mean     = 2.04754
Harmonic mean       = 1.65301
Geometric mean      = 1.82085
Standard deviation  = 1.19207
Variance            = 1.42103
Root-mean-square    = 2.36792
=========================================================================
x11perf benchmark        Machine 1   Machine 2 Ratio 2:1
-fcircle1                 198000.0   1710000.0     8.636
-osrect                   208000.0   1330000.0     6.394
-rect1                    263000.0   1670000.0     6.350
-seg10                    197000.0   1240000.0     6.294
-putimage100                 166.0      1020.0     6.145
-putimage500                   7.7        46.9     6.091
-dseg10                   212000.0   1260000.0     5.943
-circle1                  179000.0   1050000.0     5.866
-tilerect1                228000.0   1330000.0     5.833
...
-movetree-75-kids          22300.0     20100.0     0.901
-circulate-16-kids           935.0       840.0     0.898
-resize-25-kids             1720.0      1540.0     0.895
-move-25-kids               1640.0      1460.0     0.890
-resize-16-kids             1950.0      1710.0     0.877
-movetree-50-kids          21600.0     18900.0     0.875
-move-16-kids               1890.0      1650.0     0.873
-movetree-25-kids          18900.0     16500.0     0.873
-movetree-16-kids          16700.0     14100.0     0.844
-ucirculate-200-kids       10800.0      7590.0     0.703
=========================================================================
Digital Review 5:
-dot                      426000.0   2280000.0     5.352
-line10                   323000.0   1510000.0     4.675
-rect10                   205000.0    689000.0     3.361
-tr10itext                269000.0    604000.0     2.245
-create-50-kids             4250.0      7460.0     1.755
------------------------------------------------------------------------

=========================================================================
Comparison of x11perf benchmark results for two machines.

Columns 1 and 2 are measured in operations/second and cannot be
compared between benchmarks.

The ratios in column 3 show how much faster machine 2 is than machine 1.

For a single number representative of the average speedup, use the
harmonic mean, PROVIDED that machine 2 is the faster.

Machine 1 = typescript.dec-alpha-3000-300lx
Machine 2 = typescript.ibm-rs6000-250-aix-3-2-X11R5-shared-memory

Benchmark count     = 222
Sum                 = 487.713
Max                 = 8.18991
Min                 = 0.512838
Arithmetic mean     = 2.1969
Harmonic mean       = 1.61029
Geometric mean      = 1.85462
Standard deviation  = 1.44455
Variance            = 2.08671
Root-mean-square    = 2.62749
=========================================================================
x11perf benchmark        Machine 1   Machine 2 Ratio 2:1
-rect500                     337.0      2760.0     8.190
-srect500                    336.0      2710.0     8.065
-complex100                 2060.0     16200.0     7.864
-ostrap10                   8990.0     59900.0     6.663
-strap100                   2980.0     19800.0     6.644
-rect100                    7170.0     44700.0     6.234
-strap10                    9750.0     60000.0     6.154
-tiletrap10                 9730.0     59400.0     6.105
-srect100                   7400.0     44400.0     6.000
...
-movetree-25-kids          20000.0     16500.0     0.825
-osrect500                   313.0       255.0     0.815
-copyplane500                293.0       231.0     0.788
-circulate-200-kids          520.0       403.0     0.775
-tilerect500                 329.0       255.0     0.775
-movetree-50-kids          24600.0     18900.0     0.768
-movetree-75-kids          26800.0     20100.0     0.750
-movetree-200-kids         29200.0     21100.0     0.723
-movetree-100-kids         28200.0     20000.0     0.709
-ucirculate-200-kids       14800.0      7590.0     0.513
=========================================================================
Digital Review 5:
-line10                   334000.0   1510000.0     4.521
-dot                      643000.0   2280000.0     3.546
-tr10itext                191000.0    604000.0     3.162
-rect10                   221000.0    689000.0     3.118
-create-50-kids             4490.0      7460.0     1.661
------------------------------------------------------------------------

=========================================================================
Comparison of x11perf benchmark results for two machines.

Columns 1 and 2 are measured in operations/second and cannot be
compared between benchmarks.

The ratios in column 3 show how much faster machine 2 is than machine 1.

For a single number representative of the average speedup, use the
harmonic mean, PROVIDED that machine 2 is the faster.

Machine 1 = typescript.hp-9000-712-80i
Machine 2 = typescript.ibm-rs6000-250-aix-3-2-X11R5-shared-memory

Benchmark count     = 222
Sum                 = 286.93
Max                 = 5.86957
Min                 = 0.2
Arithmetic mean     = 1.29248
Harmonic mean       = 0.992425
Geometric mean      = 1.12861
Standard deviation  = 0.824661
Variance            = 0.680066
Root-mean-square    = 1.53216
=========================================================================
x11perf benchmark        Machine 1   Machine 2 Ratio 2:1
-complex100                 2760.0     16200.0     5.870
-osrect                   248000.0   1330000.0     5.363
-srect1                   248000.0   1330000.0     5.363
-tilerect1                248000.0   1330000.0     5.363
-srect10                  158000.0    687000.0     4.348
-strap100                   4680.0     19800.0     4.231
-dline10                  509000.0   1670000.0     3.281
-ellipse500                 2430.0      6630.0     2.728
-dseg10                   466000.0   1260000.0     2.704
...
-fspellipse100             15300.0      9000.0     0.588
-copypixwin100              3920.0      2120.0     0.541
-wcircle10                371000.0    199000.0     0.536
-circle10                 371000.0    198000.0     0.534
-ucirculate-200-kids       21300.0      7590.0     0.356
-fcpellipse100             44000.0     13300.0     0.302
-dellipse100               16400.0      3890.0     0.237
-dcircle100                14800.0      3240.0     0.219
-ddellipse100              13900.0      2920.0     0.210
-ddcircle100               12300.0      2460.0     0.200
=========================================================================
Digital Review 5:
-tr10itext                380000.0    604000.0     1.589
-line10                  1400000.0   1510000.0     1.079
-create-50-kids             7890.0      7460.0     0.946
-dot                     2570000.0   2280000.0     0.887
-rect10                   787000.0    689000.0     0.875
------------------------------------------------------------------------

=========================================================================
Comparison of x11perf benchmark results for two machines.

Columns 1 and 2 are measured in operations/second and cannot be
compared between benchmarks.

The ratios in column 3 show how much faster machine 2 is than machine 1.

For a single number representative of the average speedup, use the
harmonic mean, PROVIDED that machine 2 is the faster.

Machine 1 = typescript.sun-ss-lx-OW3.3
Machine 2 = typescript.ibm-rs6000-250-aix-3-2-X11R5-shared-memory

Benchmark count     = 222
Sum                 = 934.035
Max                 = 35.3066
Min                 = 0.594406
Arithmetic mean     = 4.20736
Harmonic mean       = 2.65195
Geometric mean      = 3.29105
Standard deviation  = 3.90658
Variance            = 15.2613
Root-mean-square    = 5.73537
=========================================================================
x11perf benchmark        Machine 1   Machine 2 Ratio 2:1
-dline10                   47300.0   1670000.0    35.307
-dseg10                    49500.0   1260000.0    25.455
-circle1                   47000.0   1050000.0    22.340
-dline100                  14400.0    270000.0    18.750
-dseg100                   14300.0    265000.0    18.531
-ddline100                 13800.0    183000.0    13.261
-ddseg100                  13700.0    181000.0    13.212
-seg10                    102000.0   1240000.0    12.157
-bigtilerect500               17.8       201.0    11.292
...
-unmap-200-kids           122000.0    124000.0     1.016
-movetree-200-kids         21000.0     21100.0     1.005
-scroll500                   126.0       124.0     0.984
-ostrap100                  5540.0      5440.0     0.982
-tiletrap100                5550.0      5150.0     0.928
-copyplane500                285.0       231.0     0.811
-tilerect100                8720.0      6110.0     0.701
-osrect100                  8780.0      6120.0     0.697
-osrect500                   429.0       255.0     0.594
-tilerect500                 429.0       255.0     0.594
=========================================================================
Digital Review 5:
-line10                   213000.0   1510000.0     7.089
-dot                      329000.0   2280000.0     6.930
-tr10itext                 95900.0    604000.0     6.298
-rect10                   159000.0    689000.0     4.333
-create-50-kids             2510.0      7460.0     2.972
------------------------------------------------------------------------
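
As a cross-check on the summary blocks in the four reports above, the
following Python sketch tests two relations among the printed
statistics: the standard deviation squared should equal the variance,
and the root-mean-square should equal the square root of (mean squared
plus the population variance).  It assumes that the printed variance
uses an n-1 divisor, so it is rescaled by (n-1)/n; that assumption is
inferred from the numbers themselves, not stated anywhere in the
reports.  The dictionary keys are shorthand labels chosen here.

```python
import math

# label: (count, arithmetic mean, standard deviation, variance, root-mean-square)
# All values are copied from the summary blocks of the four reports.
reports = {
    "rs6000 unix-socket": (222, 2.04754, 1.19207, 1.42103, 2.36792),
    "dec-alpha-3000-300lx": (222, 2.1969, 1.44455, 2.08671, 2.62749),
    "hp-9000-712-80i": (222, 1.29248, 0.824661, 0.680066, 1.53216),
    "sun-ss-lx-OW3.3": (222, 4.20736, 3.90658, 15.2613, 5.73537),
}

def rms_from(n, mean, sample_variance):
    """Root-mean-square implied by the mean and an (n - 1)-divisor variance."""
    population_variance = sample_variance * (n - 1) / n
    return math.sqrt(mean * mean + population_variance)

for label, (n, mean, sd, var, rms) in reports.items():
    implied = rms_from(n, mean, var)
    print(f"{label:<22} printed rms {rms:.5f}  implied rms {implied:.5f}  "
          f"sd^2 {sd * sd:.5f}  variance {var:.5f}")
```

Agreement to within rounding of the printed digits suggests that the
summary values in each report are mutually consistent.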