UltraSparc kernel results
Guys,
I have finally gotten the new kernel/cleanup stuff working such that
if you hold its hand supportively enough, it'll complete a build for
you.  I include timings below on an Ultra-2, 200MHz, comparing
Sunperf, the last release of ATLAS (Atl), and ATLAS + the kernel submitted
by Viet Nguyen & Peter Strazdins (Atl+USK).  The good news is we get about
90% of vendor performance for double complex, and we modestly beat the vendor for
double.  We still run only around 80-85% of vendor speed for single
precision (the submitted code doesn't help single).
That's the good news.  The bad news is I got access to an Ultra-5/10,
Sun's PCI-based low-end UltraSparc, and the submitted kernels don't
seem to do very well on those machines; ATLAS's generated code is
as good as the kernel there, and both get *completely* waxed by
Sunperf.  My guess is the motherboard can have such an effect
because the UltraSparc II has an off-chip cache, and the PCI-based
board makes the code behave really differently . . .  Anyway, I'll have to
investigate this further; maybe I just messed up the build . . .
Cheers
Clint
All results below are in Mflops.

    LIB       N        DGEMM  DSYMM  SYR2K  DSYRK  DTRMM  DTRSM
=======    ====        =====  =====  =====  =====  =====  =====
Sunperf     100        271.3  157.7  163.3  163.3  101.4  190.2
Atl         100          
Atl+USK     100        255.5  239.7  158.8  158.8  152.8  210.8
Sunperf     500        277.8  238.1  287.4  266.5  164.5  235.8
Atl         500        245.1  235.8  238.1  235.8  227.3  240.4
Atl+USK     500        297.6  287.4  297.6  245.6  271.7  260.4
Sunperf    1000        288.2  222.2  269.9  253.8  159.2  248.8
Atl        1000        248.4  246.3  245.7  235.8  238.1  245.7
Atl+USK    1000        294.1  284.9  280.9  252.8  266.7  232.6
    LIB       N        ZGEMM  ZSYMM  SYR2K  ZSYRK  ZTRMM  ZTRSM  ZHERK  HER2K
=======    ====        =====  =====  =====  =====  =====  =====  =====  =====
Sunperf     500        300.3  292.4  288.2  219.7  281.2  263.4  273.8  286.5
Atl         500        251.9  245.7  247.5  213.7  233.6  205.8  214.6  233.6
Atl+USK     500        289.0  297.6  284.9  216.9  247.8  224.4  218.8  287.4
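
The "percent of vendor" figures quoted in the message can be checked against the table with a few lines of arithmetic. The sketch below is only an illustration: it copies the N=500 double-complex rows for Sunperf and Atl+USK, and the helper name `percent_of_vendor` is made up for this example, not part of ATLAS or its timers.

```python
# Mflop rates copied from the N=500 double-complex rows of the table above.
routines = ["ZGEMM", "ZSYMM", "SYR2K", "ZSYRK", "ZTRMM", "ZTRSM", "ZHERK", "HER2K"]
sunperf = [300.3, 292.4, 288.2, 219.7, 281.2, 263.4, 273.8, 286.5]
atl_usk = [289.0, 297.6, 284.9, 216.9, 247.8, 224.4, 218.8, 287.4]


def percent_of_vendor(ours, vendor):
    """Return each rate in `ours` as a percentage of the matching vendor rate."""
    return [100.0 * o / v for o, v in zip(ours, vendor)]


ratios = percent_of_vendor(atl_usk, sunperf)
for name, pct in zip(routines, ratios):
    print(f"{name:6s} {pct:6.1f}%")
print(f"mean   {sum(ratios) / len(ratios):6.1f}%")
```

A plain arithmetic mean over the routines is just one way to summarize the row; the message does not say how its "about 90%" figure was arrived at.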
******************************************************************************
gemm timings with:
   0: Sunperf
   1: generated ATLAS (Atl)
   2: ATLAS with UltraSparc kernel (Atl+USK)
DGEMM:
TEST  TA  TB    M    N    K  alpha   beta    Time Mflop
====  ==  ==  ===  ===  ===  =====  =====  ====== =====
   0   N   N  100  100  100    1.0    1.0    0.01 263.2
   1   N   N  100  100  100    1.0    1.0    0.01 198.9
   2   N   N  100  100  100    1.0    1.0    0.01 239.7
   0   N   N  200  200  200    1.0    1.0    0.06 250.7
   1   N   N  200  200  200    1.0    1.0    0.07 231.7
   2   N   N  200  200  200    1.0    1.0    0.06 262.5
   0   N   N  300  300  300    1.0    1.0    0.19 276.9
   1   N   N  300  300  300    1.0    1.0    0.24 226.6
   2   N   N  300  300  300    1.0    1.0    0.20 276.9
   0   N   N  400  400  400    1.0    1.0    0.49 261.2
   1   N   N  400  400  400    1.0    1.0    0.55 230.6
   2   N   N  400  400  400    1.0    1.0    0.44 294.3
   0   N   N  500  500  500    1.0    1.0    0.98 255.1
   1   N   N  500  500  500    1.0    1.0    1.07 233.6
   2   N   N  500  500  500    1.0    1.0    0.86 290.7
   0   N   N  600  600  600    1.0    1.0    1.63 265.0
   1   N   N  600  600  600    1.0    1.0    1.86 232.3
   2   N   N  600  600  600    1.0    1.0    1.51 286.1
   0   N   N  700  700  700    1.0    1.0    2.59 264.9
   1   N   N  700  700  700    1.0    1.0    2.88 238.2
   2   N   N  700  700  700    1.0    1.0    2.55 269.0
   0   N   N  800  800  800    1.0    1.0    3.81 268.8
   1   N   N  800  800  800    1.0    1.0    4.24 241.5
   2   N   N  800  800  800    1.0    1.0    3.56 287.6
   0   N   N  900  900  900    1.0    1.0    5.53 263.7
   1   N   N  900  900  900    1.0    1.0    6.49 224.7
   2   N   N  900  900  900    1.0    1.0    5.06 288.1
   0   N   N 1000 1000 1000    1.0    1.0    7.59 263.5
   1   N   N 1000 1000 1000    1.0    1.0    8.72 229.4
   2   N   N 1000 1000 1000    1.0    1.0    7.24 276.2
ZGEMM:
TEST  TA  TB    M    N    K        alpha         beta    Time  Mflop
====  ==  ==  ===  ===  ===  ===== =====  ===== =====  ======  =====
   0   N   N  100  100  100    1.0   0.0    1.0   0.0    0.03  266.7
   1   N   N  100  100  100    1.0   0.0    1.0   0.0    0.04  227.8
   2   N   N  100  100  100    1.0   0.0    1.0   0.0    0.03  266.7
   0   N   N  200  200  200    1.0   0.0    1.0   0.0    0.22  290.9
   1   N   N  200  200  200    1.0   0.0    1.0   0.0    0.27  240.6
   2   N   N  200  200  200    1.0   0.0    1.0   0.0    0.23  278.3
   0   N   N  300  300  300    1.0   0.0    1.0   0.0    0.70  308.6
   1   N   N  300  300  300    1.0   0.0    1.0   0.0    0.91  237.4
   2   N   N  300  300  300    1.0   0.0    1.0   0.0    0.85  254.1
   0   N   N  400  400  400    1.0   0.0    1.0   0.0    1.69  303.0
   1   N   N  400  400  400    1.0   0.0    1.0   0.0    2.15  238.1
   2   N   N  400  400  400    1.0   0.0    1.0   0.0    1.85  276.8
   0   N   N  500  500  500    1.0   0.0    1.0   0.0    3.18  314.5
   1   N   N  500  500  500    1.0   0.0    1.0   0.0    4.25  235.3
   2   N   N  500  500  500    1.0   0.0    1.0   0.0    3.69  271.0
   0   N   N  600  600  600    1.0   0.0    1.0   0.0    5.49  314.8
   1   N   N  600  600  600    1.0   0.0    1.0   0.0    6.85  252.3
   2   N   N  600  600  600    1.0   0.0    1.0   0.0    6.29  274.7
   0   N   N  700  700  700    1.0   0.0    1.0   0.0    8.69  315.8
   1   N   N  700  700  700    1.0   0.0    1.0   0.0   11.95  229.6
   2   N   N  700  700  700    1.0   0.0    1.0   0.0    9.97  275.2
   0   N   N  800  800  800    1.0   0.0    1.0   0.0   13.11  312.4
   1   N   N  800  800  800    1.0   0.0    1.0   0.0   17.06  240.1
   2   N   N  800  800  800    1.0   0.0    1.0   0.0   14.88  275.3
   0   N   N  900  900  900    1.0   0.0    1.0   0.0   19.89  293.2
   1   N   N  900  900  900    1.0   0.0    1.0   0.0   24.13  241.7
   2   N   N  900  900  900    1.0   0.0    1.0   0.0   20.82  280.1
   0   N   N 1000 1000 1000    1.0   0.0    1.0   0.0   26.64  300.3
   1   N   N 1000 1000 1000    1.0   0.0    1.0   0.0   32.23  248.2
   2   N   N 1000 1000 1000    1.0   0.0    1.0   0.0   28.69  278.8
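
For anyone wanting to relate the Time and Mflop columns, the sketch below recomputes the rate from the problem size and the elapsed time. It assumes the conventional operation counts (2*M*N*K flops for real GEMM, 8*M*N*K for complex GEMM); the timer output does not state this, but the larger rows are consistent with it. The function name `gemm_mflops` is hypothetical, chosen for this example.

```python
def gemm_mflops(m, n, k, seconds, complex_data=False):
    """Estimate the Mflop rate of an (m x k) by (k x n) matrix multiply.

    Assumed flop counts: 2*m*n*k for real data and 8*m*n*k for complex
    data (one complex multiply-add is counted as 8 real flops).
    """
    flops_per_madd = 8 if complex_data else 2
    return flops_per_madd * m * n * k / seconds / 1e6


# Sunperf rows at M = N = K = 1000, times taken from the tables above.
print(round(gemm_mflops(1000, 1000, 1000, 7.59), 1))                      # 263.5
print(round(gemm_mflops(1000, 1000, 1000, 26.64, complex_data=True), 1))  # 300.3
```

The Time column is printed to only two decimals, so the small cases (for example the 100x100 rows at 0.01 seconds) cannot be reproduced this way; the Mflop figures there were presumably computed from the unrounded time.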