
Athlon results



Guys,

The wheels are still turning to get out the 3.3.8 release.  I'm sending some
pre-release timings, just to give hope to the Athlon users out there.
Julian Ruhe has submitted an assembly-language kernel that improves ATLAS's
double precision Athlon performance by over 25%.  Just to whet your appetite,
I include some timings using his new kernel below.  I'm comparing my
development tree using his kernel (mislabeled as 3.3.8) against an old release
I had sitting around on the machine, 3.3.2.  3.3.2 will have the same DGEMM
performance as the present release, 3.3.7.  My development tree adds no
performance wins over 3.3.7, so the whole difference you see is Julian's kernel.

The kernels are written in NASM assembly, and will be available in source form
for the curious in the next release.  This is why we can't just give you the
kernel to add to your 3.3.7 stuff: I had to add kernel support for non-C
contributions (our other assembly routines used the GNU assembler, and
thus could be handled by gcc).

The numbers are for a 1.2GHz Athlon (pre-Athlon4) with DDR memory.  The kernel
performs similarly for older systems (you get about the same % of peak on
my 600MHz Athlon classic, roughly 920Mflop) . . .
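
As a rough check of the "same % of peak" remark, here is a minimal Python
sketch.  It assumes a peak of 2 flops per cycle on both machines (that is what
the 2.4Gflop peak quoted for the 1.2GHz part implies; applying it to the 600MHz
Athlon classic is an assumption), and it uses the N=3000 dMM figure from the
timings at the end of this message.  The function name is purely illustrative.

```python
def percent_of_peak(mflops, clock_mhz, flops_per_cycle=2):
    """Return achieved Mflop as a percentage of theoretical peak.

    flops_per_cycle=2 is an assumption inferred from the quoted
    2.4Gflop peak at 1.2GHz.
    """
    peak_mflops = clock_mhz * flops_per_cycle
    return 100.0 * mflops / peak_mflops


# 1.2GHz Athlon, 3.3.8 dMM at N=3000 (from the timings in this message)
print(round(percent_of_peak(1848.7, 1200), 1))  # 77.0

# 600MHz Athlon classic, "roughly 920Mflop"
print(round(percent_of_peak(920.0, 600), 1))    # 76.7
```

Under that assumption both machines land at about 77% of peak.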

And before someone asks, yes, this is getting the right answer as well :->

Cheers,
Clint

3.3.2: Old ATLAS release on same machine
3.3.8: My development tree + Julian's Athlon kernel

1.2GHz Athlon (2.4Gflop peak):

             100    200    300    400    500    600    700    800    900   1000
          ====== ====== ====== ====== ====== ====== ====== ====== ====== ======

3.3.2 dMM 1136.4 1271.8 1388.6 1280.0 1315.8 1393.5 1372.0 1383.8 1429.4 1418.4
3.3.8 dMM 1315.8 1377.8 1567.7 1600.0 1666.7 1728.0 1759.0 1735.6 1778.0 1785.7

3.3.2 dLU  676.0  841.3  914.1  982.7  970.8 1027.3 1054.3 1065.7 1116.3 1110.3
3.3.8 dLU  698.6  917.8 1052.5 1064.7 1147.7 1232.7 1202.2 1263.0 1312.4 1359.5

            1200   1400   1600   1800   2000   2200   2400   2600   2800   3000
          ====== ====== ====== ====== ====== ====== ====== ====== ====== ======
3.3.8 dMM 1818.9 1799.3 1824.5 1842.7 1820.3 1823.3 1855.6 1832.7 1840.8 1848.7

3.3.8 dLU 1387.1 1406.4 1451.8 1472.1 1532.0 1536.0 1455.5 1446.2 1437.2 1473.8
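
For anyone who wants to turn the timings above into percentage gains, here is
a minimal Python sketch.  The numbers are copied from the N=100 to N=1000
columns above; the variable and function names are illustrative only, and the
3.3.2 release was not timed beyond N=1000, so larger sizes are left out.

```python
sizes = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]

# Mflop rates copied from the timings above
dmm_332 = [1136.4, 1271.8, 1388.6, 1280.0, 1315.8,
           1393.5, 1372.0, 1383.8, 1429.4, 1418.4]
dmm_338 = [1315.8, 1377.8, 1567.7, 1600.0, 1666.7,
           1728.0, 1759.0, 1735.6, 1778.0, 1785.7]
dlu_332 = [676.0, 841.3, 914.1, 982.7, 970.8,
           1027.3, 1054.3, 1065.7, 1116.3, 1110.3]
dlu_338 = [698.6, 917.8, 1052.5, 1064.7, 1147.7,
           1232.7, 1202.2, 1263.0, 1312.4, 1359.5]


def percent_gain(old, new):
    """Percentage improvement of each new rate over the matching old rate."""
    return [100.0 * (n / o - 1.0) for o, n in zip(old, new)]


dmm_gain = percent_gain(dmm_332, dmm_338)
dlu_gain = percent_gain(dlu_332, dlu_338)

# Print one row per problem size: N, dMM gain, dLU gain
for n, mm, lu in zip(sizes, dmm_gain, dlu_gain):
    print(f"{n:5d}  dMM {mm:5.1f}%  dLU {lu:5.1f}%")
```

At N=1000 this gives a gain of about 25.9% for dMM and about 22.4% for dLU.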