Re: sgemm questions
I have updated gemm kernels and new cleanup kernels for PIII and Athlon.
The performance Peter has been getting for the Athlon sounds
wonderful! Congratulations! I doubt if these Athlon kernels will
measure up, but I include them anyway. I only get about 2.0*MHz for
the Athlon, which is better than the PIII, mostly because of the pfacc
instruction and the convenience of doing just one 64-bit write for 2
elements from one register when using a 2x2xkb strategy.
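
For anyone not familiar with the 2x2xkb idea, here is a rough plain-C
sketch of it. This is not the attached kernel, only an illustration
under my own assumptions: the names pair2f, pfacc_emul and
kernel_2x2xkb are made up, KB is set to 5 just as an example, A's rows
and B's columns are assumed contiguous along k, and C is assumed
column-major. Each pair2f stands in for one 64-bit register holding two
floats, and pfacc_emul mimics what pfacc does (sum the two halves of
each operand into one register).

```c
#include <stddef.h>

#define KB 5  /* hypothetical compile-time K; need not be a multiple of 4 */

/* Stand-in for one 64-bit 3DNow! register holding two floats. */
typedef struct { float lo, hi; } pair2f;

/* Emulates pfacc dst, src: result.lo = dst.lo + dst.hi,
 *                          result.hi = src.lo + src.hi. */
static pair2f pfacc_emul(pair2f dst, pair2f src)
{
    pair2f r = { dst.lo + dst.hi, src.lo + src.hi };
    return r;
}

/* Updates a 2x2 block of column-major C with the dot products of two
 * rows of A (a0, a1) and two columns of B (b0, b1), each KB long. */
static void kernel_2x2xkb(const float *a0, const float *a1,
                          const float *b0, const float *b1,
                          float *c, size_t ldc)
{
    pair2f acc00 = {0, 0}, acc10 = {0, 0}, acc01 = {0, 0}, acc11 = {0, 0};
    size_t k;

    /* Two k-steps per iteration: lane lo takes k, lane hi takes k+1. */
    for (k = 0; k + 1 < KB; k += 2) {
        acc00.lo += a0[k] * b0[k];  acc00.hi += a0[k + 1] * b0[k + 1];
        acc10.lo += a1[k] * b0[k];  acc10.hi += a1[k + 1] * b0[k + 1];
        acc01.lo += a0[k] * b1[k];  acc01.hi += a0[k + 1] * b1[k + 1];
        acc11.lo += a1[k] * b1[k];  acc11.hi += a1[k + 1] * b1[k + 1];
    }
    /* Leftover k-step when KB is odd. */
    if (k < KB) {
        acc00.lo += a0[k] * b0[k];
        acc10.lo += a1[k] * b0[k];
        acc01.lo += a0[k] * b1[k];
        acc11.lo += a1[k] * b1[k];
    }

    /* One pfacc per column of C: both results land in one register,
     * so each column needs a single 64-bit store. */
    pair2f col0 = pfacc_emul(acc00, acc10);
    pair2f col1 = pfacc_emul(acc01, acc11);
    c[0]       += col0.lo;  c[1]       += col0.hi;
    c[ldc]     += col1.lo;  c[ldc + 1] += col1.hi;
}
```

The point is the last step: the two horizontal sums for one column of C
end up side by side in a single register, which is where the one write
for 2 elements comes from.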
The cleanups can probably be improved. I also could not test the
variable m and n cases, but they should work. KB is still fixed at
compile time, but is not required to be a multiple of 4.
Camm Maguire email@example.com
"The earth is but one country, and mankind its citizens." -- Baha'u'llah