Re: ATLAS developer release 3.1.1
Greetings, Clint! Thanks as always for your work on this.
About the dgemv, I can get 20% or so over the best atlas routine
(_mm.c) using prefetch. But when I redefine prefetch to a nop in my
own code, the improvement attributable to prefetch is more like 100%,
which makes me think that the best strategy would be to take atlas' mm
code and just add a few prefetch commands where necessary.
Unfortunately, I haven't yet spent the time to find out where the real
core routines are, there being so many includes and all. Can you give
me a pointer here?
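To make the with/without comparison concrete, here is a rough sketch of the kind of switch I mean. It is not my actual kernel: the routine name dgemv_rowmajor, the NO_PREFETCH flag, the PF_DIST distance and the 64-byte cache line are all made up for illustration, and it uses the GCC/Clang __builtin_prefetch builtin as a stand-in for a hand-written prefetch instruction. Building once normally and once with -DNO_PREFETCH gives the two timings to compare.

```c
#include <stddef.h>

/* Compile with -DNO_PREFETCH to turn every prefetch into a no-op.
   On compilers without __builtin_prefetch the macro is a no-op too. */
#if defined(NO_PREFETCH) || !(defined(__GNUC__) || defined(__clang__))
#define PREFETCH(addr) ((void)0)
#else
#define PREFETCH(addr) __builtin_prefetch((addr), 0, 0) /* read, low reuse */
#endif

/* Hypothetical tuning knob: how many doubles ahead to prefetch. */
#define PF_DIST 64

/* y = A*x for a row-major m-by-n matrix A with leading dimension lda.
   Plain reference loop; the only "tuning" is the prefetch hint. */
void dgemv_rowmajor(size_t m, size_t n, const double *A, size_t lda,
                    const double *x, double *y)
{
    /* Number of doubles of A that are actually addressable. */
    const size_t total = m ? (m - 1) * lda + n : 0;

    for (size_t i = 0; i < m; i++) {
        const double *row = A + i * lda;
        double acc = 0.0;
        for (size_t j = 0; j < n; j++) {
            /* One hint per 8 doubles (assumes 64-byte cache lines),
               issued only while the target is still inside A. */
            if ((j & 7) == 0) {
                size_t k = i * lda + j + PF_DIST;
                if (k < total)
                    PREFETCH(A + k);
            }
            acc += row[j] * x[j];
        }
        y[i] = acc;
    }
}
```

The hint never changes the result, only the timing, so both builds can be checked against the same reference output. The best value of PF_DIST is machine dependent and would have to be found by experiment.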
R Clint Whaley <email@example.com> writes:
> I have posted a new developer release to the developer page.
> It is far from complete; I still have yet to get Camm's stuff in,
> or many of my own updates, for instance. However, it *does* have
> the new paper, and Goto's assembler ev5x/ev6 GEMM, as well as
> a "user-supplied GEMM" (i.e., the ability for the user to supply a
> full GEMM, rather than just a GEMM kernel) that simply calls ATLAS
> (as a building block for your own user-supplied GEMM).
> This release is 3 or so months overdue, but things keep intruding. I
> wanted to wait until I got Camm's stuff in, but with all the crap that
> keeps happening, I feared that might mean additional delays, so the
> strategy is going to be to release as soon as possible, every time
> I manage to get any significant progress done . . .
Camm Maguire firstname.lastname@example.org
"The earth is but one country, and mankind its citizens." -- Baha'u'llah