Re: SSE Level 3 drop in gemm


>Now I have a different issue.  My kernel likes nb=56 the best.  Atlas
>standard likes nb=64.  And this is what I get in sMMRES:
>intech20:/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/tune/blas/gemm/Linux_fpic$ cat res/sMMRES
>     0    2  64   5   1  64      0      5      1   371.46
>ATL_sgemm_SSE.c "CM"
>     1    1  64   2   2  64      0      4      1   617.61

The best solution is probably to add another case to scases.dsc.  I guess
you have a line that looks something like this right now:
0 1 1 1 1 1 2 2 64 ATL_sgemm_SSE.c "CM"
Add a second line with the explicit 56 blocking factor:
0 -56 -56 -56 1 1 2 2 64 ATL_sgemm_SSE.c "CM"

That should force ATLAS to time your kernel using your best blocking factor,
which should then be selected . . .  In the long run, I think it makes sense
to have ATLAS do this: if a user-contributed kernel is found to beat the
generated kernel, run through the list of possible NBs again to find the best.
I'll look at adding that while I'm doing the cleanup stuff . . .
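
As a rough illustration of the NB re-search idea (this is not ATLAS code; the
names best_nb, nb_timer_fn and demo_timer are hypothetical, and the demo
scores are synthetic, not measurements), the loop could look something like
this in C:

```c
#include <stddef.h>

/* Hypothetical callback: returns a performance score (e.g. MFLOPS)
 * for the kernel when timed with blocking factor nb. */
typedef double (*nb_timer_fn)(int nb);

/* Try every candidate NB and return the one with the highest score.
 * Returns -1 if the candidate list is empty; ties keep the earlier entry. */
int best_nb(const int *cands, size_t n, nb_timer_fn time_kernel)
{
    if (n == 0)
        return -1;

    int best = cands[0];
    double best_score = time_kernel(cands[0]);

    for (size_t i = 1; i < n; i++) {
        double score = time_kernel(cands[i]);
        if (score > best_score) {
            best_score = score;
            best = cands[i];
        }
    }
    return best;
}

/* Synthetic stand-in for a real timer: the score peaks at nb == 56.
 * These values are made up purely so the sketch can be exercised. */
double demo_timer(int nb)
{
    int d = nb - 56;
    return -(double)(d * d);
}
```

Calling best_nb on the candidates 40, 48, 56, 64, 72 with demo_timer picks 56,
because the synthetic score was built to peak there.  A real search would
replace demo_timer with an actual timing run for each NB.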

>Also, any suggestions on the unrolling issue?  If I put in macros to
>unroll k at differing levels depending on KB, would that confuse the
>search engine?  Should install faster than having 3 different k unroll
>kernels to time.

I don't think it would confuse the script, but at present you'd have to force
the different NBs via a technique like the one above.  Should work like
a charm if I add the NB search, though . . .
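
For what it's worth, a minimal sketch of the macro approach, assuming KB is
supplied as a compile-time macro (e.g. -DKB=56) and using a plain C dot
product as a stand-in for the real SSE inner loop.  KU and dot_kb are
hypothetical names, and the default KB of 56 is only there so the sketch
compiles on its own:

```c
/* Assumption: KB arrives from the compiler command line (-DKB=...).
 * The default below only makes the sketch self-contained. */
#ifndef KB
#define KB 56
#endif

/* Choose the largest unroll factor in {8, 4, 2, 1} that divides KB,
 * so the unrolled loop never needs a cleanup tail. */
#if (KB % 8) == 0
#define KU 8
#elif (KB % 4) == 0
#define KU 4
#elif (KB % 2) == 0
#define KU 2
#else
#define KU 1
#endif

/* Dot product over KB elements, with the k loop unrolled KU times. */
float dot_kb(const float *a, const float *b)
{
    float sum = 0.0f;
    for (int k = 0; k < KB; k += KU) {
        sum += a[k] * b[k];
#if KU >= 2
        sum += a[k + 1] * b[k + 1];
#endif
#if KU >= 4
        sum += a[k + 2] * b[k + 2];
        sum += a[k + 3] * b[k + 3];
#endif
#if KU >= 8
        sum += a[k + 4] * b[k + 4];
        sum += a[k + 5] * b[k + 5];
        sum += a[k + 6] * b[k + 6];
        sum += a[k + 7] * b[k + 7];
#endif
    }
    return sum;
}
```

One source file then covers every KB, and the preprocessor settles the unroll
level per build, so there is only one kernel to time instead of three.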