
Re: Altivec matmul kernel (attachment)


Just back from travel, will take a while to catch up . . .

>Here's a question for the group:
>Altivec fp instructions execute in one of two modes:
>In "Java" mode, denormalized results are handled correctly, and 
>multiply-add instructions have a 5-cycle latency.
>In "non-Java" mode, denormalized results may not be handled correctly, 
>and multiply-add instructions have a 4-cycle latency.  All other 
>computations are IEEE compliant.  My matmul kernel gets about 150-200 
>Mflop speed bump (1650 to 1850, roughly) when going from Java mode to 
>non-Java mode.
>Should I let the user handle Java vs. non-Java mode, or should I turn 
>off Java mode explicitly?  (The submitted version doesn't touch the Java 
>mode bit).

I strongly believe that by default you should be IEEE compliant.  Lack of
compliance is why I don't furnish 3DNow! prebuilts, and why by default I build
the Athlon kernels without it.  People using numerical libraries really
need to be able to count on correct behavior . . .
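
To make concrete what is at stake, here is a small portable C sketch (no
Altivec needed).  It assumes that non-Java mode amounts to flushing
denormalized results to zero, which is my understanding of the hardware
rather than something stated above; the helper names are made up for
illustration.  Halving the smallest normal float gives a nonzero denormal
under IEEE rules, and the flush-to-zero model silently turns it into 0:

```c
#include <float.h>

/* Returns 1 if x is a denormalized (subnormal) float, 0 otherwise.
 * NaN fails both comparisons and is reported as not subnormal. */
int is_subnormal(float x)
{
    float a = (x < 0.0f) ? -x : x;
    return (a > 0.0f && a < FLT_MIN) ? 1 : 0;
}

/* Model of a flush-to-zero unit (assumed behavior of non-Java mode):
 * a subnormal value is replaced by a zero of the same sign. */
float flush_subnormal(float x)
{
    if (is_subnormal(x))
        return (x < 0.0f) ? -0.0f : 0.0f;
    return x;
}

/* FLT_MIN is the smallest normal float, so half of it is subnormal.
 * volatile keeps the multiply from being folded away or held in a
 * wider register. */
float half_of_min_normal(void)
{
    volatile float tiny = FLT_MIN;
    volatile float r = tiny * 0.5f;
    return r;
}
```

Calling flush_subnormal(half_of_min_normal()) gives 0 where IEEE arithmetic
gives a small nonzero value, so a later divide or log on that result behaves
very differently in the two modes.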

So, I would like to see it default to Java mode, with a special flag being
required to get the performance boost associated with the non-IEEE behavior.
I think the default kernel should explicitly turn on IEEE compliance;
we can provide directions or setup help for building one that doesn't have it.
A 10% drop in performance is an acceptable price for getting the correct answer . . .
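
A rough sketch of the kind of thing I mean, not tested on real hardware.
ATL_AV_NONJAVA is a hypothetical build flag and the function names are
invented here; the one hardware detail assumed is that the non-Java (NJ) bit
sits at mask 0x00010000 in the 32-bit VSCR, so please check that against the
Altivec manual before relying on it.  Without the flag the kernel forces
Java mode; with it the kernel opts in to the faster mode:

```c
#include <stdint.h>

#if defined(__ALTIVEC__)
#include <altivec.h>
#endif

/* Assumed position of the non-Java (NJ) bit in the 32-bit VSCR. */
#define VSCR_NJ_MASK 0x00010000u

/* Pure helper: the VSCR value the kernel wants, given the current one.
 * Default build clears NJ (Java mode, IEEE denormal handling);
 * building with -DATL_AV_NONJAVA (hypothetical flag) sets it. */
uint32_t atl_av_vscr_for_build(uint32_t vscr)
{
#ifdef ATL_AV_NONJAVA
    return vscr | VSCR_NJ_MASK;
#else
    return vscr & ~VSCR_NJ_MASK;
#endif
}

/* Applies the same choice to the hardware register before the kernel
 * runs.  Compiles to a no-op when Altivec is not available. */
void atl_av_set_fp_mode(void)
{
#if defined(__ALTIVEC__)
    const uint32_t m = VSCR_NJ_MASK;
    /* Splat the mask so it lands in whichever word holds the VSCR. */
    vector unsigned int nj = { m, m, m, m };
    vector unsigned int vscr = (vector unsigned int)vec_mfvscr();
#ifdef ATL_AV_NONJAVA
    vec_mtvscr(vec_or(vscr, nj));    /* set NJ: non-Java mode */
#else
    vec_mtvscr(vec_andc(vscr, nj));  /* clear NJ: Java mode   */
#endif
#endif
}
```

The kernel would call atl_av_set_fp_mode() once on entry.  A more polite
version would save the caller's VSCR and restore it on exit, which I have
left out to keep the sketch short.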