Re: SSE Level 3 drop in gemm
Camm Maguire <email@example.com> writes:
> 2) I currently have a very small, but frustrating kludge in the
> kernel. For some reason, calling my assembler with the __asm__
> __inline__ (... :::"ax","bx",...); construct does not end up
> pushing the registers that fc.c is using, leading to a segfault
> unless I add an arbitrary "push %ebx\n\t"/"pop %ebx\n\t" pair
> around the kernel.
What's happening here is that this latest gcc is treating ebx as a *fixed
register*! I thought from the docs that only esp was fixed, plus ebp when
a frame pointer is in use. Furthermore, there seems to be a gcc bug:
asm(... :::"bx"); with ebx in the clobber list compiles without any
warning, whereas asm(...::"b" (foo)); with ebx as an input operand fails
to compile with the fixed register error. I've coded around this, but I
have no way of knowing whether the work-around is portable, since for all
I know a different register might be fixed on some other gcc setup.
Any suggestions on how to handle this would be most appreciated.
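To make the problem concrete, here is a minimal sketch of one way to code around a fixed ebx. It is not the actual kernel code. It assumes a GCC-compatible compiler that supports named asm operands, and the names add_via_ebx, BX_REG and add_via_ebx_ok are hypothetical, made up for illustration. Instead of listing "bx" as a clobber, the asm saves ebx by hand in a register that is declared as an operand, uses ebx as scratch, and restores it before returning. This is a register-based variant of the push/pop pair described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper macro: full-width name of the bx register. */
#if defined(__x86_64__)
#  define BX_REG "%%rbx"
#else
#  define BX_REG "%%ebx"
#endif

/*
 * Hypothetical stand-in for the kernel: computes a + b while using ebx
 * as a scratch register. ebx is never named in a constraint or in the
 * clobber list, so the compiler is free to treat it as fixed.
 */
int add_via_ebx(int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int result = a;
    uintptr_t saved;                        /* holds the caller's bx value */

    __asm__ __volatile__(
        "mov " BX_REG ", %[save]\n\t"       /* save bx by hand */
        "movl %[b], %%ebx\n\t"              /* ebx is now scratch */
        "addl %%ebx, %[res]\n\t"            /* result += ebx */
        "mov %[save], " BX_REG "\n\t"       /* restore bx before leaving */
        : [res] "+a" (result),              /* pinned to eax */
          [save] "=&d" (saved)              /* pinned to edx, early clobber */
        : [b] "c" (b)                       /* pinned to ecx */
        : "cc");
    return result;
#else
    /* Assumption: on any other target, fall back to plain C. */
    return a + b;
#endif
}

/* Small usage check: returns 1 if the sketch behaves as expected. */
int add_via_ebx_ok(void)
{
    assert(add_via_ebx(2, 3) == 5);
    return add_via_ebx(-7, 7) == 0;
}
```

The operands are pinned to eax, ecx and edx so that none of them can land in ebx. The open question remains: this only helps if ebx is the register that happens to be fixed on a given gcc setup.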
Otherwise, the kernel is working fine. Performance fluctuates on the
short timer runs, but is somewhere between 670 and 700 MFLOPS for the
beta=0 case, and about 670 for arbitrary beta.
On another front -- Do you have any word on the compilation procedure
for the complex case, Clint? The deal is that all beta cases seem to be
referenced by the same timer (fc.c) program, regardless of the beta= flag.
Camm Maguire email@example.com
"The earth is but one country, and mankind its citizens." -- Baha'u'llah