floating point exception and SSE2 instructions

Hi,

I'm building a little JIT that creates functions to do array manipulations,
e.g. sum all the elements of a double* array. I'm writing this in Python, generating
LLVM assembly instructions and piping that through a call to ParseAssemblyString,
ExecutionEngine, etc.

It's working OK on integer values, but I'm getting nasty floating point exceptions
when I try this on double* values. I've seen this behaviour before on this platform
(Debian, Intel P4) when I tried using ATLAS with SSE2. I'm pretty sure it's
valid assembly; the code still causes exceptions when I try using the output
from the LLVM demo website. And it works fine on an AMD machine.

What is LLVM doing with my code? Does it generate SSE2 instructions?

thanks!

Simon.

double sum_d(double *mem, int n)
{
  double val = 0.0;
  while (n) {
    val += *mem;
    mem++;
    n--;
  }
  return val;
}

Hi Simon,

The x86 backend does generate scalar SSE2 instructions. For your example, it should emit something like:

         .text
         .align 4
         .globl _sum_d
_sum_d:
         subl $12, %esp
         movl 20(%esp), %eax
         movl 16(%esp), %ecx
         cmpl $0, %eax
         jne LBB_sum_d_2 # cond_true.preheader
LBB_sum_d_1: # entry.bb9_crit_edge
         pxor %xmm0, %xmm0
         jmp LBB_sum_d_5 # bb9
LBB_sum_d_2: # cond_true.preheader
         pxor %xmm0, %xmm0
         xorl %edx, %edx
LBB_sum_d_3: # cond_true
         addsd (%ecx), %xmm0
         addl $8, %ecx
         incl %edx
         cmpl %eax, %edx
         jne LBB_sum_d_3 # cond_true
LBB_sum_d_4: # bb9.loopexit
LBB_sum_d_5: # bb9
         movsd %xmm0, (%esp)
         fldl (%esp)
         addl $12, %esp
         ret

There is nothing here that should cause an exception. Are you using a release or cvs?

Evan

> Hi Simon,
>
> The x86 backend does generate scalar SSE2 instructions. For your
> example, it should emit something like:

Oh, how did you get this?

> [...]
>
> There is nothing here that should cause an exception. Are you using a
> release or cvs?

CVS.

From what I remember, this is a bug in Debian's libc: some floating point
flags are set incorrectly, causing SIGFPE. Can't find the bug report ATM.

Thanks,

Simon.

Oh, it just showed up on numpy-discussion:
http://sources.redhat.com/bugzilla/show_bug.cgi?id=10

"""
#include <fenv.h>
void feclearexcept(int ex)

This function should clear the specified exception status bits in the
FPU status register.
For CPUs with SSE support it should also clear the MXCSR status register
bits.

The problem is that feclearexcept() clears the status control bits also,
causing future floating-point errors to generate interrupts which will
lead to a SIGFPE signal which terminates the program (unless caught by a
SIGFPE handler).
"""

Is there a way I can disable SSE instruction generation in LLVM?

Simon.
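
For anyone who wants to check whether the bug quoted above is what they are hitting, here is a minimal sketch in C. It assumes the SIGFPE comes from the MXCSR exception mask bits (bits 7 to 12) having been cleared, which this thread does not confirm. The helper names sse_exceptions_masked and sse_mask_all_exceptions are made up for illustration; _mm_getcsr and _mm_setcsr are the standard SSE intrinsics from <xmmintrin.h>.

```c
#if defined(__SSE__)
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr */
#endif

/* MXCSR bits 7..12 are the exception mask bits (IM, DM, ZM, OM, UM, PM).
   A set bit means the exception is masked, i.e. it does not trap. */
#define MXCSR_EXCEPTION_MASKS 0x1F80u

/* Hypothetical helper: returns 1 if every SSE floating point exception is
   masked, 0 if at least one is unmasked (and could raise SIGFPE), and -1
   if the code was built without SSE support. */
int sse_exceptions_masked(void)
{
#if defined(__SSE__)
    return (_mm_getcsr() & MXCSR_EXCEPTION_MASKS) == MXCSR_EXCEPTION_MASKS;
#else
    return -1;
#endif
}

/* Hypothetical workaround: set all six mask bits again so that SSE
   floating point exceptions only set status flags instead of trapping.
   Does nothing when built without SSE support. */
void sse_mask_all_exceptions(void)
{
#if defined(__SSE__)
    _mm_setcsr(_mm_getcsr() | MXCSR_EXCEPTION_MASKS);
#endif
}
```

One could call sse_exceptions_masked() right after the suspect feclearexcept() call, and call sse_mask_all_exceptions() just before entering JIT-compiled code. This only touches MXCSR, not the x87 control word, and it is not something anyone in this thread has tested.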

> From what I remember, this is a bug in debian libc:
>
> some floating point flags are set incorrectly causing SIGFPE.
> Can't find the bug report ATM.
>
> Oh, it just showed up on numpy-discussion:
> http://sources.redhat.com/bugzilla/show_bug.cgi?id=10
>
> """
> #include <fenv.h>
> void feclearexcept(int ex)
>
> This function should clear the specified exception status bits in the
> FPU status register.
> For CPUs with SSE support it should also clear the MXCSR status register
> bits.
>
> The problem is that feclearexcept() clears the status control bits also,
> causing future floating-point errors to generate interrupts which will
> lead to a SIGFPE signal which terminates the program (unless caught by a
> SIGFPE handler).
> """

I don't see what this has to do with anything, but...

> Is there a way I can disable SSE instruction generation in LLVM ?

Yes. Pass -mattr=-sse1,-sse2,-sse3 to lli or llc.

If you've linked the JIT into your app, you can specify this by calling cl::ParseCommandLineOptions on a static array, something like:

int argc = 2;  // number of entries in Args, not counting the terminating 0
char *Args[] = { "", "-mattr=-sse1,-sse2,-sse3", 0 };
cl::ParseCommandLineOptions(argc, Args, 0);

-Chris

> I don't see what this has to do with anything, but...

Me neither.

> > Is there a way I can disable SSE instruction generation in LLVM ?
>
> Yes. Pass -mattr=-sse1,-sse2,-sse3 to lli or llc.

Right, that fixed it.

BTW:

from the --help:
  -mattr=<a1,+a2,-a3,...> - Target specific attributes (-mattr=help for details)
  -mcpu=<cpu-name> - Target a specific cpu type (-mcpu=help for details)

but -mattr=help doesn't do anything.

Simon.

Annoyingly you have to specify a .bc file. Try:

% llvm-as /dev/null -o dummy.bc
% llc -mattr=help dummy.bc

-Chris