[bug-fix] add_sat cannot handle type larger than i8 correctly.

Hi Peter,

Thank you for your work, my work benefit a lot from libclc. During my attempt to implement sub_sat, I found a bug in function add_sat. It cannot handle arguments larger than char correctly. Here is an example founded in builtins.link.bc:

define ptx_device i64 @_Z7add_satll(i64 %x, i64 %y) nounwind alwaysinline {
entry:
%call = tail call ptx_device i8 bitcast (i64 (i64, i64)* @__clc_add_sat_s64 to i8 (i64, i64)*)(i64 %x, i64 %y) nounwind ← return type of i64 is explicitly converted to i8
%conv = sext i8 %call to i64
ret i64 %conv
}

The bug stems from the declarations in add_sat.cl, where the return type of all add_sat function is declared as “char”. Here is a patch fixing this bug by simply changing the return type of each function declaration to the same type as their parameters. Patch for sub_sat will be sent in another mail. :slight_smile:

Regards,
Lei Mou

add_sat_cl.diff (1.06 KB)

Patch for sub_sat.

PS: I noticed that in the generated Makefile, file generic/lib/integer/add_sat.ll was not compiled (ptx/lib/integer/add_sat.ll was linked into builtin.bc). I’m not quite sure whether the file has some special purpose, or there is a bug in the build system. So I also added file sub_sat.ll in the same folder.

Regards
Lei Mou

sub_sat.diff (11.6 KB)

Hi Lei,

Sorry that this patch slipped through the cracks, but I've applied
it in r161311.

Thanks,

Patch for sub_sat.

Thanks, applied in r161312.

PS: I noticed that in the generated Makefile, file
generic/lib/integer/add_sat.ll was not compiled (ptx/lib/integer/add_sat.ll
was linked into builtin.bc). I'm not quite sure whether the file has some
special purpose, or there is a bug in the build system. So I also added
file sub_sat.ll in the same folder.

lib/integer/add_sat.ll is a hack to allow OpenCL C code to call
functions which use the default calling convention (the PTX backend
required every function to have either ptx_kernel or ptx_device
calling convention; I'm not sure about NVPTX but if it supports the
default then we should get rid of this hack). The generic version of
this file is currently unused because we do not support any target
other than NVPTX, but this may change in the future.

Thanks,