Lately I have stumbled upon a few different problems where LLVM generates libcalls with the wrong function prototype.
I'm asking for some guidance on how to tackle them.
One problem is related to certain builtins in compiler-rt that use "int" instead of machine mode types such as si_int. An example is
double __powidf2(double, int);
If I understand https://reviews.llvm.org/D81285 correctly, this was changed in compiler-rt to match the prototypes in libgcc, which seem to use non-machine-mode types in a few places. However, for my target, which has a 16-bit int, we run into problems since LegalizeDAG doesn't take the size of int into consideration when, for example, legalizing/lowering a call to
by replacing it with a call to
That unfortunately passes the linker without problems, but we end up with runtime errors since our rt lib expects the second argument to be i16.
I guess one solution here is to find all the builtins in compiler-rt that aren't using machine mode types and put them in a list of functions that depend on the size of int/long etc. Then we make those unavailable for targets where int != si_int etc. (and then one has to implement a replacement if those functions are needed).
Another solution could be to break compatibility with libgcc and use the machine mode types everywhere (or is there a requirement that builtins are compatible between libclang_rt and libgcc?).
Yet another solution is to make sure that LegalizeDAG knows about the "size of int", and that it is provided with the list of builtins that have "int" as an argument or return type, so that the correct prototypes are used when emitting the libcalls.
The other problem I've seen is a bit similar. Here it is libc that sometimes uses "int" in function prototypes.
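To make that last idea a bit more concrete, here is a minimal sketch of such a list and a lookup that returns the width to use for an "int" argument. It is not LLVM code: IntSizedLibcall, IntSizedLibcalls and intArgBits are hypothetical names, and the list is only illustrative (the __powi* family, which takes the exponent as "int").

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical description of a runtime builtin whose prototype uses C "int".
struct IntSizedLibcall {
  std::string_view Name;
  unsigned IntArgIndex; // zero-based index of the argument declared as "int"
};

// Illustrative and incomplete: only the __powi* family is listed here.
constexpr IntSizedLibcall IntSizedLibcalls[] = {
    {"__powisf2", 1},
    {"__powidf2", 1},
    {"__powixf2", 1},
};

// Returns the bit width to use for argument ArgIndex of libcall Name when
// that argument is declared as "int"; returns nullopt for anything else.
std::optional<unsigned> intArgBits(std::string_view Name, unsigned ArgIndex,
                                   unsigned SizeOfIntInBits) {
  assert(SizeOfIntInBits >= 16 && "C requires int to be at least 16 bits");
  for (const IntSizedLibcall &LC : IntSizedLibcalls)
    if (LC.Name == Name && LC.IntArgIndex == ArgIndex)
      return SizeOfIntInBits;
  return std::nullopt;
}
```

With something like this, the legalizer could ask intArgBits("__powidf2", 1, 16) and get 16 back on a target like mine, instead of assuming 32.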
An example here is
float ldexpf( float arg, int exp );
SimplifyLibCalls may rewrite powf(2.0, itofp(x)) to ldexpf(1.0, x), but SimplifyLibCalls (or rather getIntToFPVal) does not take into consideration that the second argument to ldexpf may have a different size depending on the "size of int". In our case it used a 32-bit argument on the caller side, while our libc expected a 16-bit value.
I think transforms that produce calls to libc are a slightly bigger problem than the builtins. It is not only SimplifyLibCalls that assumes that an int is 32 bits (and that char is 8 bits, etc.). I think that, for example, ExpandMemCmp and MergeICmps assume that the return value from memcmp/bcmp is an i32 (but it is an int, which isn't necessarily an i32).
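For reference, the identity behind that rewrite can be checked at the source level with a small sketch like the one below. The helper names viaPow and viaLdexp are made up for illustration. Note that the exponent parameter of ldexp is a C "int", so its width is whatever the target's int is, which is exactly the part the rewrite has to get right.

```cpp
#include <cassert>
#include <cmath>

// The original form: 2.0f raised to an integer converted to float.
float viaPow(int X) { return std::pow(2.0f, static_cast<float>(X)); }

// The rewritten form: scale 1.0f by 2^X. The exponent is passed as "int".
float viaLdexp(int X) { return std::ldexp(1.0f, X); }

// True if both forms agree for X, allowing a tiny relative error in pow.
bool rewriteAgrees(int X) {
  float A = viaPow(X);
  float B = viaLdexp(X);
  return std::fabs(A - B) <= 1e-6f * std::fabs(B);
}
```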
Here I have a downstream hack that adds some knowledge, such as SizeOfInt and SizeOfChar, to TargetLibraryInfo. I can then use that info to make sure the correct sizes are used in certain libcall simplifications. I'm not sure if there is any alternative solution; otherwise I could prepare a patch in Phabricator based on that.
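Roughly, the idea looks like the sketch below. This is not the downstream patch itself and not the real TargetLibraryInfo interface: LibInfoSketch, IntArgFix and fitToInt are hypothetical names, and the policy (give up rather than truncate) is just one conservative assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for the extra knowledge added to TargetLibraryInfo.
struct LibInfoSketch {
  unsigned SizeOfIntInBits;
  unsigned SizeOfCharInBits;
};

enum class IntArgFix { Keep, SignExtend, ZeroExtend, GiveUp };

// Decide how an integer value of ValueBits bits (the source of an int-to-FP
// conversion) can be passed as a C "int" argument on this target.
IntArgFix fitToInt(const LibInfoSketch &Info, unsigned ValueBits,
                   bool IsSigned) {
  assert(ValueBits > 0 && Info.SizeOfIntInBits > 0);
  if (ValueBits > Info.SizeOfIntInBits)
    return IntArgFix::GiveUp; // truncation could change the value
  if (ValueBits == Info.SizeOfIntInBits)
    // A full-width unsigned value might not be representable as signed int.
    return IsSigned ? IntArgFix::Keep : IntArgFix::GiveUp;
  return IsSigned ? IntArgFix::SignExtend : IntArgFix::ZeroExtend;
}
```

On a target with a 16-bit int, a 32-bit source value yields GiveUp, so the simplification is skipped instead of emitting a call with a 32-bit argument.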
One thing that could help here is if Module::getOrInsertFunction didn't bitcast the function pointer when the prototype doesn't match. Maybe there should be different flavors of getOrInsertFunction that either allow the bitcast or not. Without the bitcast, I would expect the compiler to complain that the types don't match. I actually don't know when the bitcast would be useful (it is certainly bad to emit a call using the wrong prototype, but maybe it is ok when doing some kind of profiling/instrumentation?). But the code that adds a bitcast to the returned value from Module::getOrInsertFunction has been there since way back in time.
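As a toy model of what a "strict" flavor could look like (this is not the LLVM API; ModuleSketch, ProtoSketch and getOrInsertStrict are hypothetical, and prototypes are reduced to bit widths for brevity), a mismatch is reported to the caller instead of being hidden behind a cast:

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Toy prototype: only bit widths, no distinction between int and FP types.
struct ProtoSketch {
  unsigned RetBits;
  std::vector<unsigned> ArgBits;
  bool operator==(const ProtoSketch &O) const {
    return RetBits == O.RetBits && ArgBits == O.ArgBits;
  }
};

class ModuleSketch {
  std::map<std::string, ProtoSketch> Decls;

public:
  // Inserts the declaration if it is absent. If a declaration already exists
  // with a different prototype, returns nullopt instead of casting.
  std::optional<ProtoSketch> getOrInsertStrict(const std::string &Name,
                                               const ProtoSketch &Wanted) {
    assert(!Name.empty() && "a libcall needs a name");
    auto [It, Inserted] = Decls.try_emplace(Name, Wanted);
    if (!Inserted && !(It->second == Wanted))
      return std::nullopt;
    return It->second;
  }
};
```

A transform that asks for __powidf2 with a 32-bit second argument in a module that already declares it with a 16-bit one would then get nullopt back and could bail out or report an error.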
The sad thing with all of this is that in both cases these problems were detected by users getting runtime failures. It would have been nice if one could somehow detect that LLVM has produced libcalls with the wrong prototype. Right now I have a feeling that even though I hunt down and fix the problems I've observed, it is a time bomb, because tomorrow someone might add another builtin, or another transform from LLVM IR to a libcall, that doesn't play by the rules.