i32 vs i32 signext for C int parameters to library functions

I'm trying to fix a longstanding bug with library function calls on platforms where int is passed sign-extended in a register (it badly affects SystemZ, and has a chance of affecting PowerPC64, SPARC64, MIPS64 as well).

The problem is that, on these platforms, C int corresponds to an "i32 signext" parameter, but all LLVM passes inserting library calls currently use just "i32". This means that the high bits of the argument will be indeterminate rather than a sign extension of the low bits.

In practice, PowerPC64, SPARC64, and MIPS64 tend to get zero- or sign-extended data anyway (since all operations are performed on whole registers), so the bug is hidden. However, SystemZ has opcodes that actually modify only the low half of a register, so such functions tend to get junk in the high halves (e.g. high bits of a heap address). As a practical example, compiler-rt's profile test suite (which is currently disabled on SystemZ) has multiple failures due to this problem.
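
To make the failure mode concrete, here is a rough model in plain C++ (not LLVM code). It treats a 64-bit register as a uint64_t and compares a write that touches only the low half with a proper sign extension. The helper names (writeLowHalf, signExtend, calleeReads) and the stale register value are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Models a 32-bit operation that replaces only the low half of a 64-bit
// register; the high half keeps whatever was there before (stale data).
constexpr std::uint64_t writeLowHalf(std::uint64_t oldReg, std::int32_t value) {
  return (oldReg & 0xFFFFFFFF00000000ULL) | static_cast<std::uint32_t>(value);
}

// Models what the caller is expected to do when the parameter is
// "i32 signext": fill the high half with copies of the sign bit.
constexpr std::uint64_t signExtend(std::int32_t value) {
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(value));
}

// Models a callee that trusts the ABI and uses the whole register
// as a 64-bit signed value (for example, as an index).
constexpr std::int64_t calleeReads(std::uint64_t reg) {
  return static_cast<std::int64_t>(reg);
}

// Hypothetical stale register contents, e.g. left over from a heap address.
constexpr std::uint64_t kStaleReg = 0x000003FF12345678ULL;

inline void example() {
  // With sign extension the callee sees the intended value.
  assert(calleeReads(signExtend(-1)) == -1);
  // Without it, the stale high bits leak into the value the callee sees.
  assert(calleeReads(writeLowHalf(kStaleReg, -1)) != -1);
}
```

On targets where every operation writes the whole register, the second case rarely shows up, which matches the observation that the bug stays hidden there.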

In addition to int parameters, the same issue also affects unsigned int (which should be i32, i32 zeroext, or i32 signext depending on platform) and return values.

To fix this bug, I need some way to tell whether to use i32 or i32 signext in target-independent LLVM passes - but none seems to exist at this moment. I wrote a patch that adds this information to TTI ( https://reviews.llvm.org/D21739 ), but it doesn't fit with the current semantics of TTI - if the information about *ext is missing, we're going to have correctness problems instead of merely lowered performance.

How should I best resolve this problem?

The right answer here might be to put the information into TargetLibraryInfo. Fundamentally, you're trying to answer something along the lines of "What's the correct signature for a call to memchr?". This is the sort of question which TargetLibraryInfo already answers in other situations (for example, some standard libc functions have unusual names on some platforms). If TargetLibraryInfo can't come up with the correct calling convention for a given function for whatever reason, it can just claim the function is unavailable, which preserves correctness.
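
A minimal sketch of that idea, in standalone C++ rather than against the real TargetLibraryInfo interface: a per-target lookup that reports which extension a C int argument needs, with an explicit "unknown" answer that callers treat as "function unavailable". Every name here (IntExt, intParamExtension, canEmitLibCall) is hypothetical. Of the table entries, only the SystemZ one comes from this thread; the x86_64 entry is an assumption added for contrast.

```cpp
#include <cassert>
#include <string>

// Extension required for a C int argument, or Unknown if this
// (hypothetical) table has no information for the target.
enum class IntExt { Unknown, None, SignExt, ZeroExt };

// Hypothetical per-target lookup; not an LLVM API.
inline IntExt intParamExtension(const std::string &arch) {
  if (arch == "s390x")
    return IntExt::SignExt; // SystemZ: int is passed sign-extended
  if (arch == "x86_64")
    return IntExt::None;    // assumption for illustration: plain i32
  return IntExt::Unknown;
}

// Mirrors "claim the function is unavailable": a pass may only emit the
// library call when the correct signature is known, so missing
// information costs an optimization instead of correctness.
inline bool canEmitLibCall(const std::string &arch) {
  return intParamExtension(arch) != IntExt::Unknown;
}
```

The point of the Unknown value is that a target missing from the table falls back to not emitting the call at all, instead of emitting it with a possibly wrong signature.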

-Eli

From: "Eli Friedman via llvm-dev" <llvm-dev@lists.llvm.org>
To: "Marcin Kościelnicki" <koriakin@0x04.net>, llvm-dev@lists.llvm.org
Sent: Wednesday, September 7, 2016 12:35:50 PM
Subject: Re: [llvm-dev] i32 vs i32 signext for C int parameters to library functions

> To fix this bug, I need some way to tell whether to use i32 or i32
> signext in target-independent LLVM passes - but none seems to exist at
> this moment. I wrote a patch that adds this information to TTI (
> https://reviews.llvm.org/D21739 ), but it doesn't fit with the current
> semantics of TTI - if the information about *ext is missing, we're
> going to have correctness problems instead of merely lowered
> performance.

> The right answer here might be to put the information into
> TargetLibraryInfo. Fundamentally, you're trying to answer something
> along the lines of "What's the correct signature for a call to
> memchr?". This is the sort of question which TargetLibraryInfo already
> answers in other situations (for example, some standard libc functions
> have unusual names on some platforms). If TargetLibraryInfo can't come
> up with the correct calling convention for a given function for
> whatever reason, it can just claim the function is unavailable, which
> preserves correctness.

I agree; TLI seems like the right place for this information.

-Hal